Unicode: Difference between revisions

Revision as of 02:38, 18 December 2008

Developer Info

Python

Strings

Python 2 has two different string types: an 8-bit byte string type (str) and a Unicode string type (unicode). The unicode type stores each code unit in 16 or 32 bits, depending on how the interpreter was built. Unicode string literals are written with a leading u.

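# -*- coding: utf-8 -*-
# The coding line above assumes this file is saved as UTF-8; Python 2 needs it
# because the literal below contains a non-ASCII character.
question1 = u'\u00bfHabla espa\u00f1ol?'         # ¿Habla español?
question2 = u'Wo ist Österreich?'
print question2                                  # Wo ist Österreich?
# Encoded output displays correctly only if the terminal uses the same encoding.
print question2.encode('iso-8859-1', 'replace')  # ISO-8859-1 bytes
print question2.encode('utf-8', 'replace')       # UTF-8 bytes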
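
The encode calls above turn a Unicode string into bytes, and the second argument ('replace') selects what happens to characters the target encoding cannot represent. The following is a minimal sketch of that behaviour, written so that it should run under both Python 2.6+ and Python 3; the variable names are illustrative only.

```python
# -*- coding: utf-8 -*-
# Escapes are used instead of a literal O-umlaut so the file encoding does not matter.
text = u"\u00d6sterreich"              # 10 characters

latin1 = text.encode("iso-8859-1")     # one byte per character here
utf8 = text.encode("utf-8")            # the O-umlaut takes two bytes in UTF-8

# Decoding with the matching codec gives back the original Unicode string.
from_latin1 = latin1.decode("iso-8859-1")
from_utf8 = utf8.decode("utf-8")

# ASCII cannot represent the O-umlaut; with 'replace' it becomes '?'
# instead of raising UnicodeEncodeError.
ascii_bytes = text.encode("ascii", "replace")
```

The same text therefore has different byte lengths in different encodings, which is why bytes must always be decoded with the codec that produced them.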

File Input

import codecs
# Open a UTF-8 file in read mode
infile = codecs.open("infile.txt", "r", "utf-8")
# Read its contents as one large Unicode string.
text = infile.read()
# Close the file.
infile.close()
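
Writing works the same way in the other direction. The sketch below writes a Unicode string through codecs.open and reads it back; the helper name write_then_read and the use of a temporary file are assumptions made for illustration, not part of the original page. It is written to run under both Python 2 and Python 3.

```python
import codecs
import os
import tempfile


def write_then_read(text, encoding="utf-8"):
    """Write a Unicode string to a temporary file and read it back."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        # codecs.open encodes Unicode strings on write...
        outfile = codecs.open(path, "w", encoding)
        try:
            outfile.write(text)
        finally:
            outfile.close()
        # ...and decodes bytes back to Unicode on read.
        infile = codecs.open(path, "r", encoding)
        try:
            return infile.read()
        finally:
            infile.close()
    finally:
        os.remove(path)


original = u"Wo ist \u00d6sterreich?"
roundtrip = write_then_read(original)
```

The try/finally blocks make sure the file is closed and removed even if encoding or decoding fails.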

Unicode and Pysqlite

In pysqlite 1.x, there are two ways to trigger the use of a converter:

  • The magic "-- types" comment, as in the example below.
  • Using the converter name as the column type in the table definition, e.g. create table test(mytext unicode)

#-*- coding: ISO-8859-1 -*-
import sqlite

data = u"Österreich"

con = sqlite.connect(":memory:", client_encoding="utf-8")
cur = con.cursor()
cur.execute("-- types unicode")
cur.execute("select %s", (data,))
print cur.fetchone()
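
The second approach, naming the converter as the column type, can be sketched with the sqlite3 module from the Python standard library. Note that this is not the pysqlite 1.x API used above: sqlite3 is the later interface, and the converter name unicodetext and the function decode_utf8 are made up for this illustration.

```python
import sqlite3


def decode_utf8(raw):
    """Converter: receives the stored value as bytes, returns Unicode."""
    return raw.decode("utf-8")


# Register the converter under a hypothetical type name.
sqlite3.register_converter("unicodetext", decode_utf8)

# PARSE_DECLTYPES makes sqlite3 look up a converter by the declared column type.
con = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
cur = con.cursor()
cur.execute("create table test(mytext unicodetext)")

data = u"\u00d6sterreich"
cur.execute("insert into test(mytext) values (?)", (data,))
cur.execute("select mytext from test")
row = cur.fetchone()
con.close()
```

Here row holds a one-element tuple containing the Unicode string that was inserted.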

Further Reading