Developer Info

Python

Strings

Python 2 has two string types: an 8-bit byte string type (str) and a Unicode string type (unicode), whose characters are stored as 16-bit or 32-bit units depending on how the interpreter was built. Unicode string literals are written with a leading u.

# -*- coding: utf-8 -*-
question1 = u'\u00bfHabla espa\u00f1ol?'  # ¿Habla español?
question2 = u'Wo ist Österreich?'
print question2                                   # Wo ist Österreich?
print question2.encode('iso-8859-1', 'replace')   # same text, if the terminal expects ISO-8859-1
print question2.encode('utf-8', 'replace')        # mojibake such as "Ãsterreich" on an ISO-8859-1 terminal
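
The relationship between the two string types can be sketched as a round trip: encoding turns a Unicode string into bytes, and decoding with the same codec turns the bytes back into the Unicode string. The variable names below are illustrative only, and the snippet is written so that it should run on Python 2.6+ as well as Python 3.3+.

```python
# Illustrative sketch; names such as raw_utf8 are not from the original text.
text = u'\u00d6sterreich'                # Unicode string, 10 characters

raw_utf8 = text.encode('utf-8')          # byte string, O-umlaut takes 2 bytes
raw_latin1 = text.encode('iso-8859-1')   # byte string, O-umlaut takes 1 byte

# Decoding with the matching codec restores the original Unicode string.
assert raw_utf8.decode('utf-8') == text
assert raw_latin1.decode('iso-8859-1') == text

# Decoding with the wrong codec raises no error here, but the result is garbled.
garbled = raw_utf8.decode('iso-8859-1')

print("%d %d %d" % (len(text), len(raw_utf8), len(raw_latin1)))
```

The garbled value is what a terminal shows when it interprets UTF-8 output as ISO-8859-1.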

File Input

import codecs
# Open a UTF-8 file in read mode
infile = codecs.open("infile.txt", "r", "utf-8")
# Read its contents as one large Unicode string.
text = infile.read()
# Close the file.
infile.close()
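
As a variation on the steps above, the file can be read inside a with block so that it is closed even if reading fails. This sketch first writes a small sample file so that it can run on its own; the file name sample_utf8.txt and its contents are made up for illustration.

```python
import codecs
import os
import tempfile

# Hypothetical sample file, created only so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "sample_utf8.txt")
with codecs.open(path, "w", "utf-8") as outfile:
    outfile.write(u"Wo ist \u00d6sterreich?\n")

# Open the UTF-8 file in read mode; the with block closes it automatically.
with codecs.open(path, "r", "utf-8") as infile:
    text = infile.read()   # one Unicode string, already decoded

# Read the same file as raw bytes for comparison.
with open(path, "rb") as rawfile:
    raw = rawfile.read()

print("%d characters, %d bytes" % (len(text), len(raw)))
```

The byte count is one higher than the character count because the O-umlaut occupies two bytes in UTF-8.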

Unicode and Pysqlite

In pysqlite 1.x, you have two ways to trigger the use of a converter:

The magic "-- types" comment
Using the converter name as the column type in your table definition, e.g. create table test(mytext unicode)

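# -*- coding: ISO-8859-1 -*-
import sqlite

data = u"Österreich"

# client_encoding is the encoding used to convert Unicode strings for the database.
con = sqlite.connect(":memory:", client_encoding="utf-8")
cur = con.cursor()
# The magic comment applies the "unicode" converter to the next result column.
cur.execute("-- types unicode")
cur.execute("select %s", (data,))
print cur.fetchone()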
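
The second approach, naming the converter as the column type, can be sketched with the standard-library sqlite3 module. This is an assumption made for illustration: sqlite3 is a different module from pysqlite 1.x, with its own API (register_converter and detect_types), so the code shows the idea rather than the pysqlite 1.x calls themselves.

```python
import sqlite3

# Assumption: the standard-library sqlite3 module stands in for pysqlite 1.x.
# Register a converter under the name "unicode"; it receives the raw bytes
# stored in the database and returns a decoded Unicode string.
sqlite3.register_converter("unicode", lambda raw: raw.decode("utf-8"))

data = u"\u00d6sterreich"

# PARSE_DECLTYPES makes sqlite3 pick a converter by the declared column type.
con = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
cur = con.cursor()
cur.execute("create table test(mytext unicode)")
cur.execute("insert into test(mytext) values (?)", (data,))
cur.execute("select mytext from test")
row = cur.fetchone()
con.close()

print(row[0] == data)
```

Because the column is declared as unicode, every value selected from it passes through the registered converter, with no per-query comment needed.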

