Localization/Library: Difference between revisions

From OLPC
== Navigation sidebar ==

''[[/sidebar|sidebar table]]'': {{:Localization/Library/sidebar}}

'''PO format:''' {{:Localization/Library/sidebar po}}

== books & reference ==

=== books ===

'''PO format:''' {{:Localization/Library/books po}}

== Math & Science ==

=== biology ===

''NOTICE: the 'translation' to Portuguese is actually improvised from Google's translation service!''
: '''IT NEEDS A SERIOUS REVIEWER.'''
: The Spanish translation should also be reviewed by somebody knowledgeable in biology (or rather botany and biomes).

''[[/biology|Biology table]]'': {{:Localization/Library/biology}}

'''Biology PO strings''': {{:Localization/Library/biology po}}



==== Biology XML format ====

Experimental: a possible future format for scalability. This XML could be submitted or edited directly by content contributors and automatically parsed into HTML. Every article has its own unique ID, and each child tag within an article has a "lang" attribute so that people can add translations. Keeping the unique ID wiki-editable is potentially dangerous: it is easy to change book IDs, duplicate ID numbers, and so on. That is probably an argument in favor of a CMS (much as I hate to say it); this format would be easily transferable to a CMS.

Right now I'm just parsing the title, author, publisher, date, and note tags, but more can be added if needed. Each tag needs an instance with the lang attribute set to "default" (used when no language is specified, or when a nonexistent translation is requested). Tags with no lang attribute are treated as "default". The last valid version of each tag/lang combination is what is actually parsed. For instance:

<pre>
<title> A title! </title>
<author lang="en">Author1</author>
<author lang="en">Overwrite Author1</author>
</pre>

is the same as

<pre>
<title lang="default"> A title! </title>
<author lang="en">Overwrite Author1</author>
</pre>
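
The "last one wins" rule and the implicit "default" language can be sketched in Python. This is only an illustration under stated assumptions, not the parser described above: the function name <code>collapse_tags</code>, the wrapping <code>article</code> element, and the stripping of surrounding whitespace are all made up for the example.

```python
import xml.etree.ElementTree as ET

def collapse_tags(article_xml):
    """Return {tag: {lang: text}}, keeping the last value seen for each tag/lang pair."""
    article = ET.fromstring(article_xml)
    resolved = {}
    for child in article:
        # A tag with no lang attribute is treated as lang="default".
        lang = child.get("lang", "default")
        # Later children overwrite earlier ones with the same tag and lang.
        resolved.setdefault(child.tag, {})[lang] = (child.text or "").strip()
    return resolved

# The first snippet above, wrapped in an <article> root (an assumption).
sample = """
<article id="1">
  <title> A title! </title>
  <author lang="en">Author1</author>
  <author lang="en">Overwrite Author1</author>
</article>
"""
print(collapse_tags(sample))
```

Both snippets above collapse to the same dictionary, with the title stored under "default" and only the second author kept.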

Actual sample xml follows. [[User:Mchua|Mchua]] 11:03, 29 March 2007 (EDT)

<pre>
<?xml version="1.0" encoding="ISO-8859-1"?>
<!-- NOT VALIDATED! -->
<!DOCTYPE article SYSTEM "biology.dtd">

<article id="1"> <!-- sample -->
<title language="default">Sample title</title>
<author language="default">Made-up author name</author>
<publisher language="default">Fictional publisher</publisher>
<date language="default">2007</date>
<note language="default">No language specified, or no translation exists in this language</note>
<note language="en">A note in English</note>
<note language="es">A note in Espanol</note>
<note language="pt">A note in Portuguese</note>
</article>

<article id="2">
<title language="default">Tundra</title>
<author language="default"> Frans Lanting </author>
<publisher language="default">E.O. Wilson Foundation</publisher>
<date language="default">2007</date>
<note language="default"> Fall colors, Wrangell-St. Elias National Park, Alaska, USA </note>
<note language="en">Fall colors, Wrangell-St. Elias National Park, Alaska, USA</note>
<note language="es">Colores otoñales, Parque Nacional de Wrangell-St. Elias, Alaska, EE.UU.</note>
<note language="pt">Cores outonales, Parque nacional Wrangell-St. Elias, Alaska, EUA</note>
</article>
</pre>
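
As a rough sketch of how such an article might be turned into HTML with a fallback to the "default" entries, something like the following could work. The helper names <code>pick_text</code> and <code>article_to_html</code> and the HTML layout are assumptions for illustration; only the tag names and the "language" attribute come from the sample.

```python
import xml.etree.ElementTree as ET
from html import escape

def pick_text(article, tag, language):
    """Return the text of `tag` in `language`, falling back to "default"."""
    found = {}
    for child in article.findall(tag):
        # Missing language attribute counts as "default"; last entry wins.
        found[child.get("language", "default")] = (child.text or "").strip()
    return found.get(language, found.get("default", ""))

def article_to_html(article, language):
    """Render title, author and note as a small HTML fragment (hypothetical layout)."""
    title = escape(pick_text(article, "title", language))
    author = escape(pick_text(article, "author", language))
    note = escape(pick_text(article, "note", language))
    return "<h3>%s</h3><p>%s</p><p>%s</p>" % (title, author, note)

# Shortened copy of the Tundra article from the sample.
tundra = ET.fromstring("""
<article id="2">
  <title language="default">Tundra</title>
  <author language="default"> Frans Lanting </author>
  <note language="default"> Fall colors, Wrangell-St. Elias National Park, Alaska, USA </note>
  <note language="es">Colores otoñales, Parque Nacional de Wrangell-St. Elias, Alaska, EE.UU.</note>
</article>
""")
print(article_to_html(tundra, "es"))  # Spanish note, default title and author
print(article_to_html(tundra, "fr"))  # no French entries, so everything falls back to default
```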

The advantage of XML is that we can use a DTD to check that the input follows the expected format.

<pre>
<!-- NOT VALIDATED -->

<!--
Doctype for OLPC biology library format - intended to be template for future article formats as well.
mchua 3/29/07 - created

The below notes are largely for content contributors who haven't seen DTDs before:
* a library consists of articles with unique IDs
* articles without assigned ID#s get an ID of 0
* title and author are required
* Tags can be marked with language attributes so you can specify what language you're writing in.
check to make sure the value for language you're putting in is valid, or the parser
won't recognize it!
* if no language attribute is specified for child tags, they get a value of "default"
(and will show up when users don't specify a language they want to view in,
or if the language they request does not have a translation yet.)
-->

<!DOCTYPE BIOLOGY [

<!ELEMENT article (title+, author+, publisher*, date*, note*)>
<!ATTLIST article id CDATA "0"> <!-- I would use ID instead of CDATA but ID can't have a default value -->

<!ELEMENT title (#PCDATA)>
<!ATTLIST title language CDATA "default">

<!ELEMENT author (#PCDATA)>
<!ATTLIST author language CDATA "default">

<!ELEMENT publisher (#PCDATA)>
<!ATTLIST publisher language CDATA "default">

<!ELEMENT date (#PCDATA)>
<!ATTLIST date language CDATA "default">

<!ELEMENT note (#PCDATA)>
<!ATTLIST note language CDATA "default">

]>

<!-- Possible easy way for people to denote licensing for their work?
<!ENTITY license-by-sa "Creative Commons Attribution Share-Alike">
-->
</pre>
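
Because the id attribute is declared as CDATA, the DTD itself will not catch duplicate IDs. A separate check along the following lines might help. It is a sketch under assumptions: the wrapping <code>library</code> element, the function name <code>duplicate_article_ids</code>, and the article titles are invented for the example, and since Python's ElementTree does not apply DTD defaults, the default of "0" is mimicked by hand.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def duplicate_article_ids(library_xml):
    """Return the sorted list of article ids that occur more than once."""
    root = ET.fromstring(library_xml)
    # Articles without an id are counted as "0", mirroring the DTD default.
    counts = Counter(a.get("id", "0") for a in root.iter("article"))
    return sorted(i for i, n in counts.items() if n > 1)

# Made-up library with one repeated id and two articles lacking an id.
library = """
<library>
  <article id="1"><title>Article A</title><author>Author A</author></article>
  <article id="2"><title>Article B</title><author>Author B</author></article>
  <article id="2"><title>Article C</title><author>Author C</author></article>
  <article><title>Article D</title><author>Author D</author></article>
  <article><title>Article E</title><author>Author E</author></article>
</library>
"""
print(duplicate_article_ids(library))
```

Here the check reports both the repeated id "2" and the shared default "0", which is one reason articles left without an ID would need attention before publishing.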

==== Alternative XML markup ====

Grouped by language - possibly easier for content contributors & translators, but (I perceive it to be) less flexible in terms of adding tags, languages, etc. on the parse-to-xhtml end. [[User:Mchua|Mchua]] 11:22, 29 March 2007 (EDT)

<pre>
<article id="1">
<default>
<title> Default title </title>
<author> Default author </author>
</default>
<en>
<title> English title </title>
<author> English author </author>
</en>
<es>
<title> Espanol title </title>
<author> Espanol author </author>
</es>
</article>
</pre>
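
If the grouped-by-language markup were used, it could still be regrouped into a per-tag structure on the parsing side. The following is a minimal sketch, assuming the language blocks are direct children of the article and are properly closed; the function name <code>regroup_by_tag</code> is hypothetical.

```python
import xml.etree.ElementTree as ET

def regroup_by_tag(article_xml):
    """Turn <article><en><title>..</title></en>..</article> into {tag: {language: text}}."""
    article = ET.fromstring(article_xml)
    by_tag = {}
    for language_block in article:      # <default>, <en>, <es>, ...
        for field in language_block:    # <title>, <author>, ...
            text = (field.text or "").strip()
            by_tag.setdefault(field.tag, {})[language_block.tag] = text
    return by_tag

# Shortened version of the markup above, with closed language blocks.
grouped = """
<article id="1">
  <default><title> Default title </title><author> Default author </author></default>
  <en><title> English title </title><author> English author </author></en>
</article>
"""
print(regroup_by_tag(grouped))
```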

== images & maps ==

=== nature ===

'''PO format:''' {{:Localization/Library/nature po}}

== activities ==

=== Multimedia ===

==== [[Localization/Library/camera po|Camera Activity PO]] ====

If you want to edit, go [[Localization/Library/camera po|here]]<font size="-1"><blockquote>{{:Localization/Library/camera po}}</blockquote></font>

=== Games ===

''[[/games|Games table]]'': {{:Localization/Library/games}}

'''PO format:''' {{:Localization/Library/games po}}

== po2dict - Python helper function ==

Takes PO files as above and turns them into a Python dictionary. [[User:Mchua|Mchua]] 13:58, 29 March 2007 (EDT)
<pre>
# po2dict
# Feed it the location of a .po file.
# Returns a dictionary version of that file where
# dictionary[MSGINDEXSTRING][LANGUAGECODE] returns the proper string.
# WARNING: Does not handle multiline strings.
def po2dict(polocation):
    translations = {}
    currentmsgid = ""
    with open(polocation, 'r') as pofile:  # file is closed even if parsing fails
        for line in pofile:
            if line[0:5] == "msgid":  # start a new msgid entry
                currentmsgid = line[5:].lstrip(' "').rstrip(' "\n')
                translations[currentmsgid] = {}
            if line[4:10] == "msgstr":  # translation line with a language prefix
                language = line[1:3]  # 2-letter language id in columns 1-2
                translatedstring = line[10:].lstrip(' "').rstrip(' "\n')
                translations[currentmsgid][language] = translatedstring
    return translations

# usage example
dictionary = po2dict("biology.po")
print(dictionary["tiaga-notes"]["es"])  # should print the Spanish string ("Lagos y muskegs...")
</pre>
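
Since not every string will have a translation in every language, a lookup with a fallback may be useful on top of the dictionary that po2dict returns. This is a sketch only: the function name <code>lookup</code>, the choice of "en" as fallback language, and the sample strings are assumptions, not part of po2dict.

```python
def lookup(translations, msgid, language, fallback="en"):
    """Return translations[msgid][language], else the fallback language, else the msgid."""
    by_language = translations.get(msgid, {})
    if language in by_language:
        return by_language[language]
    # No translation in the requested language: try the fallback,
    # and as a last resort show the msgid itself.
    return by_language.get(fallback, msgid)

# Hand-written dictionary in the same shape po2dict returns (made-up strings).
sample = {"greeting": {"en": "Hello", "es": "Hola"}}
print(lookup(sample, "greeting", "es"))  # translation exists
print(lookup(sample, "greeting", "pt"))  # falls back to English
print(lookup(sample, "missing", "es"))   # unknown msgid, returns the msgid
```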


[[category:localization]]

Revision as of 22:43, 7 September 2008