Localization: Difference between revisions

From OLPC
{{OLPC}}
{{Translations}}{{TOCright}}
Internationalization technology is the technology for representing and composing the languages spoken, taught or used in your countries. '''Localization''' is the process of taking software or content and adapting it for local use. It involves fonts, script layout, input methods, speech synthesis, musical instrumentation, collating order, number & date formats, dictionaries, and spelling checkers, among other issues.

Linux is already more widely localized than Microsoft Windows, but the size of the problem is large.
: To help with translation and localization, see [[Translation]]
: See also [http://en.wikipedia.org/wiki/Internationalization_and_localization Wikipedia's definition] and [http://www.ethnologue.org/ethno_docs/distribution.asp?by=size Ethnologue's list of languages]

''(If you need to localize the keyboard symbols for a laptop, refer to the instructions found on the [[Customizing NAND images#Keyboard]] page of the wiki.)''


== Sugar i18n ==

=== Translating Sugar or dev.laptop.org hosted activities ===

The steps below apply to the components that make up the ''core'' of [[Sugar]] and to other [[Activities]] hosted on [http://dev.laptop.org dev.laptop.org]. Other activities may have their own resources (e.g. [[Etoys]] uses [https://translations.launchpad.net/etoys launchpad.net]) or may need to be coordinated with the corresponding developers.

* Go to the [http://translate.fedoraproject.org/releases/olpc OLPC release set] on [http://translate.fedoraproject.org Fedora Translations] to find out which projects need translation and to download the current translation files (or the POT).
*: ''Please first check whether the translation has already been filed by [http://dev.laptop.org/query?status=new&status=assigned&status=reopened&component=localization&order=priority querying Trac].''
* Translate the <tt>msgstr</tt> entries (make sure your editor is using '''UTF-8''' encoding).
* Once your translation is ready:
*# Open a new [http://dev.laptop.org/newticket Trac ticket] (you will need an account: [http://dev.laptop.org/login login] or [http://dev.laptop.org/register register])
*#* <tt>Assign to:</tt> ''blank''
*#* <tt>Priority:</tt> normal
*#* '''<tt>Component:</tt> localization'''
*#* '''<tt>Keywords:</tt> ''activity-name'' ''[[ISO 639|language-code]].po'''''
*#* '''<tt>Type:</tt> enhancement'''
*#* <tt>Milestone:</tt> ''as of 2007-09-11, use "Trial-3"''
*#* <tt>Version:</tt> ''blank''
*#* <tt>Cc:</tt> ''an email?''
*#* <tt>Verified:</tt> ''leave blank''
*#* '''<tt>I have files to attach to this ticket:</tt> check this'''
*# '''Attach''' your translation as a <tt>.po</tt> file. <span style="color:red; ">Make sure your browser is set to use '''UTF-8'''.</span>
* Some time later, the [http://translate.fedoraproject.org/releases/olpc OLPC release set] should be updated.
*: '''NOTE:''' Given the current setup of the workflow, this update is '''not''' immediate and could take a couple of days, depending on the workload of the people involved. Please be patient; we are working on making things smoother.

See [https://dev.laptop.org/query?component=localization&type=enhancement&order=priority all tickets] for component ''localization'' and type ''enhancement'' to make sure your addition is scheduled.

For language codes, refer to [[ISO 639]].
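
Before attaching a file, it can help to check it locally. The following is a minimal Python sketch, not part of the OLPC workflow itself: it confirms that the text round-trips as UTF-8, lists entries whose <tt>msgstr</tt> is still empty, and builds the <tt>Keywords</tt> value described above. The function names and the sample strings are hypothetical, and the parser is deliberately simplified (single-line entries only, no plural forms).

```python
import re

def untranslated_msgids(po_text):
    """Return msgids whose msgstr is empty, skipping the header entry.

    Simplified on purpose: only single-line msgid/msgstr pairs are
    recognized; multi-line strings and plural forms are ignored.
    """
    pattern = re.compile(r'^msgid "(.*)"\nmsgstr "(.*)"$', re.MULTILINE)
    return [msgid for msgid, msgstr in pattern.findall(po_text)
            if msgid and not msgstr]

def ticket_keywords(activity_name, language_code):
    """Build the Keywords field: activity-name language-code.po"""
    return f"{activity_name} {language_code}.po"

# Hypothetical sample content standing in for a downloaded .po file.
SAMPLE_PO = '''msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\\n"

msgid "Journal"
msgstr "Diario"

msgid "Keep"
msgstr ""
'''

# Decoding as UTF-8 raises UnicodeDecodeError if the bytes are not valid UTF-8.
po_bytes = SAMPLE_PO.encode("utf-8")
missing = untranslated_msgids(po_bytes.decode("utf-8"))
print(missing)
print(ticket_keywords("journal", "es"))
```

For real files, a full PO parser (or the <tt>msgfmt</tt> tool from GNU gettext) is a better choice than this regular expression.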

=== Keyboarding in your language ===

What good is seeing the interface in a particular language if your keyboard is in another? See [[Customizing NAND images#Keyboard]] on how to configure the keyboard.

== Basic Localization Topics ==

=== Character Sets ===

[http://www.unicode.org/ Unicode] is fully supported in the “modern” applications and toolkits used in free software. Legacy character set support is also present, but modern applications use Unicode.

Collation order (the text sorting order) is generally well supported in the C library.
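
As a rough illustration of collation through the C library, the Python sketch below sorts a word list with <tt>locale.strxfrm</tt>, which uses the collation rules of the selected locale. The helper name <tt>sort_for_locale</tt> and the locale name <tt>es_ES.UTF-8</tt> are assumptions; whether that locale is installed depends on the system, so the sketch falls back to plain code point order when it is not.

```python
import locale

def sort_for_locale(words, locale_name):
    """Sort words using the C library's collation rules for locale_name.

    Falls back to plain code point order if the locale is not installed.
    """
    previous = locale.setlocale(locale.LC_COLLATE)  # remember current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return sorted(words)
    try:
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)  # restore it

words = ["zebra", "\u00e1rbol", "mango"]  # "\u00e1rbol" is "arbol" with an accented a
print(sorted(words))                          # code point order
print(sort_for_locale(words, "es_ES.UTF-8"))  # locale order, if available
```

In code point order the accented word sorts after "zebra"; a Spanish locale, where installed, is expected to place it with the other words starting in "a".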

: See also: [[:Category:Fonts]], [[Unicode]].

=== Script Layout ===

OLPC uses the [http://www.pango.org/ Pango library], which is able to lay out most of the “hard” languages, including Arabic, the Indic languages, Hebrew, Persian, and Thai. It has a modular, pluggable layout engine and supports both vertical text and bi-directional layout. Some issues remain, but Pango can already handle most scripts; where it cannot, modules can be built to handle new scripts as documented in [http://developer.gnome.org/doc/API/2.0/pango/ Pango's reference manual].
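
To see why bi-directional text needs a layout engine, the following Python sketch inspects the Unicode bidirectional class of each character using the standard <tt>unicodedata</tt> module. It does not use Pango and does not perform any layout; the helper names are hypothetical and the check is only a rough heuristic for "this string contains right-to-left letters".

```python
import unicodedata

# Bidirectional classes assigned to right-to-left letters:
# "R" (e.g. Hebrew) and "AL" (Arabic letters).
RTL_CLASSES = {"R", "AL"}

def bidi_classes(text):
    """Return the Unicode bidirectional class of every character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]

def needs_bidi_layout(text):
    """True if text contains at least one right-to-left letter."""
    return any(cls in RTL_CLASSES for cls in bidi_classes(text))

mixed = "abc \u05e9\u05dc\u05d5\u05dd"  # Latin letters followed by a Hebrew word
print(bidi_classes(mixed))
print(needs_bidi_layout(mixed))
print(needs_bidi_layout("hello"))
```

A string that mixes classes like this is stored in logical order; reordering it for display is the job of the layout library.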

: See also: [[:Category:Languages (international)]]

=== Fonts ===

To share content and preserve cultural heritage, OLPC's goal must be, and is, full coverage of all the world's languages. By using the [http://www.fontconfig.org/wiki/ Fontconfig] system, Linux has a better concept of the language coverage of fonts than other systems. Fontconfig is used to configure the font system and to determine which set of fonts is needed to cover a set of languages.
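
On a system with the Fontconfig command-line tools installed, language coverage can be queried with <tt>fc-list</tt> and a <tt>:lang=</tt> pattern. The Python sketch below wraps that call; the helper names are hypothetical, and it assumes <tt>fc-list</tt> is on the <tt>PATH</tt> (it returns <tt>None</tt> otherwise).

```python
import shutil
import subprocess

def fc_list_command(language_code):
    """Build an fc-list command listing font families that cover a language."""
    return ["fc-list", f":lang={language_code}", "family"]

def fonts_covering(language_code):
    """Return sorted family names covering language_code, or None if
    the fc-list tool is not installed."""
    if shutil.which("fc-list") is None:
        return None
    result = subprocess.run(fc_list_command(language_code),
                            capture_output=True, text=True, check=False)
    return sorted({line.strip() for line in result.stdout.splitlines()
                   if line.strip()})

if __name__ == "__main__":
    # "ar" is the ISO 639 code for Arabic; output depends on installed fonts.
    print(fonts_covering("ar"))
```

An empty list for a language is a hint that no installed font claims to cover it.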

The font formats supported on Linux include [http://en.wikipedia.org/wiki/OpenType OpenType], [http://en.wikipedia.org/wiki/TrueType TrueType] and many others: see [http://www.freetype.org/ Freetype] for details. Most of the font formats supported by Freetype are obsolete, and by far the best results on the screen will be had from OpenType and TrueType fonts, particularly if they are hinted well. [http://en.wikipedia.org/wiki/Type_1_font Type 1 fonts] are useful primarily for printing; the Type 1 renderer currently in Freetype is not very good, and Type 1 does not support programmatic hinting for low resolution screens.

The OLPC XO-1 has a high resolution screen. High resolution helps OLPC considerably, particularly in grayscale mode at 200 DPI. [http://en.wikipedia.org/wiki/Free_software_Unicode_fonts Wikipedia], as usual, is a starting point for free fonts. "Font foundries" are companies that will contract to produce fonts.

: See also: [[:Category:Fonts]], [[Fonts]], [[OLPC Human Interface Guidelines/The Sugar Interface/Text and Fonts|HIG-The Sugar Interface/Text and Fonts]]

==== Free Fonts ====

Free fonts are available for most scripts in the world, though some fonts are [[#Licensing|licensed]] incorrectly for completely free redistribution.

==== Need for Screen Fonts ====

Applications and content should be usable on other screens everywhere, not just on OLPC's high resolution screen. Therefore the OLPC community needs to work together on extending the coverage of high quality screen fonts. The [http://dejavu.sourceforge.net/wiki/index.php/Main_Page "DejaVu"] font family (derived from Bitstream Vera) covers most [http://en.wikipedia.org/wiki/Latin_alphabet Latin alphabets] and some other languages. This family generally has good "hinting" for screen use. The Red Hat "Liberation" family recently became available as a substitute for the Microsoft family of fonts, but does not yet have very wide coverage.

[http://www.sil.org/computing/catalog/show_software_catalog.asp?by=cat&name=Font SIL International] also builds fonts for a number of additional languages of local interest.

Helping with these or other efforts to build fonts or to increase coverage of existing fonts is greatly appreciated. Pooling efforts on hinting glyphs, which is boring but important work, and/or donations and buyouts are also being investigated.

=== Keyboards ===

[[OLPC Keyboard layouts]] documents OLPC's currently available keyboard layouts. Further layouts are a modest amount of work if there are existing designs for those languages. People with local expertise will need to work with OLPC staff to generate new layouts.

: See also: [[:Category:Keyboard]], [[OLPC Human Interface Guidelines/The Sugar Interface/Input Systems#Keyboard|HIG-Input Systems-Keyboard]]

=== Input Methods ===

An input method is software that allows typing of scripts with many more characters than keyboard keys. Examples include Chinese, Japanese, and Korean.

Free software systems now use [http://www.scim-im.org/projects/imengines SCIM - Smart Common Input Method Platform]. SCIM is replacing older input method systems.

Designing the keyboards that are most useful in each country requires knowing which languages are native to an area and which are taught as “foreign” languages. For example, the Nigerian keyboard is designed to allow easy entry of English, Hausa, and Yoruba, which are common languages in much of Nigeria. The "US/International" layout covers most of the western European languages.

Some issues remain in our base technology. For example, Arabic ligatures could present problems; by not putting them on the keyboard, we avoided the need for an input method. However, such workarounds may not be feasible for your language.

: See also: [[Input methods]], [[OLPC Human Interface Guidelines/The Sugar Interface/Input Systems|HIG-Input Systems]]

== [http://en.wikipedia.org/wiki/Accessibility#Telecommunications_and_information_technology_access Accessibility] and [http://en.wikipedia.org/wiki/Usability Usability] ==

=== Speech Synthesis ===

Speech synthesis involves a set of complex trade-offs between synthesizer size, fidelity, and the effort needed to localize a new language.
See [[Speech synthesis]].

: See also [[:Category:Accessibility]]

=== Music and Sound Samples ===

We want much more than dead white male western instruments for dead white male composers!

Clean samples of your musical instruments and music are needed!

Samples need appropriate [[#Licensing|licensing]] terms.

: See also [[TamTam: Sounds]]

=== Dictionaries, Spelling Checkers, Thesaurus ===

There is existing support for most major languages.

Spelling, hyphenation, and thesaurus dictionaries may be needed for different parts of Linux, which may or may not apply to OLPC directly. For example, you can check:

* [http://aspell.net/man-html/Supported.html '''aspell''']
* [http://dictionaries.mozdev.org/installation.html '''mozilla''']
* [http://www.abiword.org/languages.phtml '''abiword''']
* [http://wiki.services.openoffice.org/wiki/Dictionaries Open Office]

Of these, the first three are most immediately interesting to OLPC, as we use versions of these codebases as part of the Sugar environment.

=== Character Recognition ===

Localization of stroke/character recognizers is of some interest with the pen/tablet; in the future (Gen 2), when we have a touch screen, it will become essential. [ftp://ftp.handhelds.org/projects/xstroke/release-0.5/ xstroke] is one such individual character/stroke recognizer, sufficient for alphabets of up to about 100 characters.

== Considerations ==

=== Current Shortcomings ===

There are some real shortcomings where help is needed. These include:
* Non-Gregorian [http://en.wikipedia.org/wiki/List_of_calendar_systems calendars].
* Non-Latin digits (Roozbeh Pournader has patches, but these are not yet integrated and may need help).
* The sheer scale of the localization problem, which will eventually require changes in free software projects.
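
To make the non-Latin digits item concrete, here is a small Python sketch that renders a number with Arabic-Indic digits (U+0660 to U+0669) and parses it back. It is unrelated to the patches mentioned above; the helper name is hypothetical, and real applications would normally rely on locale-aware formatting rather than a fixed translation table.

```python
import unicodedata

# Map ASCII digits 0-9 to Arabic-Indic digits U+0660..U+0669.
ARABIC_INDIC = str.maketrans(
    "0123456789",
    "\u0660\u0661\u0662\u0663\u0664\u0665\u0666\u0667\u0668\u0669",
)

def to_arabic_indic(number):
    """Render an integer using Arabic-Indic digits."""
    return str(number).translate(ARABIC_INDIC)

rendered = to_arabic_indic(2007)
print(rendered)

# Python's int() accepts any Unicode decimal digits, so parsing works directly.
parsed = int(rendered)
print(parsed)

# unicodedata exposes the numeric value of a single digit character.
print(unicodedata.digit("\u0663"))
```

Display is the harder half of the problem: every place a program formats a number has to go through a locale-aware path for the digits to change.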

=== Localization Techniques ===

It only takes a small team to localize Linux for a language: for example, Welsh and Icelandic, which are relatively small languages, have been almost fully localized by small teams.

You can do the work yourself, hire the work out, or find volunteers among universities (worldwide), the Internet, and the free software community. Add to existing projects whenever possible. By checking with some of the major free software projects (e.g. [http://live.gnome.org/TranslationProject Gnome], [http://l10n.openoffice.org/ OpenOffice], [http://www.mozilla.org/projects/l10n/ Mozilla], [http://l10n.kde.org/ KDE]), you can often locate people already at work in your language.

Work directly in the software and content projects whenever possible. This makes your work available worldwide and lessens the ongoing work. If you keep your localization work local, others cannot benefit from your work and effort, and your software and content will be that much harder to localize.

=== Tools ===

Some example tools include [http://pootle.wordforge.org/ pootle], [http://kbabel.kde.org/ kbabel] and rosetta.
Most software uses the GNU “gettext” libraries and standard .po files, including Sugar; Firefox and OpenOffice have their own systems for historical reasons. [http://www.wordforge.org/drupal/ Wordforge] is a good place to get plugged into tools and the community efforts.

The [http://www.unicode.org/cldr CLDR project] is worth watching, though OpenOffice is the first major project using it.
Remember to contribute your translations to the “upstream” projects to minimize long term effort: share your work with the world. Do not presume that you are finished once one Linux distribution has your work; some Linux distributions are not good about working with the community that builds and distributes the original software.

=== Licensing ===

Translated strings will often be useful to many projects, not just the project you are working on translating. Since the MIT/BSD (3 clause) licenses are usable by all projects, these are the safest licenses to use for translations to enable the widest sharing.

The [http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=OFL SIL OFL license] is recommended for fonts. An often overlooked issue with fonts is that they are incorporated into documents themselves (for example, into PDF documents), and therefore licensing needs to be considered carefully.

: See also [[Software licensing]]

== Next Steps ==

Localization is by nature local, but languages often cross borders. Please contact [[User:Jg|Jim Gettys]] to identify issues.

We need to identify people/organizations responsible for language, translation, keyboards, and speech synthesis, as well as effective free software community leaders to help with local deployment and "on the ground" knowledge.

=== Sugar Localization ===

Sugar and Sugar applications use standard .po files, and can be localized using the usual [[#Tools|tools]]. [[Sugar_18n]] goes into the details of the localization process.

=== General Linux Localization ===

By looking at the [http://www.gnome.org/i18n/ gnome], [http://www.mozilla.org/projects/l10n/mlp.html mozilla], [http://contributing.openoffice.org/native-lang.html OpenOffice], [http://l10n.kde.org/ KDE] projects, you can get plugged into translating other Linux software of general interest.

=== Localization of [[Python]] ===

:See [[Python i18n]] for details and a step-by-step example.

== Current l10n projects ==

=== library exchange ===

* [[Localization/Library|Library strings]] -- header and descriptive strings for an [http://dev.laptop.org/pub/content/Library/ OLPC sample library]. Includes some ''PO-like'' strings for the following sections:
** [[Localization/Library/sidebar po|sidebar]] &mdash; en | es | ko | pt | ar
** [[Localization/Library/biology po|biology]] &mdash; en | es | ko &mdash; '''to review:''' pt
** [[Localization/Library/books po|books]] &mdash; en | es | ko &mdash; '''''wanted:''''' pt
** [[Localization/Library/games po|games]] &mdash; en | es | ko | pt
** [[Localization/Library/nature po|nature]] &mdash; en | es | ko | pt
** [[Localization/Library/atlas po|atlas]] &mdash; en | es | ko | pt
* [[Localization/www.laptop.org]] -- The l10n effort for the [http://www.laptop.org new www.laptop.org website]
* We can't translate everything, but we do want to hear what you would like to see translated into your language. If you have a [[Translating#suggested translations|translation to suggest]], please let us know!

=== activities ===

Add / include links to upstream localization where appropriate.
* [[Localization/Library/camera po|camera]] &mdash; en | es | ko | pt | zh-CN
* web?
* read?
* write?
* blockparty?

=== games ===

* [[Kuku]]

== See also ==

* [[Translators]] & [[Translating]] for the [[localization]] of this wiki.
* [[Languages]] for information about them and how they relate to each country and the [[localization]] effort.


[[Category:Countries]]
[[Category:Language support]]
[[Category:Languages (international)]]

Revision as of 09:52, 19 September 2007
