Localization: Difference between revisions

From OLPC
(Start adding Localization issues content from the countries meeting.)


[[#1|English]] | [[#2|espanol]] | [[#3|Korean]]

== Problem Statement ==

Internationalization is the technology for representing and composing text in the languages spoken, taught, or used in your countries. Localization is the process of adapting software or content for local use.

Localization involves fonts, script layout, input methods, speech synthesis, musical instrumentation, collating order, dictionaries, and spelling checkers, among other issues.

Linux is already more widely localized than Microsoft Windows, because localizing it requires no cooperation from a vendor. Even so, cooperation with the free software and content community is vital to reduce the overall work required.

The size of the problem is huge. [http://www.ethnologue.org/ethno_docs/distribution.asp?by=size Ethnologue] has extensive information on the languages of the world.

==Localization Topics==
This is an outline of some of the topics, tools, and issues of localization.
===Character Sets===
Unicode is fully supported in “modern” applications and toolkits used in free software.
Legacy character set support is also present, but modern applications use Unicode.

Collation order (the order used when text is sorted on Linux) is generally well supported in the C library.
===Script Layout===
OLPC primarily concentrates on using the [http://www.pango.org/ Pango library], which is able to lay out most “hard” languages, including Arabic, the Indic languages, Hebrew, Persian, and Thai. It has a modular, pluggable layout engine and supports vertical text and bi-directional layout. Some issues remain, but overall Pango is in pretty good shape and can already handle most scripts.
===Fonts===
To share content and preserve cultural heritage, OLPC's goal must be, and is, full coverage of all the world's languages.
Linux, using the [http://www.fontconfig.org/wiki/ Fontconfig] system, has a better concept of the language coverage of fonts than other systems. Fontconfig is used to configure the font system and to determine which set of fonts is needed to cover a set of languages.

The font formats supported on Linux include OpenType, TrueType, and many others; see [http://www.freetype.org/ Freetype] for details. Most of the font formats currently supported by Freetype are obsolete, and by far the best on-screen results come from OpenType and TrueType fonts. Type 1 fonts are useful primarily for printing: the Type 1 renderer in Freetype today is not very good, and Type 1 does not support programmatic hinting for low-resolution screens.

====Free Fonts====
Free fonts are available for most scripts in the world, though some are licensed in ways that do not permit completely free redistribution.
OLPC itself has a relatively high-resolution screen; this helps us considerably, particularly in grayscale mode at 200 DPI.
====Need for Screen Fonts====
However, we also need our applications and content to be usable on other screens everywhere, so we need to work together on extending the coverage we have today of high-quality screen fonts. The [http://dejavu.sourceforge.net/wiki/index.php/Main_Page "DejaVu"] font family (derived from Bitstream Vera) covers most Latin alphabets and some other languages. This family generally has good "hinting" for screen use.
[http://www.sil.org/computing/catalog/show_software_catalog.asp?by=cat&name=Font SIL International] also builds fonts for a number of additional languages of local interest.

Help with these or other efforts to build fonts or to increase the coverage of existing fonts is greatly appreciated. Pooling efforts on hinting glyphs (boring but important work), as well as donations and buyouts, are also being investigated.
===Keyboards===
===Speech Synthesis===
===Input Methods===
===Current Shortcomings===
===Music and Sound Samples===
===Dictionaries, Spelling Checkers, Thesaurus===
===Localization Techniques===
===Character Recognition===
===Tools===
===Licensing===
===Next Steps===
====Sugar Localization====
====General Linux Localization====






== Current l10n projects ==
* write?
* blockparty?

==Lo



== Country groups and descriptions ==

Revision as of 15:00, 27 April 2007

This article is a stub. You can help the OLPC project by expanding it.

This is the Intro page for Localization of the OLPC. This needs filling out.


English | espanol | Korean

Problem Statement

Internationalization is the technology for representing and composing text in the languages spoken, taught, or used in your countries. Localization is the process of adapting software or content for local use.

Localization involves fonts, script layout, input methods, speech synthesis, musical instrumentation, collating order, dictionaries, and spelling checkers, among other issues.

Linux is already more widely localized than Microsoft Windows, because localizing it requires no cooperation from a vendor. Even so, cooperation with the free software and content community is vital to reduce the overall work required.

The size of the problem is huge. Ethnologue has extensive information on the languages of the world.

Localization Topics

This is an outline of some of the topics, tools, and issues of localization.

Character Sets

Unicode is fully supported in “modern” applications and toolkits used in free software. Legacy character set support is also present, but modern applications use Unicode.
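
As a rough illustration of moving text from a legacy character set to Unicode, here is a minimal Python sketch. The helper name `legacy_to_utf8` and the sample bytes are assumptions made for this example; they do not come from any OLPC code.

```python
def legacy_to_utf8(raw: bytes, legacy_encoding: str) -> bytes:
    """Decode bytes from a legacy character set and re-encode them as UTF-8."""
    text = raw.decode(legacy_encoding)  # legacy bytes -> Unicode string
    return text.encode("utf-8")         # Unicode string -> UTF-8 bytes


# In ISO 8859-1 (Latin-1), the single byte 0xE9 is the letter "é".
latin1_bytes = b"caf\xe9"
utf8_bytes = legacy_to_utf8(latin1_bytes, "iso-8859-1")
print(utf8_bytes)  # b'caf\xc3\xa9'
```

The caller has to know the legacy encoding in advance; the bytes themselves do not record which character set they were written in.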

Collation order (the order used when text is sorted on Linux) is generally well supported in the C library.
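
The sketch below shows one way to reach the C library's collation support from Python, through the standard `locale` module. The helper name `sort_with_locale` is an assumption for this example. Only the "C" locale is guaranteed to exist; a name such as "es_ES.UTF-8" works only if that locale is installed on the system, and `locale.setlocale` raises `locale.Error` otherwise.

```python
import locale


def sort_with_locale(words, locale_name):
    """Sort words using the collation rules of the named C library locale."""
    previous = locale.setlocale(locale.LC_COLLATE)  # remember current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
        # strxfrm transforms each string so that plain comparison
        # follows the locale's collation order.
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)  # always restore


words = ["apple", "Banana", "cherry"]
# The "C" locale sorts by code point, so uppercase sorts before lowercase.
print(sort_with_locale(words, "C"))  # ['Banana', 'apple', 'cherry']
```

Passing a language-specific locale name instead of "C" would apply that language's ordering rules, provided the locale is available.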

Script Layout

OLPC primarily concentrates on using the Pango library, which is able to lay out most “hard” languages, including Arabic, the Indic languages, Hebrew, Persian, and Thai. It has a modular, pluggable layout engine and supports vertical text and bi-directional layout. Some issues remain, but overall Pango is in pretty good shape and can already handle most scripts.
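
To give a sense of the information a bi-directional layout engine works from, the following sketch prints the Unicode bidirectional class of each character in a mixed string. It uses only Python's standard `unicodedata` module and is not Pango's API; the helper name `bidi_classes` is an assumption for this example.

```python
import unicodedata


def bidi_classes(text):
    """Return (character, Unicode bidirectional class) pairs for text."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


# Latin "a", Hebrew alef (U+05D0), Arabic alef (U+0627).
sample = "a\u05d0\u0627"
for ch, cls in bidi_classes(sample):
    print(f"U+{ord(ch):04X} {cls}")
# U+0061 L   (left-to-right)
# U+05D0 R   (right-to-left)
# U+0627 AL  (right-to-left Arabic letter)
```

A layout engine has to reorder runs of such characters for display, which is part of what makes these scripts "hard".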

Fonts

To share content and preserve cultural heritage, OLPC's goal must be, and is, full coverage of all the world's languages. Linux, using the Fontconfig system, has a better concept of the language coverage of fonts than other systems. Fontconfig is used to configure the font system and to determine which set of fonts is needed to cover a set of languages.
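
One way to ask Fontconfig which installed fonts claim coverage of a language is its `fc-list` command-line tool with a `:lang=` pattern. The sketch below wraps that call in Python. The helper names `fc_list_command` and `fonts_covering` are assumptions for this example, and the result depends entirely on which fonts are installed on the machine.

```python
import shutil
import subprocess


def fc_list_command(lang_code):
    """Build an fc-list command listing font families that cover a language."""
    return ["fc-list", f":lang={lang_code}", "family"]


def fonts_covering(lang_code):
    """Return sorted family names covering lang_code, or None without fc-list."""
    if shutil.which("fc-list") is None:
        return None  # Fontconfig tools are not installed
    result = subprocess.run(
        fc_list_command(lang_code), capture_output=True, text=True, check=False
    )
    # Each non-empty output line names one matching font family.
    return sorted({line.strip() for line in result.stdout.splitlines() if line.strip()})


if __name__ == "__main__":
    print(fonts_covering("ko"))  # families covering Korean, if any
```

An empty list means no installed font declares coverage of that language, which is the kind of gap the font work described here aims to close.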

The font formats supported on Linux include OpenType, TrueType, and many others; see Freetype for details. Most of the font formats currently supported by Freetype are obsolete, and by far the best on-screen results come from OpenType and TrueType fonts. Type 1 fonts are useful primarily for printing: the Type 1 renderer in Freetype today is not very good, and Type 1 does not support programmatic hinting for low-resolution screens.

Free Fonts

Free fonts are available for most scripts in the world, though some are licensed in ways that do not permit completely free redistribution. OLPC itself has a relatively high-resolution screen; this helps us considerably, particularly in grayscale mode at 200 DPI.

Need for Screen Fonts

However, we also need our applications and content to be usable on other screens everywhere, so we need to work together on extending the coverage we have today of high-quality screen fonts. The "DejaVu" font family (derived from Bitstream Vera) covers most Latin alphabets and some other languages. This family generally has good "hinting" for screen use. SIL International also builds fonts for a number of additional languages of local interest.

Help with these or other efforts to build fonts or to increase the coverage of existing fonts is greatly appreciated. Pooling efforts on hinting glyphs (boring but important work), as well as donations and buyouts, are also being investigated.

Keyboards

Speech Synthesis

Input Methods

Current Shortcomings

Music and Sound Samples

Dictionaries, Spelling Checkers, Thesaurus

Localization Techniques

Character Recognition

Tools

Licensing

Next Steps

Sugar Localization

General Linux Localization

Current l10n projects

library exchange

activities (include links to upstream localization where appropriate)

  • camera — en | es | ko | pt
  • web?
  • read?
  • write?
  • blockparty?

==Lo


Country groups and descriptions


Korean-based Nations and Regional Communities using Korean

Korea map.gif

Native speakers of Korean are the people of South Korea (한국인) and North Korea (조선인). Some Chinese and people of other nationalities living in the northern part of Korea also use Korean as a second language, for historical reasons. They are called 고려인 (Korea-in) and 조선족 (Chosun-zok, or Korean Chinese).

Currently OLPC Korea (or XO Korea) covers all of these nations and regions. In the near future, we hope there will be regional XO groups for them.


There's also a matrix to keep track of Translated pages.