User:DanielDrake/Language packs

From OLPC
Revision as of 09:18, 2 March 2011 by DanielDrake (talk | contribs) (New customer wants new locale/translations)

This page attempts to explain why producing good "language packs" for OLPC software releases is difficult, and documents the open questions and implementation barriers which have not yet been solved. Afterwards, I present my opinion that language packs are not the appropriate solution, and that we already have the appropriate tools and processes to meet customer needs through other means.

What is a language pack?

First, what is meant by a language pack?

Originally, OLPC shipped all glibc-known system locales, and all available translations that arrived through system packages and Sugar activities. The language packs that once existed were simply translation updates: they did not (and could not) define any new system locales.

More recently, OLPC has cut back to shipping only a fixed set of locales: those used in active deployments by OLPC's current customers. As a result, system packages only install translations related to those locales (Sugar activities still install all available translations, but a patch to make them install translations only for available locales seems like a good idea). Thus the discussions have quietly shifted from the idea of a language pack being an installer for new/updated translations, to an installer which adds more system locales and installs new/updated translations. As far as I know, no implementation efforts have resulted from the discussions.

General issues

How do we execute language packs?

Another concept that needs clarifying is: how should language packs get executed?

The earlier implementations were shell scripts that had to be run by hand on every laptop where they were installed. You had to go to the terminal, become root, and run a command. Root access is required, which greatly complicates any effort to automate the process.

Another possible application for language packs is for them to execute from within late stages of the build system, rather than making them something that can be executed on an individual laptop. This would avoid some (but not all) of the problems outlined on this page. However, the earlier implementations of language packs could not be run in the build system environment due to their design.

Are there actually any translations worth shipping?

In some cases where the argument for language packs is made, the translations for the language in question are severely lacking and need concentrated effort to be brought up to scratch. In these cases, the deployment team coordinating the translations could work with OLPC's development team and align with the schedule of the next software release, avoiding any need for language packs. In other cases, OLPC has shipped entire minor releases based on locale/language requirements from customers.

Locale issues

Defining new locales in a language pack

We don't have any code for defining new locales in a language pack. The following is an estimate of the implementation that would be needed:

  • The language pack would need to include the locale definition file and character map file from the glibc source
  • The language pack would then run localedef on the target system to install the locale
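
The two steps above could be sketched roughly as follows. This is a minimal illustration, not existing OLPC code: the function names, the pack directory layout (locales/ and charmaps/ subdirectories) and the example locale are assumptions. The only external tool used is glibc's localedef, called with its -i (locale definition input) and -f (character map) options. It has to run as root because it modifies the system locale data.

```python
import os
import subprocess


def build_localedef_command(pack_dir, locale_name, charmap="UTF-8"):
    """Return the localedef invocation that would compile one locale.

    Assumed (hypothetical) pack layout:
        <pack_dir>/locales/<locale_name>   locale definition file
        <pack_dir>/charmaps/<charmap>      character map file
    """
    definition = os.path.join(pack_dir, "locales", locale_name)
    charmap_file = os.path.join(pack_dir, "charmaps", charmap)
    # The compiled locale is named e.g. "ht_HT.UTF-8".
    compiled_name = "%s.%s" % (locale_name, charmap)
    return ["localedef", "-i", definition, "-f", charmap_file, compiled_name]


def install_locale(pack_dir, locale_name, charmap="UTF-8"):
    """Run localedef on the target system (needs root)."""
    subprocess.check_call(build_localedef_command(pack_dir, locale_name, charmap))
```

Calling install_locale("/path/to/pack", "ht_HT") would then compile and install that locale, with the side effects on the root filesystem discussed under "glibc locale archive".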

Do we have to develop the locale first?

At various points in OLPC's history, we have faced not only the challenge of adding a new locale to our existing software, but also that of defining the locale in the first place. This consists of detailing things like local time formats, currencies, alphabet, etc., and contributing the result to the glibc project. This will certainly require input from the region in question. This adds to my argument that language is a core part of the system and is not something we should try to bolt on. Locales in this state will often have no translations at all.

glibc locale archive

If we are to use a language pack to add a new locale to an existing installed system, some files on the root filesystem such as /usr/lib/locale/locale-archive will get modified.

When system files are modified in this way, the filesystem contents become out-of-sync with the contents manifest of the installed build. The pristineness is lost. If olpc-update is then used in future, the update process will take a lot longer, as the system will have to do a more exhaustive update after realising that the initial, incremental update failed.
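
To illustrate what losing pristineness means, the sketch below compares files on disk against a manifest of expected checksums. The manifest format used here (a dictionary mapping relative paths to SHA-256 hex digests) and the function names are assumptions made for illustration; this is not how olpc-update actually stores or checks its contents manifest.

```python
import hashlib
import os


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def non_pristine_files(root, manifest):
    """List manifest entries whose on-disk file is missing or modified.

    `manifest` is a hypothetical {relative_path: sha256_hex} mapping.
    """
    changed = []
    for relative_path, expected in sorted(manifest.items()):
        full_path = os.path.join(root, relative_path)
        if not os.path.isfile(full_path) or sha256_of(full_path) != expected:
            changed.append(relative_path)
    return changed
```

After a language pack has added a locale, a check of this kind would report usr/lib/locale/locale-archive as modified, which is the condition that pushes the updater onto its slower path.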

System translation issues

Which translations to ship?

Now that OLPC is shipping a reduced set of locales and we hence arrive at the idea of having a language pack capable of installing new locales, we are faced with the problem that any newly-installed locales would not have any system translations available. This is because they were stripped at build time. Therefore, any language pack which defines a new locale also needs to ship a set of system translations. Which system translations should be shipped?

One obvious candidate is Sugar, and the implementation is obvious: grab the latest from Pootle. But what else should be shipped - GTK+? GNOME? Etoys? Where would these translation files be sourced from, and how would this be implemented?

Where to install translations?

In the case where a language pack is used to ship updated translations, we are faced with a question of where they should be installed. If we overwrite translation files that came pre-installed, we will go out of sync with the contents manifest and lose pristineness (see above). We'd want to install them in another location, and make gettext use that other location as a priority before looking in the normal /usr/share/locale. Sayamindu investigated this in the past and concluded that this functionality is not currently available in gettext.
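
For programs written in Python, a prioritized lookup of this kind can be approximated at the application level, without changing gettext itself. The sketch below uses the standard library's gettext.find() to check a list of directories in priority order. The override directory is hypothetical, and this does nothing for C programs that call gettext directly, so it illustrates the desired behaviour rather than solving the problem.

```python
import gettext


def find_catalog(domain, language, search_dirs):
    """Return the first .mo file found for `domain`, or None.

    `search_dirs` is ordered by priority, for example a hypothetical
    override directory followed by /usr/share/locale.
    """
    for localedir in search_dirs:
        path = gettext.find(domain, localedir=localedir, languages=[language])
        if path is not None:
            return path
    return None
```

With search_dirs set to an override directory followed by /usr/share/locale, an updated catalog in the override directory is chosen when present, and the pre-installed one is used otherwise.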

Activity translation issues

Where to install translations?

Where should a language pack install its translations?

If it overwrites the translations installed at /home/olpc/Activities/Foo.activity/locale/, a variety of undesirable situations become possible, such as the following:

  • Deployment ships unmodified OLPC image pointing at OLPC's base set of activities
  • Deployment adds language pack with new translations, overwriting the translation shipped with the activity
  • OLPC performs a point-release of the activity where a single bug is fixed, activity group is updated with the new version
  • Deployment users run "software update" and their updated translations are overwritten with the old ones

One solution to this would be to make it possible to install activity translations somewhere separate, asking gettext to prefer translations from this alternative location over the ones installed by the activity (after modifying gettext to make this possible). But then we generate another set of undesirable situations, such as:

  • Deployment ships unmodified OLPC image pointing at OLPC's base set of activities
  • Deployment adds language pack with new translations, installing the new translations in a prioritized special location
  • New activity is released including various bug fixes and an even more extensive translation
  • Activity continues to use stale translations installed by the language pack
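
Both failure sequences can be reproduced with a toy model. In the sketch below, which is purely illustrative, a translation is reduced to a label and the two installation strategies are modelled as plain functions over a list of events. None of the names correspond to real Sugar or gettext interfaces.

```python
def overwrite_strategy(events):
    """Language pack and activity updates both write to the activity's own locale directory."""
    bundled = None
    for _source, translation in events:
        # Whoever writes last wins, regardless of which translation is newer.
        bundled = translation
    return bundled


def override_strategy(events):
    """Language pack writes to a separate location that is consulted first."""
    bundled = None
    override = None
    for source, translation in events:
        if source == "language_pack":
            override = translation
        else:
            bundled = translation
    # The override location always takes priority over the bundled files.
    return override if override is not None else bundled


# First sequence: a point release re-ships the old translation.
first = [("activity", "old"), ("language_pack", "updated"), ("activity", "old")]
# Second sequence: a new activity release ships an even better translation.
second = [("activity", "old"), ("language_pack", "updated"), ("activity", "newest")]

print(overwrite_strategy(first))   # the updated translation has been lost
print(override_strategy(second))   # the stale language pack translation still wins
```

Neither strategy picks the best available translation in both sequences, which is the core of the problem.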

Which translations to ship/install?

Which activity translations should be included in a language pack? All of the ones on Pootle? Just a select few?

If a language pack includes translations for an activity that is not installed, what should it do? Install the translation anyway (currently not possible, but any work on resolving the previous question might make it so)? Ignore and continue?
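
As one possible answer, the "ignore and continue" behaviour could look something like the sketch below. The pack layout (one subdirectory per activity bundle inside the language pack) and the function name are assumptions; the activities directory is the one mentioned earlier on this page.

```python
import os


def split_by_installed(pack_dir, activities_dir="/home/olpc/Activities"):
    """Split the pack's activity translations into installable and skipped.

    Assumes a hypothetical layout in which the pack contains one directory
    per activity bundle, e.g. <pack_dir>/Foo.activity/.
    """
    installable, skipped = [], []
    for bundle in sorted(os.listdir(pack_dir)):
        if os.path.isdir(os.path.join(activities_dir, bundle)):
            installable.append(bundle)
        else:
            skipped.append(bundle)
    return installable, skipped
```

Reporting the skipped bundles, rather than dropping them silently, would at least let a deployment see which translations never reached the laptop.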

My opinion

In my opinion, language is a key part of the system. Language is not something that can feasibly be bolted on afterwards. Even if we solved the above issues, the result would be messy. It would be duplicating various parts of the build and translation infrastructure outside of their respective homes.

OLPC already offers the clean solution to all of these problems. However, some improvement in OLPC's practice is called for to improve the customer experience. This is detailed below:

Existing customers want updated translations

One of the use cases for the earlier language pack implementation was that systems got installed with translations which have since been significantly updated/extended.

In the 8.2 days, OLPC planned for a new major release every 6 months. This possibility was lost as resources were downsized, but now appears possible again. With such release frequency, deployments and translators can simply get involved with the development cycles and perform their translations there. Once the release is made, it can be deployed, already including the latest translations. The deployment could be done automatically using olpc-update over the entire project. As translation efforts often take months, having a gap of 6 months between every translation update seems reasonable.

If 6 months is too long or other issues come into play, the customer could push all the new translations and request a minor release. OLPC has made minor releases entirely dedicated to localization in the past.

Usually the most critical translations to update are those that belong to activities. Those can be done completely regardless of other schedules: push the new translations, get a new release made, add that to the activity group, update the laptops with that version.

Existing customers want new locale/translations

I can't think of any examples, but there could undoubtedly be cases where a new deployment initially ships English-only laptops in its pilot stages, then wishes to add its own locale and translations. In this case, OLPC and the customer could develop any required locales, get them included in the next software release (expecting one every 6 months, or possibly doing a more immediate dedicated minor release for this customer), and then distribute them via olpc-update and Sugar activity updates to the existing laptops.

New customer wants new locale/translations

A common case is where a new customer orders laptops from OLPC and no deployment has been run in that country/language before, so the current OLPC build is unsatisfactory in this respect. glibc locales may need to be developed, and translations will need to be done (perhaps even from zero).

Deployments are never instantaneous: many months pass between deciding to implement the project, actually receiving the laptops, and being able to hand them out to children. So, OLPC should be able to communicate this at a very early stage to the new customer: get a technical team on the case, or sponsor OLPC to do this work for you, or perhaps OLPC would offer to do it for free based on collaboration with the local team. The work could thus start early enough to give OLPC or the local developers time to integrate everything for the next major release, or to produce a minor release targeted at that deployment.

In my experience, countries are happy to temporarily ship software entirely in English if there is a path to getting it localized in the short term. This issue is further diluted by the fact that new deployments often only deploy to a small number of users in the first few months, allowing even more time for the translation/localization work to be completed according to existing practices before the deployment expands beyond a small number of classrooms.

Summary

  • OLPC already has all the tools and processes to meet customers' needs, without requiring language packs, but some things could be improved:
  • Any relevant language-related considerations should be communicated to new customers at an early stage, providing ample time for the issues to be resolved
  • Further transparency in the Release Process will help deployments get involved, ensuring translations are good on release day and don't need to be heavily modified later
  • OLPC could probably do more to engage deployments in the software development process, in tune with the previous point
  • OLPC should encourage more use of olpc-update (perhaps calling for documentation improvements). I expect many customers don't realise that even today it can be used for pushing fully automated OS updates to all laptops in a deployment without having to touch any of the laptops.
  • The XS should offer an out-of-the-box theft deterrence server which automatically offers OS updates that have been installed (planned to be fixed for XS-0.7, I think), greatly reducing the difficulty of implementing the aforementioned fully automated olpc-update OS distribution system
  • Sugar's activity updater is good but the lack of a way to push automatic activity updates becomes more painful in light of the above