Pootle

From OLPC
Revision as of 11:51, 22 October 2007 by Xavi (talk | contribs) (setting up: tweaking text)
IRC #olpc-content

NOTICE 
The use of Pootle is currently under test and should by no means be taken or considered to be part of the Localization process for the XO.

These are the notes taken and rough sketches for the processes involved in the localization effort in order to use Pootle as a more liberal L10n platform.

There are several scenarios that depend on the roles and their associated responsibilities (ie: translators, coders, administrators, etc). Below we outline the two most important ones from the POV of the translators, which we could classify as the opportunistic translator (fixing a typo or translating a few missing strings) and the registered translator (who is somewhat more committed to the whole l10n reality).

If we consider that in the ultimate case children will be dealing with code (thus with gettext), we want things and processes to be as simple and straightforward as possible, so that in the end anybody will be able to translate or at least localize content to some degree. With this in mind, we aim for an environment that will:

  • allow anybody to make suggestions,
  • allow registered and trusted users to review and make translations, and
  • allow administrators to commit.

If the quality is not satisfactory, we can probably revert to more 'traditional' and bureaucratic structures; or try to develop tools that guarantee or hint appropriately without sacrificing the liberty and agility that will be required by children.

i18n & L10n

Most people will argue that localization (L10n / l10n) concerns and issues should be tackled way before coding is finished, and quite a few will categorically say that it will be too late by then. Avoiding the philosophical and formal aspects, and taking a more pragmatic approach, the truth is that L10n is more often than not an afterthought and internationalization (i18n) an annoying pre-requisite... that has to be done.

In the context of the OLPC XO software and activities, and for practical purposes, we will avoid the debate on exactly where L10n meets the code and start from the point where the output of i18n is used as the input for the l10n process. In other words, we'll elaborate based on the simplistic model that goes along these lines:

  1. generate POT from the source code (output of the i18n process)
  2. generate PO based on POT
  3. translate / populate the PO with locale specific content
  4. generate the final PO (or MO) (output of the l10n process)
  5. the executable code uses the PO (or MO) to communicate with the final user
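As a rough illustration of the model above, the steps map onto the GNU gettext command-line tools more or less as follows. This is only a sketch: the function name build_catalog, the activity.pot file name and the source/po directory arguments are made up for the example, and it assumes Python sources.

```shell
# build_catalog SRC_DIR PO_DIR LANG
# Hypothetical helper walking through steps 1 to 4 with the GNU gettext tools.
build_catalog() {
    src_dir=$1    # directory holding the Python sources
    po_dir=$2     # directory that receives the POT, PO and MO files
    lang=$3       # locale code, e.g. "es"

    mkdir -p "$po_dir" || return 1

    # 1. generate the POT from the source code (output of the i18n process)
    xgettext --language=Python --from-code=UTF-8 \
        --output="$po_dir/activity.pot" "$src_dir"/*.py || return 1

    # 2. generate the PO from the POT, or refresh it if it already exists
    if [ -f "$po_dir/$lang.po" ]; then
        msgmerge --update "$po_dir/$lang.po" "$po_dir/activity.pot" || return 1
    else
        msginit --no-translator --locale="$lang" \
            --input="$po_dir/activity.pot" \
            --output-file="$po_dir/$lang.po" || return 1
    fi

    # 3. translating the PO is the human step (in Pootle or off-line)

    # 4. compile the PO into the binary MO loaded at run time
    msgfmt --output-file="$po_dir/$lang.mo" "$po_dir/$lang.po"
}
```

Calling build_catalog src po es would leave po/activity.pot, po/es.po and po/es.mo behind; step 5 happens at run time, when the program loads the MO.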

Developers

Developers willing to read a bit more about i18n & L10n could probably profit from Wikipedia's overview, sourceforge's practical view, KDE i18n guide, GNU gettext manual and other sources.

Regardless of how much or how good the i18n effort is, the bottom line from Pootle's point of view is that it needs one (or more) .POT files. Needless to say, developers should be aware that localization is NOT just about strings, as it includes plurals, numbers, dates, currencies, text-flow, scripts, fonts... and that it needs to be tested (ie: pseudolocalization testing), which has to be considered in the overall development cycle. Also, if you want your activity to be properly localized, think about commenting your strings.

Translators

Being able to understand or grasp the intended meaning of some text in a language does not make you a translator, but "In the land of the blind, the one-eyed man is king." and all collaborations are welcome, although we also recommend reading some guidelines and documentation that may at least help you avoid some of the basic or typical pitfalls: L10n guide, translation guide, more links to come...

Basic Scenarios

Observations

  • You can download the PO, so there's no forcing of the on-line UI (although you need the upload right to inject it back).
  • In the translation interface, fuzzy entries are grayed out and there's a gray vertical line separating the terms.

Opportunistic translator

This user just wants to help. She/he doesn't want to get tangled in the administrative tasks. The only possible collaboration available is to suggest translations (which will be reviewed by users who have been granted the Review permission in a particular language).

After navigating through the olpc pootle server (project: olpc, language: spanish) to finally reach a file (ie: TamTamSynthLab), the interface will display a series of PO entries. One will have the focus (if it doesn't, hovering over one will make an 'Edit' link appear that enables it), showing an entry field and the following controls:


  • Back & Skip buttons: jump to the previous or next entry
  • Copy button: copies the original (msgid) value and continues in edit mode
  • Suggest button: the actual collaboration of suggesting a translation
  • Fuzzy checkbox: denotes that the suggested translation is/is not 100% trustworthy
  • Special characters specific to the language (see #Languages)
  • grow / shrink: allows growing and shrinking of the entry field (see #User options)
  • Translator comments field: comments either extracted from the source code, or added by other translators

The opportunistic translator then proceeds to navigate the file/s entering suggestions (to be processed later by the reviewers).

Admin notes 
The Suggest permission must be granted on a per-language-project basis.
Each language may have specific or special characters that may not be available on the user's keyboard, but can be provided for in the #Languages specification.

Registered translator

Except for the mandatory pre-condition to register, which enables the extra [Submit] button when translating, the overall process is quite similar to that of the #Opportunistic translator. On the other hand, several user-specific permissions may apply (ie: off-line translation) and the GUI will adapt and offer them. Also, as a registered translator, you may be assigned specific files or strings to translate and/or review, which allows better coordination of the overall effort.

Admin notes 
In a collaborative and low-entry-barrier process, the administrator/s should enable the default user to actually translate — Navigation: admin | projects | project_name | language | permissions +Translate.
The default #user permissions apply to any registered user (unless they are overridden in a case-by-case approach). In the out-of-the-box install they include: View, Suggest, Archive & Compile PO files

User Scenarios

Register as a translator

  1. Head towards the pootle site
  2. Follow the register link and fill in the following fields: Username, Password, Confirm password, Full Name & Email Address. Then click the [Register Account] button, which sends a confirmation message with the activation code to that email address.
  3. Following the link in the email will activate the newly created account. You will then be asked to log in and be taken to your user page.

User options

In your user page there's a link to Change options that will allow you to define some things:

  • Projects you wish to participate in (for the moment only olpc should be considered a valid option)
  • Languages you wish to collaborate in. If you don't find your language, as an administrator you can associate it to the project (see #Languages) or as a mere mortal, you should contact an administrator.
  • Other options like personal data & translation UI are present.

After configuring, don't forget to hit the [Save changes] button to make them effective, after which you can return to your #user page by means of the Home Page link.

User page

aka: 'Home', or 'My account' page.

You can reach your user page following the Home or My account link (depending on where you are) and it will show the appropriate links to the selected projects grouped by languages:

  • The language link takes you to the statistics page per project
  • The project-in-a-language link takes you to the statistics page displaying all its files

Advanced User Scenarios

Reviewer

A user who has been granted the Review permission may accept or reject the suggestions made by users (who must have the Suggest permission). The way to do this (is there another?) is to go to a language/project combination (ie: spanish-olpc) and follow the Show Editing Functions link. Here you have two alternatives: review the whole set of suggestions in the language-project set, or work on the suggestions present in a specific file (ie: all suggestions or TamTamSynthLab suggestions).

Once in the review UI, the reviewer has four options: Accept, Reject, Back or Skip, which are self-evident, but we will nevertheless describe them:

Accept: the reviewer accepts the suggestion, which is then registered
Reject: the suggested translation is rejected (does that mean erased for the user or the file?)
Back: goes to the previous suggestion (assume a previous non-accepted/rejected suggestion)
Skip: goes to the next available suggestion

In the case where multiple suggestions have been made, each one has its Accept & Reject buttons, while only the last will have the Back & Skip buttons.

Regardless of the multiplicity of suggestions, each suggested string will be diffed with the current string highlighting the changes. It also displays (if available) the name of the user that made the suggestion.

Editing functions

This section of the UI displays on a per-file basis several functionalities that depend on the permissions of the user:

Translate My Strings: enters the translate UI, limiting the visibility to those strings assigned to the user. (explore)
Quick Translate My Strings: same as above, unknown difference. (explore)
Quick Translate: enters the translate UI over the whole set? (explore)
Translate All: (explore)
PO file: used by off-line translators to retrieve the whole .PO file.
XLIFF file: used by off-line translators and interfaces to retrieve the whole file in said format. (explore)
Qt .ts file: (explore)
CSV file: (explore)

Show checks

This functionality is a real helper (at least for Latin scripts) as it runs a series of checks on the translations. Besides the obvious verification of the translated+fuzzy+untranslated counts, Pootle verifies:

  • simplecaps — for extra capital letters
  • startcaps — initial capital letter matches between source & translation
  • startpunc — if the original doesn't start with a letter, the translation probably shouldn't either
  • unchanged — the source & translation are the same
  • others!! — there are several others! Apparently, from traces in a somefile.po.stats the checks performed are:
    check-validchars, check-numbers, check-unchanged, check-doublespacing, check-purepunc, check-isreview, check-nplurals, check-brackets, blank, check-endpunc, check-xmltags, check-escapes, check-spellcheck, check-endwhitespace, check-functions, check-doublewords, check-singlequoting, check-simplecaps, check-blank, check-emails, check-startwhitespace, check-accelerators, check-long, check-musttranslatewords, has-suggestion, check-puncspacing, check-notranslatewords, check-variables, check-doublequoting, check-kdecomments, check-short, fuzzy, check-untranslated, check-simpleplurals, check-sentencecount, check-isfuzzy, check-startpunc, check-compendiumconflicts, check-tabs, check-newlines, check-urls, check-filepaths, check-startcaps, translated, check-printf, check-acronyms, and sourcewordcounts, targetwordcounts.

Obviously, these checks, as any automated language process, are to be taken as a guidance and not as a rule.
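To give an idea of how simple (and therefore how fallible) some of these checks are, here is a minimal sketch of three of them. These are our own illustrative re-implementations, not Pootle's code; the function names merely echo the check names, and each function returns 0 when the source/translation pair passes and 1 when it should be flagged.

```shell
# unchanged: flag a translation that is identical to its source.
check_unchanged() {
    [ "$1" != "$2" ]
}

# Helper: does the string start with an upper-case letter?
starts_upper() {
    case $1 in
        [[:upper:]]*) return 0 ;;
        *) return 1 ;;
    esac
}

# startcaps: source and translation should agree on starting with a capital.
check_startcaps() {
    if starts_upper "$1"; then
        starts_upper "$2"
    else
        ! starts_upper "$2"
    fi
}

# Helper: print the final character if it is punctuation, nothing otherwise.
last_punct() {
    case $1 in
        *[[:punct:]]) printf '%s' "$1" | tail -c 1 ;;
    esac
}

# endpunc: source and translation should end with the same punctuation (or none).
check_endpunc() {
    [ "$(last_punct "$1")" = "$(last_punct "$2")" ]
}
```

A Spanish question such as "¿Guardar?" translated from "Save?" would be flagged by this startcaps sketch even though it is perfectly correct, which is exactly why the results are hints for a reviewer and nothing more.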

Zip of folder

Actually, any 'grouping' of PO files may be downloaded as a ZIP file if the user has the archive right. In other words, you can download the files in a language, goal, file, etc. It may be possible that Pootle extracts and recombines from several files in order to provide the zip with say 'My Strings' for offline work.

Administrator Scenarios

Primary administration

This administration level refers to the process of ensuring a functional and operative interface between the development environment and the translating environment. In other words, it must cover the largest probable set of situations and describe the appropriate steps to cope with them. Some things that need to be defined before starting are:

Number of Pootle-projects. Options are:

  1. single 'mega-olpc' project where all d.l.o. projects are kept
  2. one-to-one mapping between a d.l.o & pootle project
  3. ad-hoc granularity in the mapping (ie: core, bundled, extras, prototypes, etc.)

Some of this granularity may be handled by the concept of goal in Pootle (but goals only work within a project+language context). On the other hand, although it is less likely, in a 'mega-olpc' we may be faced with name conflicts (ie: two or more projects using the same filename). This could be solved by adopting a standard naming convention within Pootle (ie: prefixing all Pootle files with d.l.o's path to the project).

Number of Pootle-languages. Options are:

  1. strict minimal number for the 'green' countries (ie: Amharic, Arabic, English, Spanish, French, Hausa, Hindi, Igbo, Nepali, Portuguese, Romanian, Russian, Kinyarwanda, Thai, Urdu, & Yoruba)
  2. the above plus certain 'typical' languages (ie: german, japanese, etc.) plus some 'red' or 'orange' languages
  3. totally free as long as a language administrator can be identified and named for each language

Notification channels and/or mechanisms:

  • developer's changes to the POTs (creation, changes and elimination) must reach the translating environment.
  • translator's work and generation of POs (and possibly MOs) must reach the development environment.

How? This could be performed through the filing of tickets, sending mails, scripts monitoring both sides, etc. And in the future it may be modified to include or consider the back-end integration to git. A clear primary channel must be determined. The periodicity and/or milestones for re-injecting the POs back into the development environment for their testing must also be determined.

d.l.o project | POT | PO | Actions
--- | --- | --- | ---
New | none | none | Open a d.l.o ticket requesting the missing POT file.
New | none | > 0 | PAD & developer should decide if a PO may be used as the basis to generate the POT. If appropriate, PAD creates the POT: switch to the 'new, POT > 0 and PO > 0' case below. If not, open a d.l.o ticket requesting the missing POT file; the developer eliminates the offending PO(s).
New | > 0 | none | PAD saves the POT in po/project/templates, performs an update from templates (in order to generate the required PO for each language) and notifies the LADs, who can then decide to include it into their goals, assign it, etc.
New | > 0 | > 0 | PAD processes the POT as above: switch to the 'old, POT none and PO > 0' case below.
Old | updated | none | PAD saves the POT in po/project/templates and performs an update from templates (in order to update existing POs). PAD notifies LADs of changes, as it may change the workload of goals and assignments.
Old | updated | > 0 | PAD saves the POT in po/project/templates and performs an update from templates (in order to update existing POs). PAD notifies LADs of changes, as it may change the workload of goals and assignments: switch to the 'old, POT none and PO > 0' case.
Old | none | none | Ignored: nothing to do.
Old | none | > 0 | LAD verifies that the PO matches the existing POT. If valid, uploads the PO through the web-GUI (either merging or overwriting the existing file).
Notes 
The columns POT & PO refer to the external or incoming files generated outside of Pootle.
If a POT exists in Pootle, the corresponding POs must exist for each language.

Terminology & roles used

POT & PO 
name given to the files depending on the purpose: Portable Object Template, for 'master copies'; Portable Object, for their localized versions (one per language). The naming convention would probably be "xx.po".
project administrator (PAD) 
person in charge of overseeing that a project has the correct (latest) set of POT files and coordinates with the language administrator in order to ensure that the PO files reflect the POTs. Has also the responsibility to decide which languages to use.
to be determined if there's a single PAD for all d.l.o projects or more. Every d.l.o project should have one PAD.
language administrator (LAD) 
person in charge of ensuring that PO files are available to be translated while taking the appropriate measures to guarantee that the POs reflect the latest POT, merging, reviewing, etc. the translating community effort. Can also assign tasks to specific registered translators.
There should be at least one LAD for each of the 'green' languages.
registered translator 
person actually performing the translation of a PO file within a specific project and language.
developer 
person that is (ultimately) responsible for providing POT files

Administration

As an administrator, in your home page you have access to the Admin page which offers: Users, Languages & Projects.

Users 
is a simple interface allowing the manual addition of users, the editing of their names & (invisible) passwords (ie: resetting them) and where you can activate, de-activate and remove a user.
Languages 
allows the maintenance of the list of available languages: the ISO 639 code, a descriptive text, the special characters (used in the translating UI), the number of plurals and its equation. Note: removing a language here would seem to affect only the ability to associate it to projects, with no apparent impact on the previously defined associations.
Projects 
this is the initial page where things start to come together. Besides being able to add a project by defining the values for the Project Code, Full Name, Project Description, Checker Style, File Type and Create MO Files fields, you can also Remove Project. The most mysterious parameter is Checker Style which offers Standard, creativecommons, kde, openoffice, mozilla and gnome as options.

The above covers the broad, high-level configuration, which must be followed by the project and language configuration:

Project languages 
A project needs to be informed of which languages it will have. This is accomplished by following the link of a specific project resulting in a page where you can add them. The only apparent way to remove a language is to delete its directory from outside of Pootle.
Project language permissions 
Each language mentioned above is a link that allows you to configure the #user permissions in said project-language combination. You can grant/remove specific rights to a particular user, that will only apply in said project+language context.
NOTES 
Several languages were removed from the initial (default) list with the intention to limit or focus on the core green languages: am, ar, en, es, fr, ha, hi, ig, ne, pt, ro, ru, rw, th, ur, & yo.
Still pending: Obtain and verify the plural formulas for all.
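For reference while verifying, the number of plurals and the equation are the two halves of the Plural-Forms header of a PO file (ie: nplurals=2; plural=(n != 1); for Spanish). The sketch below evaluates the formulas commonly published for three of the languages above so they can be sanity-checked against known numbers. The function name plural_index is made up for the example, and the formulas themselves should still be treated as assumptions until confirmed by each language administrator.

```shell
# plural_index LANG N
# Prints the 0-based plural form that applies to the number N.
plural_index() {
    n=$2
    case $1 in
        # nplurals=2: singular only for exactly 1
        es) echo $(( $n != 1 )) ;;
        # nplurals=2: 0 and 1 both take the singular
        fr) echo $(( $n > 1 )) ;;
        # nplurals=3: one / few / many
        ru) echo $(( $n % 10 == 1 && $n % 100 != 11 ? 0 : $n % 10 >= 2 && $n % 10 <= 4 && ($n % 100 < 10 || $n % 100 >= 20) ? 1 : 2 )) ;;
        *)  echo "no formula recorded for '$1'" >&2; return 1 ;;
    esac
}
```

For example, plural_index ru 21 selects the first form while plural_index ru 11 selects the third, which is the kind of edge case worth checking with a native speaker.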

Defining goals

see Pootle::Goals

Goals are required in order to be able to assign translations, thus enabling the organizing and prioritizing of work within a given language-project (iow, no global goals). Also worth noting, goals seem to be nothing more than a 'bundle name', with no other data like deadlines, comments or anything else.

In order to define a goal you must enable the Show Goals in the project page, and on the right, there's an entry field to add new.

  1. Define a goal: in the project page (ie: spanish-olpc), toggle Show Goals
    • by default a null-goal (ie: Not in a goal) exists as a catch all.
    • on the right, a box with an entry field allows you to give a name and Add Goal.
  2. Assign files to a goal
    • as we are trying to include files into a goal, click on the Not in a goal
    • make sure that Show Editing Functions is enabled, clicking on it if necessary
    • once in the language-project-goals view (showing the 'Not in a goal' files), each file has a pull-down list to pick and set the goal.

After setting the goals you can view the files bundled in a goal, and later assign work.

Assigning translations

see Pootle::Assigning

This is a nice administrative functionality that will actually (or probably) make the translator's life simpler. As an administrator you can assign certain users to goals and also to specific files (either to translate everything, unassigned, and a couple other variants).

By assigning users and goals/files, those users will then be able to use special links (ie: Translate My Strings or Quick Translate My Strings) that will feed them with their workload, and thus allow focus where it is needed for the project. Still, the interface and handling is not quite describable at this moment as many aspects come together and some things need to be better understood in order to get the most out of it (while trying not to complicate things too much).

how do you remove a user from a given goal?

Setting up version control systems

see Pootle::Version control

These are some notes of the trials, not actual documentation.

Things have been messier than expected. First of all, GIT is not supported in the stable version, and the newer (unstable) version is reorganizing several aspects of Pootle making it too risky to jump onto the bleeding edge... Thus the decision to backport the git support.

On top of this, although volunteer gnrfan joined the fray (doing the backport), our experience with git is way below average and we are totally in the dark about how Pootle actually obtains and stores the POT files. We seem to have found a way (using symlinks in the Pootle subdirectories pointing to the repository files) but this solution, although it may work, is a bit obscure and not even hinted at by the documentation of Pootle:

To have any sort of integration with version control from within Pootle, it is necessary to check out the translation files into their correct places in their Pootle projects. The CVS or SVN meta files (CVS/ or .svn/) need to be there. This has to be done outside of Pootle.

Our efforts to contact the Pootle community and elicit some sort of answers have not been very effective for the moment... but hope is not lost! :)

Fantasyland

The basic idea we are aiming for is the following:

We are assuming that the Pootle server has the ability to read & write throughout the xyzzy_project/po in dev.laptop.org through git.

The overall process would be:

Developers do their i18n part resulting in one or more .POT files generated in their respective /po directories (and in theory, forget all about l10n).

  1. First time-off (iow, a new project with a .POT)
    • Pootle would sync downloading the .POT and make it available for all languages to translate
    • Pootle makes the POT (now POs) available for each language, and they are translated
    • Pootle commits the (new) POs to the git server in d.l.o
  2. Updates in the i18n part — new versions of the POT
    • Pootle in its sync process should note the update, download the new version and modify accordingly all the POs and replace the local POT with the 'real' POT
    • Pootle would commit as usual
  3. Updates in the l10n part — d.l.o has a new version of some PO
    • Pootle syncs modifying the local PO using the d.l.o version as the latest valid version and reference, meaning that the local versions need to be corrected against it
      (The documentation mentions this is so, passing the (local) differences as 'suggestions'.)

The current problem we are facing is getting Pootle to read from git, iow, procuring the POTs! We haven't found a way (be it interface or code) to get the POTs. The update and commit are visible in the web GUI...

Pootle workflow as per #pootle channel

Pootle has some very nice features, but integration with the repositories is not really one of them, particularly the initial bootstrapping of a project. As extracted from a chat with friedel in IRC #pootle, the usual / standard workflow with no repositories is as follows:

  1. The POT file is manually injected into /po/project_name/templates
  2. Doing the Update from templates for a specific language in a project basically re-syncs the existing PO (doing a merge, and in the case of conflicts demoting them to suggestions; need to test) or makes the particular PO available for said language (in /po/project_name/lang_code).
  3. Originally, Pootle's part ended here, and the re-injection of the PO & MO files into the original project was left for the language or project coordinator to do.

With the inclusion of repositories, apparently the only part really integrated is the commit phase which would somehow trigger a push of the PO files. Unfortunately the 'pulling' from the repositories is not so automatic. Therefore, the probable (and suggested in IRC) workflow we'll implement requires manual intervention and/or the development of some scripts, and would probably look like this:

  1. Have a local git repository that syncs with d.l.o (as any other repository)
  2. Manually create the project
  3. Manually inject the POT file (most likely a symlink to the POT in the local repository)
  4. Update from templates would be performed for all languages
    This step would actually create (or sync) the PO files for all languages in the project together with some associated internal files to Pootle. The actual PO file should then be symlinked and added to the appropriate place in the (local) repository.
    Another possibility would be to create the project directory as a symlink into the repository (and probably use the .gitignore to filter out the Pootle files). This alternative (which could be simpler) seems to clash with the idea that all POT and PO files are stored in the /po directory in d.l.o because Pootle handles each language as a subdirectory (under the assumption that each project has several POT files, something that even Etoys has reverted and now has just one big POT).
  5. Translate at will & perform commits at will / as necessary, staying alert for:
    • If the POT changes along the way, this should be somehow noted and an Update from templates should be carried out.
    • If a new POT file (not an update) is generated, it must be manually injected together with the corresponding PO files per language.
    • If a POT is deprecated (eliminated) the appropriate local removal should ensue.

The manual nature of Pootle's repository integration is currently its weakest point, as it adds administrative overhead and coordination issues. Some of it could be eased by either developing some scripts or modifying Pootle itself (ie: the creation of a new PO file could be tweaked to automatically generate the symlink instead of the local file and issue the appropriate command to add it to the repository; or we could just settle for a simple symlink of the project's directory, so that all files would reside in the local repository). (Still to verify: any possible conflict in naming conventions of the target PO, particularly with t.fp.o.)
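One of the scripts mentioned could simply watch for POT changes and tell the administrator that an Update from templates is due. The following is a minimal sketch of that idea, not something we have running: the function name pot_changed and the state directory are made up, and it compares checksums while ignoring the POT-Creation-Date header line, which xgettext rewrites on every regeneration.

```shell
# pot_changed POT STATE_DIR
# Returns 0 (and records the new checksum) when the POT differs from the
# previous run, 1 when nothing changed, 2 when the POT cannot be read.
pot_changed() {
    pot=$1
    state_dir=$2
    stamp="$state_dir/$(basename "$pot").cksum"

    [ -r "$pot" ] || return 2
    mkdir -p "$state_dir" || return 2

    # Checksum of everything except the creation-date header line.
    new_sum=$(grep -v '^"POT-Creation-Date:' "$pot" | cksum)
    old_sum=$(cat "$stamp" 2>/dev/null || true)

    if [ "$new_sum" = "$old_sum" ]; then
        return 1
    fi
    printf '%s\n' "$new_sum" > "$stamp"
    return 0
}
```

Run periodically after syncing the local repository, something like pot_changed "$pot" /some/state/dir && echo "update from templates needed for $pot" would cover the first of the three cases above; new and removed POT files would still need their own handling.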

Twilight Zone

Music cues in... ti-ri ti-ri...
Testing setup 
Is based on a two-layered clone of d.l.o projects. The first clone acts as the pseudo-d.l.o, and the second clone (a clone of the first) is where Pootle files will be symlinked. Currently the first clone (a.k.a. the d.l.o-clone) is in Rafael's home dir, and the second clone (a.k.a. pootle-clone) in Xavi's home dir. As you can't actually clone d.l.o, each project is cloned individually. The way each project is cloned into the pootle-clone git is by first cloning from d.l.o:
#
# For every project in d.l.o (or for testing)
#
cd /home/rafael                                   # as user rafael
git clone git://dev.laptop.org/git/project_name   # could be manual or by scripts (clone.sh)
#
cd /home/xavi/clone2                              # as user xavi
git clone /home/rafael/project_name project_name
see git-clone doc
NOTE: we used the --local argument (really a no-op), but still got a message about not being able to use hardlinks or something.
Hooking Pootle into the pootle-clone 
This is a manual step involving the symlinking of the appropriate POT file(s) and any previously existing POs.
cd /var/lib/pootle                                  # home podirectory specified in /etc/pootle/pootle.prefs
cd olpctest/templates                               # a test directory for OLPC
ln -s ~/clone2/project_name/project.pot unique.pot  # symlinking the POT file and associating a unique name within Pootle
#
# Enter the GUI and in the project languages perform an update from templates
# this will create the appropriate PO files for each language in 
# the project_name/xx directory (where the xx is the ISO 639 language code)
#
#
# For each PO that didn't exist in a language in the project:
# (move Pootle's new PO into the clone and leave a symlink in its place)
#
cd /var/lib/pootle/olpctest/xx
mv unique.po ~/clone2/project_name/po/xx.po
ln -s ~/clone2/project_name/po/xx.po unique.po
#
# If a given language PO existed in the project:
# (and you want to keep it, so Pootle's new PO is discarded)
#
cd /var/lib/pootle/olpctest/xx
rm unique.po
ln -s ~/clone2/project_name/po/xx.po unique.po
Problem with the setup 
Theoretically this should be it, and Pootle would be able to identify the symlinked files as belonging to the git version control system.
In practice, Pootle is verifying at the directory level if something is in a version control system, which in our case fails miserably due to the fact that we are symlinking individual files! The reason that we are symlinking files instead of directories is that the structures differ:
Pootle directories:
/var/lib/pootle/olpctest
  /templates
     /project_name_1.pot
     /project_name_2.pot
  /es
     /project_name_1.po
     /project_name_2.po
  /pt
     /project_name_1.po
     /project_name_2.po
OLPC git directories:
/d.l.o-clone
  /project_name_1
     /po
        /project_name_1.pot
        /es.po
        /pt.po
  /project_name_2
     /po
        /project_name_2.pot
        /es.po
        /pt.po
git integration attempt:
/var/lib/pootle/olpctest
  /templates
     /symlink project_name_1.pot -> /d.l.o-clone/project_name_1/po/project_name_1.pot
     /symlink project_name_2.pot -> /d.l.o-clone/project_name_2/po/project_name_2.pot
  /es
     /symlink project_name_1.po -> /d.l.o-clone/project_name_1/po/es.po
     /symlink project_name_2.po -> /d.l.o-clone/project_name_2/po/es.po
  /pt
     /symlink project_name_1.po -> /d.l.o-clone/project_name_1/po/pt.po
     /symlink project_name_2.po -> /d.l.o-clone/project_name_2/po/pt.po

This approach FAILS because the current git-support backport of Pootle verifies whether the directory belongs to a version control system, and this is not our case, as the /es, /pt, etc. directories are clearly and purposefully left in the Pootle hierarchy.
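For the record, the mapping between the two layouts is mechanical enough to script. The sketch below generates the symlink layout attempted above for one project; the function name link_project is made up, both directories are expected as absolute paths, and it does nothing about the directory-level version control check, so it only saves typing.

```shell
# link_project CLONE_DIR POOTLE_DIR PROJECT
# Mirrors CLONE_DIR/PROJECT/po/PROJECT.pot and CLONE_DIR/PROJECT/po/xx.po
# into POOTLE_DIR/templates/PROJECT.pot and POOTLE_DIR/xx/PROJECT.po as symlinks.
link_project() {
    clone_dir=$1
    pootle_dir=$2
    project=$3
    po_src="$clone_dir/$project/po"

    # Without a POT there is nothing for Pootle to work from.
    [ -f "$po_src/$project.pot" ] || return 1

    mkdir -p "$pootle_dir/templates" || return 1
    ln -sf "$po_src/$project.pot" "$pootle_dir/templates/$project.pot" || return 1

    for po in "$po_src"/*.po; do
        [ -f "$po" ] || continue        # no PO files yet: the glob stayed literal
        lang=$(basename "$po" .po)      # es.po -> es
        mkdir -p "$pootle_dir/$lang" || return 1
        ln -sf "$po" "$pootle_dir/$lang/$project.po" || return 1
    done
    return 0
}
```

With the trees shown above, link_project /d.l.o-clone /var/lib/pootle/olpctest project_name_1 would create the project_name_1 entries of the listing.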

Apparently, the layout of the OLPC's git follows what the Pootle documentation refers to as a GNU layout (in the /etc/pootle/pootle.prefs file):

 # pootle.podirectory
 #
 # All projects are stored in this directory in this layout:
 #   $podirectory/$project/$language
 # Projects can also be stored according to the GNU convention with one PO file
 # per language, and all files in one directory.
 podirectory = "/var/lib/pootle/"

We haven't managed to determine when, where and how this can be controlled, but it may be just the ticket that we are looking for.

Defining & using terminology

see Pootle::Terminology
see Pootle::Matching
see Sourceforge.net::Creating glossaries

This is an extremely helpful functionality, particularly if we want to open the translation process to non-professional and #Opportunistic translators because it aids them in preserving the terminology that has been decided upon by the more involved segments of the community (ie: will the olpc.es translate "cat" as "gato" or "mish"?)

The structure of the terminology file is a standard .PO file (ie: msgid / msgstr). And (apparently) many could co-exist (ie: color.po, hig.po, etc.). There are ways (need to explore) to override the default terminology searches per project.
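Since a terminology file is just a PO catalog, a glossary can be as small as the one written by the sketch below, which also shows a naive lookup of a term. The function names and the single entry (taken from the "cat" example above) are illustrative only; this says nothing about where Pootle expects the file to live, and the lookup only handles simple single-line entries.

```shell
# write_glossary FILE
# Writes a one-entry terminology file in plain PO form.
write_glossary() {
    cat > "$1" <<'EOF'
msgid "cat"
msgstr "gato"
EOF
}

# lookup_term FILE TERM
# Prints the msgstr recorded for TERM, or nothing if the term is absent.
lookup_term() {
    awk -v term="$2" '
        $0 == "msgid \"" term "\"" { found = 1; next }
        found && /^msgstr "/ {
            sub(/^msgstr "/, "")
            sub(/"$/, "")
            print
            exit
        }
        { found = 0 }
    ' "$1"
}
```

After write_glossary terms.po, lookup_term terms.po cat prints the agreed translation, while a term that has not been decided upon prints nothing.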

Although the documentation is not crystal clear on where and how these terminology files would go, a simple test has been made and works.

The steps, as an administrator, were:

  1. Enable the languages for which you want to have terminology:
  2. Join the project

You are 'done'. When later translating anything in the 'spanish' branch, the xavitest.po will be used to propose translations in a dynamic way.

really done? Need to verify process

Taking advantage of translation memory

see Pootle::TM

Very much like terminology, but instead of working at the word level, the matching is performed on the whole string, proposing complete translations that have already been used. This saves some time but, more importantly, provides some level of consistency between translations. On the other hand, instead of being a dynamic feature, this is more of a batch or static process. Starting from a base translation memory file, one may generate suggestions for a specific file or a whole set of them.
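The whole-string matching can be pictured with the following sketch. It uses Python's difflib as a stand-in similarity measure, which is an assumption made for illustration and not necessarily what Pootle uses; suggest_from_memory, the cutoff value and the sample memory are all hypothetical.

```python
import difflib

def suggest_from_memory(source, memory, cutoff=0.7):
    """Return (similarity, known_source, translation) for the best match, or None.

    `memory` maps already translated source strings to their translations.
    """
    best = None
    for known_source, translation in memory.items():
        ratio = difflib.SequenceMatcher(None, source, known_source).ratio()
        if ratio >= cutoff and (best is None or ratio > best[0]):
            best = (ratio, known_source, translation)
    return best
```

For example, with a memory holding "Save the file" translated as "Guardar el archivo", the new string "Save this file" is close enough to get that translation proposed, while an unrelated string gets nothing.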

pre-test notes 
These are some ideas, doubts and why-not stuff that came up while reading the documentation.
File per file is probably not desirable (ie: use one PO to generate a memory for another PO), so some sort of 'olpc' translation memory file (per language) should be made.
How do you generate the initial all-encompassing tm file?
Every time a new language or PO is added, the associated memories for them (as targets) should be (automatically?) created...
Updating a POT (iow, a new version) should trigger the update of all languages.
The updating (which is not done in real-time) would still need to be done in a reasonable time period.

User permissions

see Pootle::Permissions

Permissions are granted or revoked within a specific language-project pair, and optionally, within that scope, to specific users. This means that the nobody & default users have local, instead of global, behaviors.

The following is the list of the available permissions handled by Pootle. The descriptions are the result of observation and deduction (we haven't found specific documentation on them), so they are to be considered as carrying a "#, fuzzy" tag...

NOTE: The system administrator flag is not reachable through the GUI, as it resides as a flag (rights.siteadmin) in the user.prefs file. This permission allows the users possessing it to administrate all the projects, languages and functionality (you should always have one user with it).

Permission | Description | Default users | Comments
View | Allows browsing the PO files and their translations. | nobody, default |
Suggest | Allows suggesting a translation. | default | This should be enabled for nobody if we want to make things simpler for the #Opportunistic translator.
Translate | Allows submitting a translation. | none | By default no users are allowed to translate, forcing the administrator to grant this right: bureaucratic and restrictive.
Overwrite | When uploading a PO file, allows the user to overwrite any existing file (iow, no merge of changes). | none | Handle with care.
Review | Allows approving/rejecting suggestions made by users capable of suggesting translations. | none | see #Reviewer
Archive | Would allow the user to download a set of PO files in zip format, probably those of a language/project. | none |
Compile PO files | Allows a user to generate / compile the PO into MO files. | none | It is unknown where the MO files will reside, or how they will be transferred or made available. need to explore
Assign | Allows a user to assign files (or chunks of files) amongst users that have been granted the translate permission. | none | need to explore
Administrate | Its granularity is not well defined (or understood): administrate everything (seems like it), or just a project? or just a language? or just a language-in-a-project? | none | need to explore
Commit | The holy grail or point of all this: make a translation available. | none | Again, the granularity is not well defined or understood. need to explore
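The way rights resolve inside one language-project pair can be sketched like this. It is a guess based on the observations above, not Pootle's code: rights_for and the user name maria are hypothetical, and the sketch assumes that a user-specific entry replaces the default entry rather than adding to it.

```python
# Defaults as listed in the table above; everything else starts out empty.
DEFAULT_RIGHTS = {
    "nobody": {"view"},              # anonymous visitors
    "default": {"view", "suggest"},  # any logged-in user
}

def rights_for(user, pair_rights):
    """Rights of `user` inside one language-project pair.

    `user` is None for an anonymous visitor. Assumption: a user-specific
    entry replaces the 'default' entry instead of being merged with it.
    """
    if user is None:
        return pair_rights.get("nobody", set())
    return pair_rights.get(user, pair_rights.get("default", set()))
```

Since each language-project pair carries its own dictionary, granting maria the translate right in es/olpc gives her nothing extra in pt/olpc, which is the local (instead of global) behavior described above.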

Advanced site administration

Lowering latency through web servers 
see Pootle::Apache
see Pootle::NginX
Translator statistics 
see Pootle::LogStats
Configurable logos 
see Pootle::Changes
Removing / Renaming files 
According to user friedel in IRC#pootle, you can 'safely' delete a file (and associates—ie: xyzzy.es.po.pending) without much to worry about. This has to be done outside of Pootle.

File structure

F/D+Type | File/dir | Purpose | Notes
D-auto | .../po | root of the Pootle files |
D-auto | .../po/terminology/xx | the directory where terminology files are stored on a per language basis (normal, but minimalistic, PO files). | Given that POTs are manually injected and updated, maybe the terminology directory is polymorphic, so a templates language directory could be used, which would allow system-wide terminology.
D-gui | .../po/project | each 'project' has its own root |
D-manual | .../po/project/templates | the directory where the POTs associated with the project must go (not accessible through the web-GUI) |
D-gui | .../po/project/xx | within each 'project' a directory is set up for each language (using ISO 639 coding). | Created when a language is added to the project (unknown how to remove it; probably just deletion).
F-auto | .../po/project/xx/pootle-project-xx.prefs | for each [language+project], holds the rights (for nobody, default & specific users) and the goals (name and list of files). |
F-auto | .../po/project/xx/pootle-project-xx.stats | holds for each PO file six numbers (presumably translated words, translated strings, non-translated words, non-translated strings, total words, and total strings). |
F-gui/rcs | somefile.po | the actual PO file. |
F-auto | somefile.po.pending | if suggestions have been made, they are stored here in a pseudo-PO format that modifies the msgid (appends a "_: suggested by username\n" line). | See #Bug dealing with Suggestions.
F-auto | somefile.po.stats | statistics about the internals of the file, apparently only about the checks and their results. |
F-manual | somefile.po.tm | the results of applying a translation memory to that file, presumably the suggested string translations. |
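When a PO file has to be deleted or renamed outside of Pootle, its companion files should follow. A tiny sketch of the naming, based only on the suffixes observed in the table above (companion_files is a hypothetical helper, and other companions may exist):

```python
# Suffixes observed next to a PO file in the Pootle tree.
COMPANION_SUFFIXES = (".pending", ".stats", ".tm")

def companion_files(po_path):
    """Paths of the per-file companions Pootle may keep next to `po_path`."""
    return [po_path + suffix for suffix in COMPANION_SUFFIXES]
```

For xyzzy.es.po this lists xyzzy.es.po.pending, xyzzy.es.po.stats and xyzzy.es.po.tm; not all of them necessarily exist for a given file.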

To Do

Usage

  • Merge of uploaded PO
  • Check whether there is any kind of verification when a PO file is uploaded to ensure the up-to-dateness of the original POT on which it is based (ie: a PO is uploaded but the POT on which it is based is an older version)
  • Update from templates (this needs the git interface on one side, but also if a POT is updated by whichever means, the PO should be merged/solved)
  • Define a workflow! — the #Basic Scenarios above just cover the translator part. After the git interface is tested, the handling of the developer input, and the Pootle output must be defined.
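One possible shape for the up-to-dateness check is to compare the POT-Creation-Date header field of the uploaded PO with that of the current POT. This is only a sketch of the idea, not something Pootle is known to do: it assumes both files carry the standard gettext header in the usual YYYY-MM-DD HH:MM+ZZZZ format, and pot_creation_date and is_stale are hypothetical names.

```python
import re
from datetime import datetime

# In the file the header line reads: "POT-Creation-Date: 2007-10-22 11:51+0000\n"
HEADER_FIELD = re.compile(r'"POT-Creation-Date: (.+?)\\n"')

def pot_creation_date(po_text):
    """Parse the POT-Creation-Date header field, or return None if absent."""
    match = HEADER_FIELD.search(po_text)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y-%m-%d %H:%M%z")

def is_stale(po_text, pot_text):
    """True when the PO was generated from an older POT than the current one."""
    po_date, pot_date = pot_creation_date(po_text), pot_creation_date(pot_text)
    if po_date is None or pot_date is None:
        return False  # cannot tell; a stricter check might reject instead
    return po_date < pot_date
```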

Config

  • Interface with GIT — both on importing POTs & POs from it & exporting POs back to it.
  • For completion's sake, language plurals & associated formulas must be defined (a bit pointless if developers don't actually do i18n right though, but...)
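As a reminder of what those formulas look like, here are the Plural-Forms headers commonly used with gettext for three of the languages we handle, together with a plain Python transcription. The header strings follow the examples in the gettext manual; plural_index is a hypothetical helper written for illustration, and each language should still be checked by its translators.

```python
# gettext Plural-Forms headers for three of the initial languages.
PLURAL_FORMS = {
    "es": "nplurals=2; plural=(n != 1);",
    "fr": "nplurals=2; plural=(n > 1);",
    "ru": "nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 : "
          "n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2);",
}

def plural_index(language, n):
    """Python transcription of the formulas above: which msgstr[i] to use."""
    if language == "es":
        return int(n != 1)
    if language == "fr":
        return int(n > 1)
    if language == "ru":
        if n % 10 == 1 and n % 100 != 11:
            return 0
        if 2 <= n % 10 <= 4 and (n % 100 < 10 or n % 100 >= 20):
            return 1
        return 2
    raise KeyError(language)
```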

setting up

Please note that we are grouping the projects found in d.l.o into two base projects for localization: core & bundled. Other projects that are not found in the builds will be decided case by case as to where to include them (either in an independent pootle-project, or grouped with other 'extras'). Given this grouping, we must ensure that no files will conflict in their naming, so the first approach is to prefix their POT filenames with their project name, ie: the Chat.pot of the chat-activity will be chat-activity.Chat.pot in Pootle. This in turn means that all the .PO files will be named chat-activity.Chat.po (but in different per-language subdirectories).
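The naming rule can be written down as a small sketch (pootle_filename and pootle_po_name are hypothetical helper names, not part of Pootle):

```python
import posixpath

def pootle_filename(git_project, pot_path):
    """Prefix the basename of a POT file with its git project name."""
    return "%s.%s" % (git_project, posixpath.basename(pot_path))

def pootle_po_name(git_project, pot_path):
    """Name shared by the PO files in every per-language subdirectory."""
    stem, _ext = posixpath.splitext(pootle_filename(git_project, pot_path))
    return stem + ".po"
```

So po/Chat.pot from chat-activity becomes chat-activity.Chat.pot in the templates directory, and chat-activity.Chat.po under each language.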

  1. First obtain a local clone, on the Pootle server machine, of the project(s) that need to be included (see Pootle/Files for some notes).
    • Create the appropriate pootle-project for the git-project
    • Copy the .POT file from the git clone into the /templates directory of the pootle-project
      (may need to create this directory as a subdirectory of the pootle-project directory)
    • Associate the initial languages (am, ar, en, es, fr, ha, hi, ig, ne, pt, ro, ru, rw, th, ur, & yo)
    • Select the languages and perform an Update from templates in order to generate the initial (empty) PO files for each language
  2. If the project includes pre-existing PO files, they need to be merged and included:
    1. merge previous work by executing
      msgcat --use-first -o /__git_path__/xx_YY.po /var/lib/pootle/__project__/xx_YY/__project__.__foo__.po /__git_path__/xx_YY.po
      this takes the initial clean Pootle generated PO file and populates it with the msgstr from the git PO file; while it also ensures that Pootle's basic and standard header is preserved (more specifically the timestamp of the source POT file)
    2. copy the file back into Pootle with:
      cp /__git_path__/xx_YY.po /var/lib/pootle/__project__/xx_YY/__project__.__foo__.po
    3. delete (the now obsolete) Pootle statistics (that will be re-generated on-demand):
      rm /var/lib/pootle/__project__/xx_YY/__project__.__foo__.po.stats
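Since the three commands of step 2 have to be repeated for every language, they can be generated rather than typed. A sketch, where merge_commands is a hypothetical helper and the paths are the same placeholders used above:

```python
def merge_commands(git_po, pootle_po):
    """Return the argv lists for the merge, copy-back and stats-removal steps."""
    return [
        # Pootle's PO comes first so its header wins; the msgstr come from git's PO.
        ["msgcat", "--use-first", "-o", git_po, pootle_po, git_po],
        # Put the merged result back where Pootle expects it.
        ["cp", git_po, pootle_po],
        # Drop the obsolete statistics; Pootle re-generates them on demand.
        ["rm", pootle_po + ".stats"],
    ]
```

Each list can then be handed to subprocess.check_call in order, stopping at the first failure so that a broken merge is never copied back into Pootle.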

Done

  • user creation (both through the admin & registration interface that uses confirmation codes via mail)
  • add / remove languages (basically reduced the set to green countries languages in order to focus attention)
  • add project (only one, for 'olpc'); deleting is performed through the GUI.
  • associate languages & projects
  • define permissions for nobody, default and specific user
  • upload PO file (note: it doesn't verify against the templates, so you may end up uploading anything anywhere... handle with care)
  • translate on line — may be a bit slow although it could be the browser, connection or server (see #Advanced site administration)
  • used the suggest functionality, hit bug with review (see #Bug dealing with Suggestions)
  • Define goals
  • Assign translations
  • Delete files (this is done outside of Pootle directly on them... some things will protest, but is the 'accepted procedure')

Glitches

  • Something related to encodings was wrong during the setup
    Alfonso commented several language specs that apparently were iso8859 instead of UTF ??

Bug dealing with Suggestions

There seems to be something awry with the suggestions... somehow there seems to be a mismatch between the msgid for which the suggestion was made, and the msgid displayed in the review process. IOW, you suggest foo as a translation of bar, but in the review process it will show it as a suggestion for xyzzy!! review problem!

Talking in IRC#pootle, the following emerged: the mechanism used to associate a given suggestion with its string is based on the #: comments (usually documenting the source code line where the string is located). This mechanism fails when more than one string is extracted from the same source line.
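The failure mode is easy to reproduce in a few lines. This is a sketch of the principle only, not Pootle's code: the location chat.py:42 and the function names are made up, and which of the colliding strings actually wins in Pootle is unknown.

```python
ENTRIES = [
    # (location comment, msgid): two strings extracted from the same source line
    ("#: chat.py:42", "bar"),
    ("#: chat.py:42", "xyzzy"),
]

def index_by_location(entries):
    """Index msgids by their '#:' comment, as the suggestion lookup seems to do."""
    index = {}
    for location, msgid in entries:
        index[location] = msgid  # a later entry silently replaces an earlier one
    return index

def msgid_for_suggestion(location, entries):
    """msgid a reviewer would be shown for a suggestion stored under `location`."""
    return index_by_location(entries)[location]
```

A suggestion made for bar is stored under "#: chat.py:42", but looking that location up again returns xyzzy, which is exactly the mismatch seen in the review screen.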

Possible solutions & workarounds 
the problem is relatively serious as it inhibits the #Opportunistic translator (or any other translator) from making suggestions (as they may get lost).
  • modify the source code to avoid having more than one gettext string per line.
  • modify the POT to avoid duplicate comments (iow, a manual process)
  • disable the suggestions
    • explore the possibility of having the offending POT in another project where suggestions are not allowed (thus preventing reviewers from getting bitten by it).
  • patch Pootle (or file a bug)

See also