Projects/Automatic translation software

Moses toolkit

  http://www.statmt.org/moses

 1. Run a virtual machine of the OLPC (skip this step if you have an actual OLPC). I personally use VirtualBox (by Sun/Oracle; it is free and open source)
        http://www.virtualbox.org/
     Prepackaged VM files are located here
        http://dev.laptop.org/pub/virtualbox/
     I use version 656 because I have an OLPC of the same version, but take your pick.
 2. Download the model file onto the OLPC
        http://groups.inf.ed.ac.uk/hoang/hieu/olpc/en-ht.tgz
     and the decoder
        http://groups.inf.ed.ac.uk/hoang/hieu/olpc/moses

     This may be a bit tricky, as the Web browser on the OLPC takes some getting used to. Alternatively, install wget and download the files from a terminal
        su root
        yum install wget

 3. Make the moses decoder executable
        chmod +x moses

 4. Unpack the model archive and cd into the model directory
        tar zxf en-ht.tgz
        cd model

 5. Run the decoder and wait about 30 seconds
        ../moses -f moses.ini
     until this line appears:
        Created input-output object : [1.000] seconds

 6. Type in some English and watch it translate (here, into Haitian Creole)
          i am a doctor
          Translating: i am a doctor 

          reading bin ttable
          size of OFF_T 8
          binary phrasefile loaded, default OFF_T: -1
          Collecting options took 0.200 seconds
          Search took 0.280 seconds
          BEST TRANSLATION: mwen se yon doktè [1111]  [total=-0.520] <<0.000, -4.000, 0.000, -0.511, 0.000, 0.000, 0.000, 0.000, 0.000, -16.452, 0.000, -5.911, -0.693, -3.622, 1.000>>
          mwen se yon doktè 
          Translation took 0.280 seconds
          Finished translating
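
The decoder prints a lot of log lines around the actual translation. If you want to pick out just the result, the sketch below scans the output for the "BEST TRANSLATION" line shown above and returns the text and the total score. It assumes the line always has the layout seen in this one example (text, a coverage bitmap like [1111], then [total=...]); the function name parse_best_translation is made up for this illustration and is not part of Moses.

```python
import re

# Pattern inferred from the single sample line on this page (an assumption):
#   BEST TRANSLATION: <text> [<coverage bits>]  [total=<score>] <<...>>
BEST_LINE = re.compile(
    r"BEST TRANSLATION:\s*(?P<text>.*?)\s*\[[01]+\]\s*\[total=(?P<total>-?\d+(?:\.\d+)?)\]"
)


def parse_best_translation(log_text):
    """Return (translation, total_score) from decoder output, or None if absent."""
    for line in log_text.splitlines():
        match = BEST_LINE.search(line)
        if match:
            return match.group("text"), float(match.group("total"))
    return None
```

Feeding it the output from step 6 should give back the Haitian Creole sentence and the total score, without the timing and loading messages.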

  http://www.statmt.org/matrix/

 1. Create a client-server application that runs the resource-intensive decoder on a server. The client will be a Web browser or a custom Python app.
     - Skills required: Python, Apache, C++
 2. Fork the decoder source code to enable it to run on the OLPC. Minimize memory consumption and discard code not likely to be used by the application.
     - Skills required: C++
 3. Minimize the work the decoder has to do by using a greedy search instead of a beam search, or by using a very tight beam and other tight pruning thresholds.
     - Skills required: C++, statistical machine translation
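
To make the greedy versus beam trade-off in idea 3 concrete, here is a small sketch. It is not the Moses search (which is phrase-based, written in C++, and far more involved); it is a word-by-word toy with abstract tokens and invented scores, where a beam size of 1 is exactly greedy search. The names beam_search, TOY_OPTIONS and toy_transition are made up for this illustration.

```python
def beam_search(options, transition, beam_size):
    """Left-to-right search keeping only the beam_size best partial hypotheses.

    options: one list per source position of (target_token, log_score) pairs.
    transition: function (previous_token_or_None, token) -> log_score.
    Returns (tokens, total_log_score) of the best surviving hypothesis.
    """
    beam = [((), 0.0)]
    for candidates in options:
        expanded = []
        for tokens, score in beam:
            prev = tokens[-1] if tokens else None
            for token, token_score in candidates:
                new_score = score + token_score + transition(prev, token)
                expanded.append((tokens + (token,), new_score))
        # Prune: this is where a tight beam saves work and may lose the best path.
        expanded.sort(key=lambda hyp: hyp[1], reverse=True)
        beam = expanded[:beam_size]
    return beam[0]


# Invented toy data: "a" looks better at first, but only "b" combines well with "y".
TOY_OPTIONS = [
    [("a", -1.0), ("b", -1.5)],
    [("x", -1.0), ("y", -1.25)],
]


def toy_transition(prev, token):
    """Invented pairwise score: the pair ("b", "y") is free, any other pair costs 2."""
    if prev is None or (prev, token) == ("b", "y"):
        return 0.0
    return -2.0
```

With beam_size=1 the search commits to "a" and ends up with a worse total than with beam_size=2, which keeps "b" alive long enough to find the better pair. That is the price of the speed-up: less work per sentence, but search errors become more likely.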

 4. Different language pairs
 5. Speech-to-speech translation
 6. Integrating Optical Character Recognition (OCR) with translation
 7. Enable sharing of user vocabulary via the OLPC Mesh network
 8. Distributed training of data on the OLPC
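
For the client-server idea (idea 1 above), the rough shape could look like the sketch below. It uses Python's built-in HTTP server rather than Apache, purely to keep the example self-contained, and the translate function is a stand-in that just upper-cases its input; a real server would pass the text to the decoder instead. The /translate path, the JSON reply format and all function names are assumptions made for this illustration.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def translate(text):
    """Stand-in for the decoder (assumption): a real server would call Moses here."""
    return text.upper()


class TranslateHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The request body is the raw UTF-8 source sentence.
        length = int(self.headers.get("Content-Length", 0))
        source = self.rfile.read(length).decode("utf-8")
        reply = {"source": source, "translation": translate(source)}
        body = json.dumps(reply).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def start_server(port=0):
    """Start the server in a background thread; port 0 lets the OS pick a free port."""
    server = HTTPServer(("127.0.0.1", port), TranslateHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def request_translation(host, port, text):
    """Client side: send one sentence, return the translated string."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("POST", "/translate", body=text.encode("utf-8"),
                     headers={"Content-Type": "text/plain; charset=utf-8"})
        response = conn.getresponse()
        return json.loads(response.read().decode("utf-8"))["translation"]
    finally:
        conn.close()
```

The point of this split is that the OLPC only needs to run the small client function, while the memory-hungry model stays on the server.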

  http://www.statmt.org/moses/

  Moses Support

 430 MHz CPU (AMD Geode x86 processor)
 237 MB RAM
 1 GB flash disk
 Linux OS

Contents

Getting started

Mission Statement and Objectives

Project Ideas

Progress

12th March 2009

3rd May 2009

31st May 2009

12th July 2009

30th April 2010

Credits
