TinyLanguage

From OLPC

TinyLanguage is a comprehension-based language learning program for low-cost computing devices, including the XO. Scott Van Den Plas has submitted a Knight Foundation grant application for the project.

Why TinyLanguage?

Right now, English is the lingua franca of the Internet, but multilingualism is the future. English leads in available content, but the majority of Internet users either do not speak English or speak it as a second language. We are introducing connectivity to the third world, where a formal language program may or may not exist. From the Technology Trends section of this document (http://www.britishcouncil.org/learning-research-english-next.pdf):

  1. Technology is enabling new patterns of communication in ways which have implications for language patterns.
  2. Anglo-centric technological limitations are largely overcome, allowing practically any language or script to be used on the internet or in computer software.
  3. As English becomes used more widely as a language of international reach, a greater diversity of viewpoints are represented.
  4. Other world languages, such as Spanish, French and Arabic, are also being adopted by the new media.
  5. Lesser-used languages are flourishing on the internet.

So the question becomes: how can we reduce the language interference that is bound to become more of an issue? Put another way, how do we teach a large group of people a foreign language without knowing their native tongue?

Creating foreign-language instruction that is delivered in the student's native language is time-consuming and requires a teacher to be present for translation. It is possible to teach language another way, by focusing first on comprehension of the language (which is, incidentally, our primary goal). We can create a language game, populated with Creative Commons image and audio content, to provide fairly advanced language training.

Description

Imagine I show you a series of three pictures. The first picture is a cat about to jump onto a chair. The second shows the cat in midair, jumping onto the chair. The third shows the cat sitting on top of the chair after the jump. I tag these pictures with the following written and spoken phrases: "The cat will jump," "The cat is jumping," and "The cat has jumped." We show the student each picture with its associated phrase so they know which phrase goes with which picture. We then have the student match each phrase to its image. With this method we have demonstrated both the noun for cat and the verb for jump (as well as simple conjugation).
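The cat example can be sketched in a few lines of Python. This is only an illustration of the matching idea, not an existing TinyLanguage implementation: the names TaggedImage, build_round and score_round, and all of the file names, are made up for the sketch.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedImage:
    """One picture plus its written and spoken phrase (hypothetical fields)."""
    image_file: str
    phrase: str
    audio_file: str


# The three-picture cat sequence described above; file names are placeholders.
CAT_GAME = [
    TaggedImage("cat_before.png", "The cat will jump.", "cat_will_jump.ogg"),
    TaggedImage("cat_midair.png", "The cat is jumping.", "cat_is_jumping.ogg"),
    TaggedImage("cat_after.png", "The cat has jumped.", "cat_has_jumped.ogg"),
]


def build_round(items, rng=None):
    """Return the images in order plus a shuffled list of phrases to match."""
    rng = rng or random.Random()
    phrases = [item.phrase for item in items]
    rng.shuffle(phrases)
    return list(items), phrases


def score_round(items, answers):
    """Count the images the student paired with the correct phrase.

    `answers` maps an image file name to the phrase the student chose.
    """
    return sum(1 for item in items if answers.get(item.image_file) == item.phrase)
```

A front end would first show each picture with its phrase and audio, then call build_round to present the shuffled phrases and score_round to grade the student's choices.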

We can create a number of independent simple games like this and string them together. We then assign a rough difficulty value to each game. Based on how students perform on each individual game, we can adjust that game's difficulty value. As students do better within a difficulty range, we automatically move them up to more difficult games. Ideally, this would be done in English first and then extended to other languages. It would also be great to see the target countries turn around and contribute some of their own native language content.
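One way the difficulty adjustment and automatic promotion might work is sketched below. Everything specific here is an assumption made for illustration: the 0.0 to 1.0 difficulty scale, the moving-average update, the five-game window, the 80% promotion threshold, and the names Game, Student and next_games.

```python
from dataclasses import dataclass, field


@dataclass
class Game:
    name: str
    difficulty: float  # rough initial guess, 0.0 (easy) to 1.0 (hard)

    def record(self, passed, rate=0.1):
        """Nudge difficulty down after a success and up after a failure."""
        target = 0.0 if passed else 1.0
        self.difficulty += rate * (target - self.difficulty)


@dataclass
class Student:
    level: float = 0.0  # lower edge of the difficulty range offered
    recent: list = field(default_factory=list)

    def record(self, passed, window=5, promote_at=0.8, step=0.1):
        """Track recent results and move the student up when they do well."""
        self.recent.append(passed)
        self.recent = self.recent[-window:]
        if len(self.recent) == window and sum(self.recent) / window >= promote_at:
            self.level = min(1.0, self.level + step)
            self.recent.clear()


def next_games(games, student, band=0.2):
    """Games whose difficulty falls inside the student's current range."""
    low, high = student.level, student.level + band
    return [g for g in games if low <= g.difficulty <= high]
```

With this update rule a game's difficulty drifts toward the share of recent attempts that failed, so the initial rough value matters less as more students play.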

How to do it

So I want to build a massive library of Creative Commons images, grouped together and tagged with spoken and written phrases. Then I want to use this data to make a free and open language program targeting third world children.
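As a rough idea of what one entry in such a library could look like, here is a sketch of a record format. The field names, the language codes used as keys, and the JSON layout are all assumptions for illustration; no format has been decided.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Phrase:
    text: str        # written form
    audio_file: str  # recording of the spoken form


@dataclass
class LibraryImage:
    image_file: str
    license: str      # for example "CC BY-SA 3.0"
    attribution: str  # credit required by the license
    phrases: dict = field(default_factory=dict)  # language code -> Phrase


@dataclass
class ImageGroup:
    name: str
    images: list = field(default_factory=list)

    def languages(self):
        """Languages for which every image in the group has a phrase."""
        if not self.images:
            return set()
        return set.intersection(*(set(img.phrases) for img in self.images))

    def to_json(self):
        """Serialize the group so the data set can be shared separately from the program."""
        return json.dumps(asdict(self), ensure_ascii=False, indent=2)
```

Keeping the license and attribution on every image makes it possible to release the data set on its own, and the languages method shows which languages a group is ready to be played in.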

There are some interesting things we may be able to do from a computer science angle with this sort of image data mapped to all sorts of natural language descriptors. I also think there may be other ways to use this sort of system to provide an all-around education. The document referenced earlier has a section on content and language integrated learning that might be informative.

General principles

  1. Make the application free and open, but more importantly, make the supporting data set free and open. Use Creative Commons licensing to protect yourself, but do not restrict any projects cascading from this. We want to teach, first and foremost.
  2. Make it new and better; do not emulate the past. Think about ways to teach that have not been tried yet.
  3. Simplicity wins. Complexity in language is a massive entry barrier for a student. Make sure that the first activities students try do not turn them off.
  4. Target more than OLPC. We all love OLPC, but this is an emerging market. We do not know and cannot predict who will win this race, but it would be a mistake to write this kind of application for only one device.

Lastly, comprehension-based learning is terrific for teaching vocabulary, but terrible at teaching grammar and conversational interaction. The focus of something like this is strictly comprehension, the first step in learning a new language. You will not walk away from this program fluent in a foreign language, but you will walk away much more ready to learn. Understand first, then speak.

Similar projects

Interested people

  • Meg Welter from IMSA is doing research on the linguistics portion as part of an independent study.
  • Mel Chua would like to do a code sprint with other developers of this project, and can also help make a Mandarin version once the English version is done.