Dictionaries and Copyright
It looks as if I will be the first to post on this talk page. This is appropriate as the subject is quite fundamental to the OLPC dictionary project and where it is (or is not) heading.
The list of dictionaries available for StarDict is quite short. This is roughly the list of dictionaries that have been available in the Free Software world for the last decade or so. Why is it that this list has not grown over the last decade?
As I brought up on IRC, no current copyright law could govern a work created before 1900, and there is little or no doubt that nearly every language in existence had dictionaries before then. The first Chinese dictionaries were created around 100 AD. European dictionaries started appearing some 300 to 500 years ago. During the 1800s, missionaries compiled dictionaries of many, many languages. Copyright was not automagic back then, either.
The only reasonable conclusion is that there are Public Domain dictionaries in nearly every conceivable language, and that these dictionaries merely need to be digitized in order to be used. Many of these dictionaries use the Latin alphabet which is well supported by many Free and Non-Free OCR programs. Why are we, then, limited to a few dictionaries in English and wordlists of similar terms in a handful of languages?
I hope, for the sake of the OLPC project, that this issue can be quickly resolved.
-- LuYu, February 2008
An interesting feature would be to link connotations of a word A to specific connotations of another word B in the same language or in another language. Many dictionaries instead leave the user with a superset of the connotations of all possible translations, which has to be reduced by ruling out false translations. To be able to link to a specific connotation by name, connotations would have to be named and (preferably) allow markup for existing and non-existing links, like internal Wikipedia links. [OmegaWiki] --Fasten 07:09, 29 February 2008 (EST)
A Wikifier for Browse could cooperate with the dictionary software and collect a database of personal vocabulary. The Browse application could also highlight all previously memorized words on a page, which would let a reader check whether a word should already be known before looking it up. The database could also store the occasion (document, page) where a word had (first) been looked up, which might help to memorize the word based on the context. --Fasten 07:36, 29 February 2008 (EST)
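As a rough illustration of the personal vocabulary database described above, here is a minimal sketch that keeps one tab-separated line per word (word, document, page) in a flat file. The file location, the function names, and the storage format are all assumptions made for illustration; nothing like this exists in Browse or the dictionary software as far as this page shows.

```shell
# Hypothetical store: one tab-separated line per word: word, document, page.
VOCAB_FILE=${VOCAB_FILE:-"$HOME/.vocabulary.tsv"}

# vocab_known WORD: succeed if WORD has been looked up before.
vocab_known() {
    [ -f "$VOCAB_FILE" ] && cut -f1 "$VOCAB_FILE" | grep -qxF -- "$1"
}

# vocab_add WORD DOCUMENT PAGE: record only the first lookup of WORD.
vocab_add() {
    vocab_known "$1" || printf '%s\t%s\t%s\n' "$1" "$2" "$3" >> "$VOCAB_FILE"
}

# vocab_context WORD: print where WORD was first looked up.
vocab_context() {
    awk -F '\t' -v w="$1" '$1 == w { print $2 ", page " $3 }' "$VOCAB_FILE"
}
```

A highlighter could call vocab_known for each word on a page, and vocab_context could supply the reminder of where the word was first met.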
The Freelang dictionary is a collection of dictionaries created by Freelang. Its license allows redistribution or modification, but not both, and is phrased (in English) in terms of French copyright law.
IANAL, but here are some snippets from the license and some opinions to go along with them:
The lists of words are the property of their authors, they are not in the public domain.
The totality of the pages constituting the Freelang site (http://www.freelang.net) is the original work of Beaumont, with the exception of the contributions made available by users, which remain their property.
Article 2. The Freelang dictionary is a freeware program. The program and the lists remain the property of their authors, and therefore must not be considered to be in the public domain. The special feature of the Freelang dictionary is that any user may become an author of or contributor to a list, and obtain rights to this list.
Freelang is attempting to claim that the authors of the wordlists "own" the wordlists. Evidently, some sort of attribution is given in the .exe archives as there is no evidence of it on the website itself. This means, in theory at least, that every wordlist author would have to be contacted and asked to contribute their work to the OLPC dictionary project.
The program and the lists of words must however be considered as an unbreakable totality, inasmuch as the program is necessary for the display of the lists, and the lists are necessary for the use of the program.
Here, they are claiming something very strange. It appears that they are saying even though the individual authors are the "owners" of the wordlists, the wordlists are an inseparable part of the Freelang program and therefore subject to their rules. This most likely has to be one way or the other: the wordlists are under the copyright jurisdiction of their authors or of Freelang. If the wordlist creators have the copyright, Freelang has no rights whatsoever over their works. If on the other hand the wordlists which were generated by the Freelang program are considered to be extensions of the program, the authors have no copyright. It is interesting that Freelang would try to claim that both are true.
Article 3. As a freeware program, the Freelang dictionary may be freely distributed, provided that the files constituting it are not modified. You may therefore:
- download and use it free of charge, in a personal or professional capacity;
- make copies of it and transmit them freely;
- modify the lists of words for your convenience, using the functions for addition, deletion or modification of the program;
- create new lists of words, using the function of the program that permits this.
So, does this mean that one cannot distribute modifications of the files made with the Freelang program or that one cannot distribute modifications period?
Article 4. Lists modified or created by users may not be distributed publicly (notably over the Internet) without the consent of Freelang. A private transmission or one done in a professional setting (business, school, university...) is on the other hand authorized.
If the lists are the "property" of the users, then why can user-created lists not be distributed independently of the program? Either the user is the author or Freelang is the author. Which is it?
Article 5. The Freelang dictionary may be distributed at a Web site other that that of Freelang if the following conditions are respected:
. . .
- the totality of the dictionary is distributed, that is, the program and at a minimum one list of words;
Article 9. It is likewise prohibited:
. . .
- to break up the program and the lists of words;
This is a bit of a contradiction. One has to distribute the program and at least one wordlist, but one cannot break up the program and the wordlists. So, is it not necessary, then, to include all wordlists?
There are several other contradictions I would like to point out, but I think that is pretty much the meat of it. These claims are both contradictory and in many cases unintelligible. However, IANAL.
Judging from the general tone of the thing, though, I think these people might be rather disposed to allow OLPC to use all of their wordlists if asked. On their FAQ (http://www.freelang.net/dictionary/dic-faq.html), they mentioned allowing Mac users to write their own display software. I really wish they had used a Creative Commons Attribution Non-commercial license, though. It would make everything much easier.
Is anyone willing to write them a letter?
--LuYu 03:39, 18 March 2008 (EDT)
Restoring spelling in 8.2.0
There are no spelling dictionary files in 8.2.0. Ticket #5394 and ticket #6099 suggest Write and Browse used to spell check but did not have a context menu for alternatives. In 8.2.0, neither highlights misspelled words.
I don't know if it's intentional that the feature went away. People persuasively argue that red squiggly lines under words are not the best way to teach kids how to spell. Anyway, here's one way to bring back spelling:
Google refused to cough up the official location in Fedora 9 for system dictionaries. So
% strace enchant -l
Couldn't create a dictionary for en_US.UTF-8
reveals all the directories enchant searches for dictionaries.
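The raw strace output is long, so it can help to filter it. The following is a minimal sketch, assuming enchant probes for files ending in .dic or .aff and that strace prints each probed path in double quotes; the function name probed_dirs is made up for illustration.

```shell
# Read strace output on stdin and print each distinct directory in which
# a .dic or .aff file was probed.
probed_dirs() {
    grep -o '"/[^"]*"' |          # keep only quoted absolute paths
        tr -d '"' |               # drop the surrounding quotes
        grep -E '\.(dic|aff)$' |  # keep dictionary and affix files
        sed 's|/[^/]*$||' |       # strip the file name, leaving the directory
        sort -u
}
```

Since strace writes to stderr, one way to use it is: strace enchant -l < /dev/null 2>&1 | probed_dirs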
I picked /usr/share/enchant/myspell. You can download new Hunspell dictionaries from the net (e.g. http://wiki.services.openoffice.org/wiki/Dictionaries ), but if you have the Firefox version 6 activity installed, you already have them. Here's one way to link them to where enchant looks for them.
su -
mkdir /usr/share/enchant/myspell
ln -s ~olpc/Activities/Firefox-6.activity/dictionaries/en-US.dic /usr/share/enchant/myspell/en_US.dic
ln -s ~olpc/Activities/Firefox-6.activity/dictionaries/en-US.aff /usr/share/enchant/myspell/en_US.aff
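The same linking steps can be wrapped in a small function so they are easy to repeat for other locales. This is only a sketch: the function name link_hunspell is hypothetical, and it assumes the source directory names its files with a hyphen (en-US) while enchant expects an underscore (en_US), as in the commands above.

```shell
# link_hunspell SRC_DIR DEST_DIR LOCALE
# Symlink SRC_DIR/<ll-CC>.{dic,aff} to DEST_DIR/<ll_CC>.{dic,aff}.
link_hunspell() {
    src_dir=$1
    dest_dir=$2
    locale=$3                                        # e.g. en_US
    src_name=$(printf '%s' "$locale" | tr '_' '-')   # e.g. en-US
    mkdir -p "$dest_dir" || return 1
    for ext in dic aff; do
        src="$src_dir/$src_name.$ext"
        # Refuse to create a dangling link if the source file is absent.
        [ -f "$src" ] || { echo "missing: $src" >&2; return 1; }
        ln -sf "$src" "$dest_dir/$locale.$ext" || return 1
    done
}
```

Run as root, the equivalent of the commands above would be: link_hunspell ~olpc/Activities/Firefox-6.activity/dictionaries /usr/share/enchant/myspell en_US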
After this change:
% enchant -l (spell checks!)
Running it also creates empty ~olpc/.config/enchant/en_US.dic and en_US.aff files; the first is where you can add your own words.
Squiggly lines now appear under unrecognized words in Write! There is still no UI to offer corrections (ticket #5394), and Write doesn't seem to access your local dictionary.
I'm not sure how to do the same for Browse.
-- skierpage 11:22, 8 December 2008 (UTC)