Dictionaries
Latest revision as of 12:05, 8 December 2008
There are monolingual and bilingual dictionaries for a remarkable number of languages available on the Web or as Free Software. You are invited to create one for your language, or to contribute to an existing project. There are also thesauri and dictionaries of specialized terms (medicine, Net jargon, etc.). Dictionaries for input methods, text-to-speech and speech recognition are not listed here. There are also tools for creating dictionaries of various kinds to use with various software, including vocabulary drills for language study.
Bundled dictionaries
Some of the core software already supports dictionaries, primarily for spell checking: XULRunner, and thus Browse, uses hunspell; AbiWord, and thus Write, uses enchant. Trac ticket 6104 proposes unifying the dictionaries, probably so that they all use hunspell; this is an issue for the underlying Fedora software distribution.
- In Release 8.2.0, it seems neither Browse nor Write has spell checking, and neither has a dictionary. Only the Firefox activity has spell checking, using its own local dictionaries directory. See the Talk:Dictionaries discussion page for a workaround. -- skierpage 10:00, 6 December 2008 (UTC)
The eSpeak text-to-speech library has a special dictionary.
Other dictionaries
dicts.info has a bundled dictionary developed by Zdenek Broz, in Ar/En/Es/Fr/Pt/Ro/Ru. It has definitions in English from Princeton's WordNet, but not all are appropriate for children. It has pictures from the dicts.info picture dictionary, which are all free but unsourced; we are looking for well-sourced free images to replace them.
Other dictionaries for OLPC
Thanks to Babylon, here is a set of 2500-word dictionaries for use by OLPC and any of our projects. They are not currently under one of our approved licenses:
Turkish.xls | Arabic.xls | Chinese (S).xls | Chinese (T).xls | Dutch.xls | French.xls | German.xls | Greek.xls | Hebrew.xls | Japanese.xls | Korean.xls | Portuguese.xls | Russian.xls | Spanish.xls
Web
- Wiktionary
- Yahoo list of online dictionaries by language
- OmegaWiki see also http://www.omegawiki.org/OLPC
- Kalyanaraman dictionary/semantic concordance of 90+ languages
Potentially appropriately licensed dictionary files available online
[1] is a collection of many dictionaries, including a Universal Dictionary that provides word-to-word-to-word mappings for many languages, with thousands of words. This dictionary doesn't appear to be licensed in a way that allows free redistribution, but the authors' objection to redistribution stems from the potential for users to be stuck with out-of-date dictionaries. Perhaps they would relicense if OLPC asked nicely.
[2] is a collection of dictionaries created by Freelang. Their license allows for redistribution, or modification, but not both, and is phrased (in English) in terms of French copyright law. [3]
[4] is Ergane, a program designed to promote Esperanto by providing a translation system that uses Esperanto as the common intermediate index language. It advertises its wordlists as being "free of copyright and can be copied, distributed and changed without legal restrictions. You can use them in any way you like, even for commercial purposes!". The wordlists generally date from 2004-2006, and their quality is unknown. Translating via Esperanto may or may not be very effective.
[5] (http://wiki.webz.cz/dict/) is a set of translation pairs extracted from Wiktionary. They are all of the form English-X. Some, like Spanish and Portuguese, are quite extensive (8,000-10,000 word pairs); others, like Urdu and Nepali, are very small. Quality is unknown. They are now over a year old, so it may be worthwhile to ask the author for the scripts and rerun them ourselves. These dictionaries are definitely distributed under an acceptable license.
[6] (http://sourceforge.net/project/showfiles.php?group_id=67446&package_id=66108) is the dictionary from Pythonol, a Python program intended to help English speakers learn Spanish. The dictionary appears to be very complete (more than 70,000 word pairs). It is exclusively English-Spanish. The dictionary appears to be licensed under a one-off license intended for software, based on the GPL but with some unusual "anti-profit" restrictions. The license does permit redistribution with modification under copyleft-like terms, so it is likely acceptable, if unpalatable.
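The pivot approach described for Ergane, and the English-X pair lists above, both amount to translating through a common intermediate language. The following is a minimal sketch of that idea, not Ergane's actual implementation: the function names and the tiny wordlists are hypothetical, and the data format of the real wordlists is not assumed here.

```python
from collections import defaultdict


def invert(wordlist):
    """Turn {word: [pivot words]} into {pivot word: [words]}."""
    inverted = defaultdict(list)
    for word, pivots in wordlist.items():
        for pivot in pivots:
            inverted[pivot].append(word)
    return dict(inverted)


def translate_via_pivot(word, source_to_pivot, pivot_to_target):
    """Translate source -> pivot -> target, keeping order and dropping duplicates."""
    results = []
    for pivot in source_to_pivot.get(word, []):
        for candidate in pivot_to_target.get(pivot, []):
            if candidate not in results:
                results.append(candidate)
    return results


# Hypothetical toy wordlists, for illustration only.
english_to_esperanto = {
    "dog": ["hundo"],
    "spring": ["printempo", "fonto", "risorto"],
}
spanish_to_esperanto = {
    "perro": ["hundo"],
    "primavera": ["printempo"],
    "fuente": ["fonto"],
    "resorte": ["risorto"],
}

esperanto_to_spanish = invert(spanish_to_esperanto)
print(translate_via_pivot("spring", english_to_esperanto, esperanto_to_spanish))
```

A word with several senses comes back with one candidate per sense, because the pivot step cannot tell which sense was meant. That is one reason translating via an intermediate language may or may not be very effective.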
Free Software using dictionaries
StarDict and viewers
We are using StarDict as our default dictionary viewer. It is fine for displaying a language with definitions, or two languages with translations. It is not yet good at displaying many languages at once in a space-efficient way. (The desired use is a database with 40 languages, and 40^2 views of source-target language pairs, with words in the source language and definitions in the target language, and with each word and definition appearing exactly once in each language.)
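The desired layout can be sketched as a table keyed by concept, in which each language's word and definition are stored once and every source-target view is computed on demand. This is only an illustration of the idea under assumed names and toy data; it is not StarDict's file format or API.

```python
# Hypothetical store: concept id -> language code -> (word, definition).
# Each word and definition is stored exactly once per language.
entries = {
    "c1": {
        "en": ("dog", "an animal that barks"),
        "es": ("perro", "animal que ladra"),
        "fr": ("chien", "animal qui aboie"),
    },
    "c2": {
        "en": ("water", "a clear liquid that people drink"),
        "es": ("agua", "bebida transparente"),
        "fr": ("eau", "boisson transparente"),
    },
}


def languages(entries):
    """Return the sorted list of language codes present in the store."""
    return sorted({lang for by_lang in entries.values() for lang in by_lang})


def view(entries, source, target):
    """Return sorted (source word, target definition) pairs for one view."""
    rows = []
    for by_lang in entries.values():
        if source in by_lang and target in by_lang:
            rows.append((by_lang[source][0], by_lang[target][1]))
    return sorted(rows)


def all_views(entries):
    """List every (source, target) pair, including monolingual views."""
    langs = languages(entries)
    return [(s, t) for s in langs for t in langs]


print(len(all_views(entries)))  # number of views = (number of languages) squared
print(view(entries, "en", "es"))
```

With N languages this gives N^2 views (1600 for 40 languages) while the stored data grows only linearly with the number of languages.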
This list is meant to indicate the range of software available. It is in no way complete.
Typing
Aneto O. was working on a typewriter activity, but it never became a fully functional activity bundle. Some new work was starting in this area as of December 2007.
Spelling
- Debian Junior Writing (editors and spelling checker)
- aspell about 40 languages
- ispell about 35 languages
- myspell about 40 languages
Dictionary servers
- dict more than 50 languages
- Serpento dict server with full Unicode support
Other
- dict-moby-thesaurus Moby Thesaurus
- dict-bouvier English legal dictionary for US
- dict-foldoc Free OnLine dictionary of computing terms
- dict-vera Computer acronyms
- The On-Line Hacker Jargon File, version 4.4.4
- leksbot Botany and biology
- rhyme
Chinese / Kanji characters
- giten Japanese Kanji dictionary
- kanjidic Japanese Kanji dictionary
- kiten Japanese Kanji dictionary
- hanzim Chinese dictionary
- pydict English/Chinese dictionary
- stardict English/Chinese dictionary