Projects/OLPC ALBANET

Revision as of 21:39, 16 March 2009 by Eruci (talk | contribs) (NLP tools for OLPC, Meaning to Word Multi-lingual Dictionary [Ervin Ruci, ALBANIA])

1. Project Title & Shipment Detail

Name of Project: Cross-Lingual Meaning to Word dictionary & Collection of NLP Tools
Shipping Address You've Verified: Att: Ervin Ruci, Universiteti i Vlores, Departamenti i Shkencave Kompjuterike dhe Inxhinjerise Elektrike, Sheshi Pavaresia, Vlore, Albania.
Phone: +355-692035216

Number of Laptops You Request to Borrow: 20
Loan Length (How Many Months): 18

2. Team Participants (In list form)

Name(s) & Contact Info: (include email addresses & phone numbers)

1. Ervin Ruci, eruci@univlora.edu.al, +355-692035216, http://www.univlora.edu.al/personel/eruci
   Address: L. Partizani, Rr Drashovica, Nr 48, Vlore, Albania.
   Past Experience/Qualifications:
      Webmaster, Mount Allison University, Sackville, NB (1997-2000)
      Applications Developer, CIRA (Canadian Internet Registration Authority), Ottawa, ON (2001-2005)
      Founder, Geocoder.ca, a free geocoding solution for North America (2006)
   Education: Mount Allison University (Computer Science), Carleton University Graduate Studies (MSc, Computational Geometry)
   Current Employer and/or School: University of Vlora, Department of Computer Science

2. Tanush Shaska, shaska@univlora.edu.al, shaska@okaland.edu, http://www.albmath.org/users/shaska/index.html
   Address: 546 Science and Engineering Bld., Department of Mathematics and Statistics, Oakland University, Rochester, MI, 48309-4485, USA
   Phone: 248-370-3436
   Past Experience/Qualifications:
      Summer 07: Visiting Professor, Dep. Computer Science, Maria Curie-Sklodowska Univ., Lublin, Poland
      2003-05: Assistant Professor of Mathematics, Department of Mathematics, University of Idaho
      2001-03: Visiting Assistant Professor of Mathematics, University of California at Irvine
      2000: Deutsche Forschungsgemeinschaft Fellow, Dep. of Mathematics, Univ. of Erlangen, Germany
      Currently: Professor, University of Vlora, and Assistant Professor, Oakland University
   Education: PhD, University of Florida

3. Eustrat Zhupa, ezhupa@univlora.edu.al, http://www.univlora.edu.al/personel/ezhupa
   Address: University of Vlora, Faculty of Sciences, Vlore, Albania
   Past Experience/Qualifications:
      Lecturer, University of Bari, Italy (2004-2008)
      Currently: Professor, University of Vlora; Dean of the Faculty of Sciences, University of Vlora
   Education: PhD, University of Bari


3. Objectives

Project Objectives: (please list specific, measurable objectives for your project)

1. Develop a cross-lingual natural language processing system with plug-and-extend functionality, based on the Global Wordnet project. The software will let users define a word or concept in their own language and obtain the word that matches their definition in any language installed in the software's knowledge base. A sample application illustrating this as a proof of concept can be found on the web at: http://fjalor.kerkoje.com
2. Develop tools for extending and improving knowledge bases such as the Albanet project (http://albanet.univlora.edu.al) and other wordnets in the user's native language. All of this input will be aggregated in a centralized web service that keeps track of changes and extensions to the knowledge bases, as part of a collaborative effort to improve the quality of the Global Wordnet.
3. Document all code and databases well, and set up an SVN repository so that the community can track changes to the software.

4. Plan of Action (One or more paragraphs)

We will direct our students to adapt the current software into a standalone version for the XO Laptop platform, then extend its functionality to turn the application into a collaborative Wordnet extension platform.

Plan and Procedure for Achieving the Stated Objectives:

1. Develop the documentation and the basic technical design for the system.
2. Divide the coding tasks among 20 of our best students.
3. Integrate and streamline all work into a single standalone application that will mostly work in off-line mode, but will sync data changes to a central repository whenever it is online.
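The meaning-to-word lookup described in Objective 1 can be illustrated with a minimal sketch. It is not the method used by the existing application at fjalor.kerkoje.com, which this proposal does not describe. It assumes a toy knowledge base in which each concept has one English gloss and one word per language, and it ranks concepts by simple word overlap with the user's definition. The names SYNSETS, STOPWORDS, tokenize and meaning_to_word, and all entries, are hypothetical.

```python
import re
from collections import Counter

# Toy knowledge base (illustrative entries only, not taken from Albanet or
# any real wordnet): one gloss per concept, one word per language.
SYNSETS = [
    {"gloss": "a domesticated animal that barks and is kept as a pet",
     "lemmas": {"en": "dog", "sq": "qen", "it": "cane"}},
    {"gloss": "a large body of salt water that covers much of the earth",
     "lemmas": {"en": "sea", "sq": "det", "it": "mare"}},
    {"gloss": "a written work made of pages bound together",
     "lemmas": {"en": "book", "sq": "libër", "it": "libro"}},
]

# Very small stopword list, assumed for this sketch.
STOPWORDS = {"a", "an", "the", "of", "and", "is", "that", "as", "to", "in", "it"}


def tokenize(text):
    """Lowercase the text and return its words, minus stopwords."""
    return [w for w in re.findall(r"\w+", text.lower()) if w not in STOPWORDS]


def meaning_to_word(definition, target_lang):
    """Return the word in target_lang whose gloss best matches the definition.

    Returns None when no gloss shares a word with the definition, or when
    the best concept has no word in target_lang.
    """
    query = Counter(tokenize(definition))

    def overlap(synset):
        # Number of words shared by the definition and the gloss.
        return sum((query & Counter(tokenize(synset["gloss"]))).values())

    best = max(SYNSETS, key=overlap)
    if overlap(best) == 0:
        return None
    return best["lemmas"].get(target_lang)


print(meaning_to_word("an animal that barks", "sq"))
```

A real system would need glosses in the user's own language and a better similarity measure than word overlap; the sketch only shows the shape of the lookup, from a definition to a concept to a word in the target language.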


5. Needs:

Linguistic tools are important educational tools in the under-developed world. They make global knowledge more accessible to all, regardless of the language in which that knowledge was compiled.

Locally? This project will collect local knowledge bases to create a network of interconnected concepts across different languages and dialects.

In the greater OLPC/Sugar community? This project will provide a software platform that can be extended and used in other tasks as well, such as Information Retrieval, Publishing and Sharing Creative work.

Outside the community? The project will invite greater participation in the collaborative development of linguistic knowledge bases by making the software available on other platforms and environments.

Why can't this project be done in emulation using non-XO machines? We wish to use the lowest-end machines possible, to make sure that all interface functions behave properly on them. This invites greater participation from students in the underdeveloped world who have the creative energy but not the tools to take part in large collaborative Natural Language Processing development efforts.

Why are you requesting the number of machines you are asking for? We need one laptop for each student who will be working on the project.

Will you consider (1) salvaged/rebuilt or (2) damaged XO Laptops? We will consider salvaged/rebuilt and/or damaged XO laptops, as we want our software to function on the lowest common denominator, and our students will gain even greater skill from the extra challenge of fixing and reconfiguring XO laptops that are in less than optimal shape.


6. Sharing Deliverables:

Project URL: http://albanet.univlora.edu.al

How will you convey tentative ideas & results back to the OLPC/Sugar community, prior to completion? All results will be posted on this website on a quarterly basis.

How will the final fruits of your labor be distributed to children or community members worldwide? The final package will be distributed as a single self-installing software package, tested and verified to work properly on any XO laptop.

Will your work have any possible application or use outside our community? Our work will have many possible applications outside the XO community, especially in cross-lingual named entity extraction and cross-lingual information retrieval, both areas of active research.

If yes, how will these people be reached? We are part of the Global Wordnet project (http://globalwordnet.org/), and we will announce the progress of our work through regular contacts with the Global Wordnet community.

Have you investigated working with nearby XO Lending Libraries or Project Groups? There are no nearby XO Lending Libraries we can rely on at the moment. Albania seems to be off the map when it comes to such support groups.

7.

Would your Project benefit from Support, Documentation and/or Testing people?

1. Our project will benefit from the testing and documentation efforts of a wide range of people, and will achieve its true goals only after it has been widely distributed to the community.
2. Teachers (especially foreign language teachers) will provide valuable input on how to use this tool as part of their curricula.
3. We will promote our work on the University of Vlora Research page as well as at various conferences, such as the Kosova Freedom Software conference, where Ervin Ruci and Eustrat Zhupa are scheduled to present a paper on cross-language entity recognition systems in August 2009. “Different languages divide us, but information technology erases that division”.
4. We are always looking for mentors and supporters in our quest to develop better tools for information management and processing, so as to get closer to our goal of using technology to improve the quality of the information we receive across different languages.
5. The mentor will be someone with access to natural language processing groups around the world who can provide valuable advice and guidance in our work.


8. Timeline (Show start to finish)

1. Designing the main outline of the interfaces and the systems for sharing and gathering information. (2 months)
2. Coding the algorithms that will create language-independent functionality across different language bases, using Hidden Markov Models and probabilistic learning algorithms for analysing and processing information across different languages. (8 months)
3. Testing, documenting and optimizing the software to function on laptops with low processing power and storage capacity. (4 months)
4. Porting the application to other platforms and developing the off-line technology for syncing all work done by individuals into a central repository. (4 months)
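Step 2 of the timeline names Hidden Markov Models without specifying a model. As a hedged illustration of the kind of algorithm involved, the sketch below implements standard Viterbi decoding, which finds the most probable sequence of hidden states for a sequence of observed words. The part-of-speech states, the vocabulary and every probability are invented for this example and do not come from the project.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state sequence for the observations.

    Assumes observations is non-empty. Words missing from a state's
    emission table are given probability 0.
    """
    # best[t][s] = (probability of the best path ending in state s at
    #               step t, previous state on that path)
    first = observations[0]
    best = [{s: (start_p[s] * emit_p[s].get(first, 0.0), None) for s in states}]

    for obs in observations[1:]:
        column = {}
        for s in states:
            # Pick the predecessor that maximizes the path probability.
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s].get(obs, 0.0), p)
                for p in states
            )
        best.append(column)

    # Backtrack from the most probable final state.
    path = [max(states, key=lambda s: best[-1][s][0])]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path


# Toy model: all states and probabilities below are made up.
STATES = ("NOUN", "VERB")
START_P = {"NOUN": 0.6, "VERB": 0.4}
TRANS_P = {
    "NOUN": {"NOUN": 0.3, "VERB": 0.7},
    "VERB": {"NOUN": 0.8, "VERB": 0.2},
}
EMIT_P = {
    "NOUN": {"dogs": 0.5, "bark": 0.1, "cats": 0.4},
    "VERB": {"bark": 0.6, "chase": 0.4},
}

print(viterbi(["dogs", "chase", "cats"], STATES, START_P, TRANS_P, EMIT_P))
# prints ['NOUN', 'VERB', 'NOUN']
```

In practice the probabilities would be estimated from annotated text in each language rather than written by hand, and long sentences would call for log-probabilities to avoid numerical underflow.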