User:Assim.deodia

Assim Deodia

@	assim.deodia at gmail.com

This user is an undergraduate student of Information Technology at Netaji Subhas Institute of Technology.

Listen and Spell

I am working on the Listen_and_Spell activity as part of my GSoC project under the mentorship of Dafydd Harris. This section keeps track of updates and discussion regarding the project.

git Repository: git Repository
Project Page: GSoC page

Weekly Updates

May 24th - May 31st

The Start

Initial discussion with daf on the project design, which finalized the following:

Using the ElementTree module of Python as the parser for the XML dictionary files.
Using SQLite as temporary storage for the parsed dictionary, so that the data can be accessed easily without parsing the XML again.
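
A minimal sketch of this design, parsing the XML once with ElementTree and caching the entries in SQLite. The XML layout, the table name and the helper functions are assumptions made for illustration; they are not the schema of the GCIDE files or the code in dictionary.py.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical dictionary layout; the real GCIDE-derived files differ.
SAMPLE_XML = """
<dictionary>
  <entry><word>apple</word><def>A round fruit.</def></entry>
  <entry><word>boat</word><def>A small vessel for travelling on water.</def></entry>
</dictionary>
"""


def load_dictionary(xml_text, conn):
    """Parse the XML once and cache every entry in SQLite."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS words (word TEXT PRIMARY KEY, definition TEXT)"
    )
    root = ET.fromstring(xml_text)
    rows = [(e.findtext("word"), e.findtext("def")) for e in root.iter("entry")]
    conn.executemany("INSERT OR REPLACE INTO words VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


def lookup(conn, word):
    """Return the cached definition, or None if the word is unknown."""
    row = conn.execute(
        "SELECT definition FROM words WHERE word = ?", (word,)
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")  # a file path would persist the cache
count = load_dictionary(SAMPLE_XML, conn)
print(count, lookup(conn, "boat"))
```

Once the table is filled, later lookups only touch the database, which is the point of using SQLite as the intermediate store.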

Work Details

Started by reading about Python, especially the ElementTree class.
Made a small dictionary file from GCIDE and wrote a basic class which initially only parsed the file and displayed its content in a more readable form.
Modified the XML file to remove all the characters starting with "#&", as they were unreadable by parsers and caused "Unidentified entity" errors.
Applied for a git repository.

June 10th - June 14th

Work Details

Extended the class structure to include some more functions
Read about SQLite
Added the SQLite interface to the dictionary class which parses the dictionary and stores all the data in a database. This class is available at the git repository.
Fixed the error of "Unidentified entity" by adding the "DOCTYPE", which defines entities for xml files, to all the XML files.
Applied again for the git repository, as it had not been created yet.
Repository created :). First git commit
Bug fixes in dictionary.py
Second git commit
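
The DOCTYPE fix mentioned above can be illustrated roughly as follows. The entity name, the sample document and the helper function are made up for the example; the real GCIDE files need many more declarations.

```python
import xml.etree.ElementTree as ET

# A tiny document that uses a named entity the parser does not know by default.
BODY = "<dictionary><entry><word>caf&eacute;</word></entry></dictionary>"

# Internal DTD subset declaring that entity (illustrative only).
DOCTYPE = '<!DOCTYPE dictionary [ <!ENTITY eacute "&#233;"> ]>'


def first_word(xml_text):
    """Parse the text and return the first <word> inside an <entry>."""
    return ET.fromstring(xml_text).findtext("entry/word")


# Without the DOCTYPE the parser rejects the undeclared entity.
try:
    first_word(BODY)
    failed_without_doctype = False
except ET.ParseError:
    failed_without_doctype = True

# With the DOCTYPE in front, the entity is expanded during parsing.
word = first_word(DOCTYPE + BODY)
print(failed_without_doctype, word)
```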

Issues to be tackled

The "DOCTYPE" has to be included in each XML file separately, and since the DOCTYPE section is large, it adds to the file size. It could instead be placed in one file that is included in each XML dictionary.
~~GCIDE dictionaries are categorized by starting letter and have to be re-categorized by difficulty level.~~
~~The XML schema of the GCIDE dictionaries differs from what is required. It has to be redesigned according to the required schema.~~
~~Merging of the above class can be done only after the above tasks are completed.~~
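
One possible way to handle the first issue, sketched under assumptions: keep the entity declarations in a single place and add them to each dictionary's text at parse time, so the XML files themselves stay small. The shared string, the file name mentioned in the comment and the function name are hypothetical and not part of the project.

```python
import xml.etree.ElementTree as ET

# Hypothetical shared declarations; in practice this string would be read once
# from a single file (for example "entities.dtd", a made-up name).
SHARED_DOCTYPE = (
    '<!DOCTYPE dictionary [ <!ENTITY eacute "&#233;"> <!ENTITY mdash "&#8212;"> ]>'
)


def parse_with_shared_doctype(xml_text, doctype=SHARED_DOCTYPE):
    """Insert the shared DOCTYPE after any XML declaration, then parse."""
    text = xml_text.lstrip()
    if text.startswith("<?xml"):
        # The DOCTYPE must come after the XML declaration, not before it.
        end = text.index("?>") + 2
        text = text[:end] + doctype + text[end:]
    else:
        text = doctype + text
    return ET.fromstring(text)


root = parse_with_shared_doctype(
    '<?xml version="1.0"?><dictionary><word>caf&eacute;</word></dictionary>'
)
print(root.findtext("word"))
```

This assumes the dictionary files carry no DOCTYPE of their own; a file that already has one would need it stripped first.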

June 15th - June 21st

Work Details

Downloaded the WordNet dictionary (size after dumping into the SQL server: 379 MB)
Modified the schema to words + definition + sample only and removed all the extra tables
Removed all the words containing special characters (like '.', '-' etc.) from the dictionary, along with their corresponding definitions and samples (using the script index.php from git)
Size after reduction (SQL dump): 24 MB.
Converted to an SQLite database, dict.db (size: 12 MB), using the script conv.sh [1] (MySQL to SQLite3 shell script)
Added a new column `length` which stores the length of each word
Created a word class in dictionary.py which interfaces with the SQLite DB.
Modified the dictionary class to drop the XML interface and interface with the SQLite DB instead
Bug Fixing
Third git commit
Fourth and fifth commits: removing unused files
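
The reduced schema and the `length` column described above might look roughly like this in Python with sqlite3. The sample rows, the table name and the `Word` class shown here are illustrative assumptions; the actual filtering was done with index.php, and the real class lives in dictionary.py.

```python
import sqlite3

# Made-up sample rows: (word, definition, sample).
RAW = [
    ("apple", "a round fruit", "she ate an apple"),
    ("e.g.", "for example", ""),
    ("well-known", "widely known", ""),
    ("boat", "a small vessel", "the boat drifted"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (word TEXT, definition TEXT, sample TEXT)")

# Keep only purely alphabetic words, dropping entries such as 'e.g.' or 'well-known'.
conn.executemany(
    "INSERT INTO words VALUES (?, ?, ?)", [r for r in RAW if r[0].isalpha()]
)

# Add the length column and fill it in from the word itself.
conn.execute("ALTER TABLE words ADD COLUMN length INTEGER")
conn.execute("UPDATE words SET length = LENGTH(word)")
conn.commit()


class Word:
    """Hypothetical stand-in for the word class: one row loaded from the DB."""

    def __init__(self, conn, word):
        row = conn.execute(
            "SELECT word, definition, sample, length FROM words WHERE word = ?",
            (word,),
        ).fetchone()
        if row is None:
            raise KeyError(word)
        self.word, self.definition, self.sample, self.length = row


def words_of_length(conn, n):
    """Return all stored words with exactly n letters, sorted alphabetically."""
    query = "SELECT word FROM words WHERE length = ? ORDER BY word"
    return [w for (w,) in conn.execute(query, (n,))]


print(words_of_length(conn, 4), Word(conn, "apple").length)
```

Storing the length up front lets the activity pick words of a given size with a simple indexed-style query instead of scanning every word.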

June 21st

Sixth commit: V1.0 of Listen and Spell (command-line version)

Immediate task at hand

~~Merging the above class with talkntype to create V1.0 of listen-spell~~
Testing the application
Include new features (writing the response in the dictionary)
Implement a keyboard listener with a callback in the command-line version
