User:Hemant goyal

Email: goyal.hemant@gmail.com
IRC Nick: hemantg
#sugar
Server: freenode
This user's time zone is UTC+5:30.

Hemant Goyal

About

I am an undergraduate student at Netaji Subhas Institute of Technology [1], majoring in Information Technology. I have been accepted as a Google Summer of Code Student for my project Integration of Speech Synthesis in Sugar Environment [2]. Previously I was involved with the OLPC community as a volunteer and Summer of Content Intern.

Personal Website: hemantg's personal website lists some of the work that he has done [3].

Google Summer of Code 2008

Project Name: Integration of Speech Synthesis in Sugar Environment

Project link: http://code.google.com/soc/2008/olpc/appinfo.html?csaid=711720542810A05F

Mentor: Simon Schampijer User:Erikos

Other gsoc projects under OLPC

Milestones

(Added on 13th July, 2008)

  • Get the python-dotconf package accepted into the Fedora repository. (This is going to take its own sweet time, as it depends on the speed of the reviews. Perhaps I will ask someone in the OLPC community to help speed up the review process.)
  • Get the speech-dispatcher package included into the OLPC builds. [4]
    • Make the speech-dispatcher daemon run as non-root.
    • Relocate the configuration files for the OLPC3 package to /home/olpc/.sugar-speech/
  • Finalize the mockups for sugar-control-panel - Speech Configuration with eben and erikos.
  • Write patches for sugar-control-panel and push to git.

Project Milestones already completed:

Sugar Speech Synthesis Interface

Interface through Control Panel

  • Play/test button in control panel.
  • Finalize the test string that will be spoken.
  • Fine-grained settings within the control panel for:
    • Language - Default to locale set on XO
    • Voice Selection - Male/Female, Child/Adult, Age
    • Rate
    • Pitch
    • Volume

Sugar Interface

  • Play button
  • Reveals a palette to modify a subset of the settings mentioned above.

Weekly Logs

Week Ending - Jun 10, 2008

Executive Summary

In this period, I worked on the RPM package of speech-dispatcher and fixed several shell scripts and subtle problems with the package. I spent the week discussing the design and implementation of each use case [5]. We identified a potential problem with the speechd API and decided to modify the API suitably, then ask the speech-dispatcher community to accept the patch and ship it upstream. The priority tasks for next week are RPM package completion, speechd API modification, and preservation of the Sugar speech settings.

RPM Packaging update
speech-dispatcher RPM package

In the past two weeks I have primarily worked on fixing various directory and file ownership issues in the RPM package. I worked on the init script that auto-starts the speech-dispatcher daemon and fixed certain scriptlets used during the installation of speech-dispatcher. Finally, I am also working to get sponsored by the Fedora maintainers community so that I can build the speech-dispatcher package in the Red Hat build system. I wrote two informal reviews of new RPM packages for this purpose.

The changelogs for the RPM packaging are given below:

* Sat Jun 07 2008    Hemant Goyal <goyal.hemant@gmail.com> 0.6.6-8
- converted speech-dispatcher-cs.info to UTF-8 encoding
- removed multiple file listings of /usr/lib/python2.5/site-packages/speechd/_test.py and fixed its mode
- added init script as a SOURCE instead of as a patch
- duplicate Requires have now been removed
- Timestamping of files has now been added
- Install script fixed
- init script fixed

* Tue Jun 03 2008    Hemant Goyal <goyal.hemant@gmail.com> 0.6.6-7
- changed license of base package to GPLv2+ and GPL
- changed license of all other packages to GPLv2+
- fixed install sequence using cleaner for loop and pushd and popd commands
- added init script for speech-dispatcher daemon
- added COPYING to doc in base package
- removed comment after /sbin/ldconfig
- resolved rpmlint errors for base package [except UTF-8 encoding error for (cs) documentation file]
- renamed long_message to spd_long_message and run_test to spd_run_test
- reset mode of _test.py to 0755

For older changelogs, see http://www.nsitonline.in/hemant/stuff/speechd-rpm/speech-dispatcher.spec

Design Considerations
speechd python API

We have discovered a potential problem with the Python bindings for speech-dispatcher: they force clients to use threads when the speechd API is used in PyGTK applications. The entire discussion with the speechd and PyGTK communities is available at http://www.mail-archive.com/pygtk@daa.com.au/msg15868.html and http://lists.freebsoft.org/pipermail/speechd/2008q2/001200.html

Several solutions have been proposed in these discussions.

ideas for speech-dispatcher sugar default settings preservation

We are currently analyzing the possible ways to store the Sugar speech settings:

  • a decision has been taken that /etc/speech-dispatcher/speechd.conf will not be modified
  • sugar will maintain its own default parameters file that will be read when sugar is started up.
  • a decision about how other clients can use and access the sugar default settings still needs to be taken

Week Ending - Jun 20, 2008

Executive Summary

In this week, I continued working on the speech-dispatcher package and released three new versions of the package. The good news is that I have finally been sponsored by another Fedora reviewer and that the speech-dispatcher package has passed the QA tests. The next step is to build the package in the Fedora Koji system. I expect that the packaging work will be complete by this weekend.

I analyzed the working of the speechd client API in detail and worked through its implementation. For this I had to study Python threads and revise some of the synchronization algorithms that I studied as part of my operating systems course. So it's nice to be able to directly apply the theoretical knowledge for a change :). I am discussing possible solutions for modifying the API for use in the OLPC community with the speechd developers. The discussion is available at http://lists.laptop.org/pipermail/devel/2008-June/015480.html. In the coming week I will try to modify the API again using the ideas suggested.

RPM Packaging update

RPM packaging at the review level is now complete. I will build the package in koji now.

speech-dispatcher RPM package
* Fri Jun 20 2008    Hemant Goyal <goyal.hemant@gmail.com> 0.6.6-12
- added BuildRequires: texinfo (for makeinfo)
- changed permissions of Sourcex to 0644
- incorporated modified init script by mtasaka
- fixed a few more macros in changelog
- modified location of Source1 and Patch0 to point to online locations

* Wed Jun 18 2008    Hemant Goyal <goyal.hemant@gmail.com> 0.6.6-11
- fixed encoding of speech-dispatcher-cs.info file to UTF-8

* Wed Jun 11 2008    Hemant Goyal <goyal.hemant@gmail.com> 0.6.6-10
- removed Requires(preun) duplicates
- applied -p option correctly to install command
- fixed macros in changelog to prevent them from expanding
- fixed the init script
- added patch to change log directory of speech-dispatcher and start only espeak

For older changelogs, see http://www.nsitonline.in/hemant/stuff/speechd-rpm/speech-dispatcher.spec

speechd API
Present method

The client API presently communicates with the speech server from a separate thread, which polls the socket in the _communication() method. Whenever data is read from the socket, it is added to a buffer with self._com_buffer.append((code, msg, data)) and a semaphore (self._ssip_reply_semaphore) is incremented. To read data off this buffer, _recv_response(self) is called; this method is synchronized with the same semaphore, self._ssip_reply_semaphore.
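
The pattern described above can be sketched roughly as follows. This is a simplified illustration, not the actual speechd client code: the class name PollingClient, the demo() helper, the one-line reply format, and the use of socket.socketpair() as a stand-in for the speech server are all assumptions made for the example.

```python
import socket
import threading


class PollingClient:
    """Toy client: a reader thread fills a buffer, a semaphore counts replies."""

    def __init__(self, sock):
        self._socket = sock
        self._com_buffer = []
        self._ssip_reply_semaphore = threading.Semaphore(0)
        self._thread = threading.Thread(target=self._communication, daemon=True)
        self._thread.start()

    def _communication(self):
        # Runs in its own thread: block on the socket and queue every reply line.
        reader = self._socket.makefile("r", encoding="utf-8", newline="\n")
        for line in reader:
            code, _, msg = line.rstrip("\r\n").partition(" ")
            self._com_buffer.append((int(code), msg, None))
            self._ssip_reply_semaphore.release()  # one more reply is available

    def _recv_response(self):
        # Blocks the caller until the reader thread has queued a reply.
        self._ssip_reply_semaphore.acquire()
        return self._com_buffer.pop(0)


def demo():
    # socketpair() plays the role of the client/server connection (assumption).
    client_end, server_end = socket.socketpair()
    client = PollingClient(client_end)
    server_end.sendall(b"208 OK CLIENT NAME SET\r\n")  # illustrative reply line
    reply = client._recv_response()
    server_end.close()                # EOF lets the reader thread finish
    client._thread.join(timeout=1.0)
    client_end.close()
    return reply
```

Because _recv_response() blocks on the semaphore, the caller depends entirely on the reader thread, which is why the thread cannot simply be removed without another way of delivering replies.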

When a client instantiates a connection with the speech server, a certain amount of handshaking data is exchanged between the client API implementation and the speech server.

While the handshaking is taking place, the gtk mainloop has not yet started, so we cannot rely on the events generated by gobject to read data off the socket. And if the handshaking does not take place, the API implementation blocks the entire process, leading to a deadlock.

Modified Approach

We are trying to eliminate the need for a thread by monitoring the socket using gobject.io_add_watch(self._socket, gobject.IO_IN, self._socket_read_cb). When the gtk mainloop is running and input activity is observed on the socket, self._socket_read_cb is called; in essence it replicates _communication(). I figured that we can let the polling thread exist until the handshaking has taken place and then shift to the gobject method of monitoring the socket. I achieve this with a variable, self._first_run. When the handshaking completes, I make the polling thread exit (by calling close_thread(self) from the main control thread). By that time I expect the gtk mainloop to have started (I am quite sure that it actually does).
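
A minimal sketch of this hand-off is given below. To keep it runnable without PyGTK, the standard-library selectors module stands in for the gtk mainloop and for gobject.io_add_watch; that substitution, the class name HybridClient, the wait_for_handshake() helper, and the message contents are assumptions for illustration only, not the real speechd API.

```python
import select
import selectors
import socket
import threading


class HybridClient:
    """Polling thread for the handshake, event-loop callback afterwards."""

    def __init__(self, sock):
        self._socket = sock
        self._com_buffer = []
        self._reply_semaphore = threading.Semaphore(0)
        self._first_run = True  # True until the handshake has completed
        self._thread = threading.Thread(target=self._communication)
        self._thread.start()

    def _read_reply(self):
        data = self._socket.recv(4096)
        if data:
            self._com_buffer.append(data.decode("utf-8").strip())
        return bool(data)

    def _communication(self):
        # Handshake phase: no main loop exists yet, so poll from a thread.
        while self._first_run:
            readable, _, _ = select.select([self._socket], [], [], 0.05)
            if readable:
                if not self._read_reply():
                    break  # peer closed the connection
                self._reply_semaphore.release()

    def wait_for_handshake(self):
        self._reply_semaphore.acquire()
        return self._com_buffer.pop(0)

    def close_thread(self):
        # Called from the main thread once the handshake is done.
        self._first_run = False
        self._thread.join()

    def _socket_read_cb(self):
        # Main-loop phase: invoked by the event loop when data is readable.
        return self._read_reply()


def demo():
    client_end, server_end = socket.socketpair()
    client = HybridClient(client_end)

    server_end.sendall(b"handshake-ok\r\n")
    handshake = client.wait_for_handshake()
    client.close_thread()

    # Stand-in for gobject.io_add_watch(sock, gobject.IO_IN, callback).
    loop = selectors.DefaultSelector()
    loop.register(client_end, selectors.EVENT_READ, client._socket_read_cb)

    server_end.sendall(b"event\r\n")
    for key, _ in loop.select(timeout=1.0):
        key.data()  # dispatch to _socket_read_cb

    loop.close()
    client_end.close()
    server_end.close()
    return handshake, list(client._com_buffer), client._thread.is_alive()
```

In the PyGTK version the callback would receive the source and condition as arguments and return True to keep the watch installed; the stand-in callback here takes no arguments for brevity.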

Work Update - Jun 30, 2008

Executive Summary

In the last week or so I was very busy and could not get much work completed. The speech-dispatcher packages for the olpc2 and olpc3 branches have been built in Koji. I wrote a hackish Python class to modify the dotconf speech-dispatcher configuration file.

Things that I learnt
  • how to use koji, and build RPM packages on the RH build system.
  • Practical usage of regular expressions in python (re package in python)
  • Found out about kodos - a regular expression construction/testing software. It's really very powerful and even provides example python code to implement the regular expression that we construct.
  • Wrote my first really usable class in Python (the Python dotconf interface).

RPM Packaging update

RPM package available for public use.

   * speech-dispatcher-0.6.6-13.olpc2.i386.rpm - http://koji.fedoraproject.org/packages/speech-dispatcher/0.6.6/13.olpc2/i386/speech-dispatcher-0.6.6-13.olpc2.i386.rpm
   * speech-dispatcher-0.6.6-13.olpc2.src.rpm - http://koji.fedoraproject.org/packages/speech-dispatcher/0.6.6/13.olpc2/src/speech-dispatcher-0.6.6-13.olpc2.src.rpm
   * speech-dispatcher-debuginfo-0.6.6-13.olpc2.i386.rpm - http://koji.fedoraproject.org/packages/speech-dispatcher/0.6.6/13.olpc2/i386/speech-dispatcher-debuginfo-0.6.6-13.olpc2.i386.rpm
   * speech-dispatcher-devel-0.6.6-13.olpc2.i386.rpm  - http://koji.fedoraproject.org/packages/speech-dispatcher/0.6.6/13.olpc2/i386/speech-dispatcher-devel-0.6.6-13.olpc2.i386.rpm
   * speech-dispatcher-doc-0.6.6-13.olpc2.i386.rpm - http://koji.fedoraproject.org/packages/speech-dispatcher/0.6.6/13.olpc2/i386/speech-dispatcher-doc-0.6.6-13.olpc2.i386.rpm
   * speech-dispatcher-python-0.6.6-13.olpc2.i386.rpm - http://koji.fedoraproject.org/packages/speech-dispatcher/0.6.6/13.olpc2/i386/speech-dispatcher-python-0.6.6-13.olpc2.i386.rpm

speechd API

I am waiting for a bug in Python 2.6 to be resolved, so the API change that we were planning has been postponed for now.

dotconf python interface

I wrote a python interface to fiddle with the speechd.conf file for overriding client speech synthesis settings and establishing sugar defaults.
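
As a rough idea of what such an interface can look like, here is a small regular-expression based sketch. It is not the pydotconf code: the class name DotConfEditor, its methods, and the sample text are hypothetical, and the excerpt only imitates the "Option value" layout of a dotconf-style file.

```python
import re


class DotConfEditor:
    """Read and rewrite 'Option value' lines in dotconf-style text."""

    def __init__(self, text):
        self._lines = text.splitlines()

    @staticmethod
    def _pattern(option):
        # Matches an active (uncommented) "Option value" line.
        return re.compile(r"^\s*" + re.escape(option) + r"\s+(?P<value>.+?)\s*$")

    def get(self, option):
        pattern = self._pattern(option)
        for line in self._lines:
            match = pattern.match(line)
            if match:
                return match.group("value")
        return None

    def set(self, option, value):
        pattern = self._pattern(option)
        new_line = "{} {}".format(option, value)
        for index, line in enumerate(self._lines):
            if pattern.match(line):
                self._lines[index] = new_line
                return
        self._lines.append(new_line)  # option was absent or commented out

    def dumps(self):
        return "\n".join(self._lines) + "\n"


# Illustrative excerpt only, not the real speechd.conf shipped with the package.
SAMPLE = """\
# Illustrative excerpt
DefaultRate  0
# DefaultPitch 0
DefaultLanguage "en"
"""

editor = DotConfEditor(SAMPLE)
editor.set("DefaultRate", 20)     # rewrites the existing active line
editor.set("DefaultPitch", -10)   # commented-out option, so a new line is appended
```

Leaving commented-out lines untouched and appending new values keeps the original comments in the file intact, at the cost of the new option appearing at the end rather than next to its commented default.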

Update : July 6 - July 14

Work Update - Jul 6, 2008
  • Continued tweaking the python dotconf interface. Have released it for code review by erikos
  • Requested git user access; access granted. I will upload the code to my public folder as soon as my Linux installation is fixed
  • downloaded sugar-jhbuild on my ubuntu box and looked at the sugar-control-panel codebase - next step is to start writing the code for integrating speech synthesis management through sugar-control-panel
  • bricked my ubuntu box :( - spent the whole day trying to get ubuntu to install on my laptop
Unresolved Issues
  • still no reply from dot.conf community about the generic python parser for dotconf - I suppose I'll use the hackish class for olpc only
  • location of /etc/speech-dispatcher/speechd.conf on the olpc laptop. Since sugar-control-panel needs to read/write speechd.conf, I will modify the speech-dispatcher package for olpc2/olpc3 to relocate speechd.conf to /home/olpc/.sugar-speech
Work Update - Jul 9, 2008
  • Restructured the pydotconf API so that it looks much neater, and documented the code. The code is available at: http://www.nsitonline.in/hemant/stuff/pydotconf/
  • Still cannot access my git folder. Will be requesting the sysadmin to change my public key now.
Things Learnt
  • python packaging - distutils
  • python automatic documentation - doxygen
  • found out more about the GPL and LGPL licenses, and licensed my code under the GPL :P
Plus Points
  • speech-dispatcher community is interested in using pydotconf in their project :)
Work Update - Jul 14, 2008
Things Learnt
  • learnt how to use svn command line tool

Thanks

erikos is mentoring me on this project. tomeu, eben, and daf have also been helping me through the project design and various coding problems that I have faced. My thanks are also due to SJ for keeping us updated about the GSoC programme requirements. Kudos to all of you for being so helpful!

Cheers!