Free Icon-To-Speech, under development
The goal of Free Icon-To-Speech is to provide a low-cost assistive / augmentative communication tool for people with speech, motor, and/or developmental challenges. The immediate opportunity is to create open source software to allow a user to select concepts through a menu of icons, and synthesize speech from those selected concepts.
Existing tools for this purpose are proprietary and priced in the thousands of dollars per device.
The OLPC XO platform, while not having a touch screen, is priced in the hundreds of dollars per device, and already contains many of the base components needed (evident in the text-to-speech synthesis activity Speak). The OLPC icon-to-speech need has been expressed by many people independently, including discussions at Talk:Speak#Accessibility and Talk:Accessibility#Augmentative_and_Alternative_Communication.
It appears that a proof of concept could be developed with a small time investment, and potential users are ready to test as soon as this is complete.
An alpha version is now posted at https://launchpad.net/freeicontospeech . Next steps and blueprints are also posted there. Because the icons from http://www.imaginesymbols.com are free only for personal use, they are not bundled for now: download the images listed in the XML files yourself, resize them to 250x250 with ImageMagick ( http://www.imagemagick.org/script/api.php ), and put them in the icons folder.
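As a rough sketch of the resize step, the following Python wraps ImageMagick's convert command. It assumes that ImageMagick is installed, that the downloaded images are PNG files in one folder, and that the icons folder is named "icons". The function names and the folder layout are illustrative only and are not taken from the project source.

```python
import subprocess
from pathlib import Path

ICON_SIZE = "250x250"  # size named in the instructions above


def resize_command(source, icons_dir="icons", size=ICON_SIZE):
    """Build the ImageMagick command that writes a resized copy into icons_dir."""
    source = Path(source)
    target = Path(icons_dir) / source.name
    # "convert IN -resize WxH OUT" fits the image inside WxH and keeps its
    # aspect ratio; use "250x250!" instead to force exactly 250x250.
    return ["convert", str(source), "-resize", size, str(target)]


def resize_all(download_dir, icons_dir="icons", run=subprocess.run):
    """Resize every PNG found in download_dir (assumed layout); return the commands run."""
    Path(icons_dir).mkdir(parents=True, exist_ok=True)
    commands = []
    for source in sorted(Path(download_dir).glob("*.png")):
        command = resize_command(source, icons_dir)
        run(command, check=True)  # raises if convert reports an error
        commands.append(command)
    return commands
```

Calling resize_all("downloads") would then fill the icons folder. Note that this resizes every PNG in the folder rather than only the files named in the XML files.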
[Tony Anderson advises that on newer XO builds, you will need to change the permissions on the tmp folder to 777.]
OLPCNews has published an article on this project. Thanks OLPCNews!
User Interface Design
Initial discussions suggest a user interface in which users navigate a hierarchy of basic concepts. Because the motor skills used to select concepts vary from user to user, the interface should allow some variability in the level of detail / zoom.
3 levels of hierarchy at 7 +/-2 groups/concepts per level would allow selection among hundreds of concepts, which appears to be a useful balance between richness of expression and speed of selection.
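The arithmetic behind "hundreds of concepts" can be checked with a short sketch. The function name is only illustrative, and the calculation assumes the same number of choices at every level.

```python
def reachable_concepts(choices_per_level, levels=3):
    """Number of leaf concepts when every level offers the same number of choices."""
    return choices_per_level ** levels


for choices in (5, 7, 9):  # 7 - 2, 7, and 7 + 2 choices per level
    print(choices, "choices per level:", reachable_concepts(choices), "concepts")
```

With three levels this gives 125, 343, and 729 concepts respectively, each reachable in three selections.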
Display and navigation of the hierarchy can be a combination of existing concentric & zoomable menu approaches:
- Zoomable UI http://www.cs.umd.edu/hcil/pad++/sitemap/
- Dasher http://www.inference.phy.cam.ac.uk/dasher/DasherSummary2.html
- Fractal:Edge http://fractalmaps.com
We envision three such navigation areas, displayed from left to right across the screen, for the selection of a subject, a verb, and an object of a basic sentence, with no attempt at grammatical accuracy.
The concept hierarchy can be synthesized from a careful blend of existing taxonomies. For an initial proof of concept, two useful taxonomies are from sign language and the food pyramid. Use of sign language extends all the way to toddlers, for whom it is an increasingly popular supplemental means of communication before they develop speech abilities, as in the "Sign With Your Baby" materials. 100 basic signs provide some of the most useful concepts for basic living: http://www.lifeprint.com/asl101/pages-layout/concepts.htm . Sign language may be doubly useful in some cases, when motor skills allow for communication with the manual signs. Icon libraries are already established for American Sign Language, and readily available for many of the USDA food pyramid categories: http://openclipart.org/media/tags/vegetable . Those familiar with Blissymbols would find them most useful as the icon set, while most other users may benefit more from clipart-style icons.
Toward the goal of assistive communication devices serving the _whole_ person, a simplified, combined set of categories of human needs from Max-Neef and from Maslow covers much ground (work in progress):
|Max-Neef||Maslow||people categories||place categories||thing categories||action categories||adjective categories|
|subsistence, protection||biological, security||family, caregivers||home, health facilities||drink, food, clothing, shelter, body, health||feed, clothe, exercise, rest, take care of, help||health, adaptability, autonomy|
|affection, leisure, participation||social||family, friends||home, privacy, intimate spaces of togetherness, landscapes||feelings, nature, games, parties, customs, values, norms, communication technology||communicate, share, take care of, love, have fun, cooperate, dissent, express opinions||respect, generosity, imagination, receptiveness, dedication, humor, sense of belonging|
|understanding||growth||parents, teachers, mentors||homes, schools, communities||skills, work, techniques, problem solving, literature, education||learn, meditate, investigate, plan, grow||independent thought, curiosity, intuition|
|creation, identity, freedom||esteem||peers, mentees, community, society||associations, parties, churches, neighbourhoods, spaces for expression||morality, creativity, spontaneity, lack of prejudice, abilities, language, religions, rights||dream, remember, relax, invent, build, design, work, accomplish, interpret, commit, choose, risk, develop awareness||imagination, boldness, inventiveness, curiosity, self-esteem, consistency, autonomy, passion, open-mindedness|
Developing appropriate free and open source icons for this project is a challenge that the community/wiki could take on. Many users of Augmentative and Alternative Communication devices face visual, perceptual, and cognitive challenges, so icons should be as uncomplicated and transparent as possible. Examples: Mayer-Johnson symbols (www.mayer-johnson.com) are widely used in American schools because the stick drawings are easily scalable and widely considered the most transparent for more abstract ideas. They are less concrete than pictures, however, which might pose a problem for early learners. They are also heavily copyright protected, which conflicts with OLPC's software freedom standard.
Prentke Romich's symbols, also proprietary, support everything from early learning up to sophisticated semantic encoding that increases the rate of messages (e.g. swimming pool icon + color icon = blue, or swimming pool icon + activity icon = swim).
The Tango! by Blink Twice also has a unique encoding system for early learners.
My points are:
1) a large scale Free and Open Source icon library probably needs to be developed.
2) the function of the device also should be considered. For young children and many people with autism and other related conditions, requesting is the first skill worked on: asking for food/drink, controlling the other's actions to get needs met. For them, pages consisting of a simple "I want" that then branches to many different food items would be an ideal setup.
Other functions of communication include building social closeness with a close circle of people, transferring information to others, and participating in social interactions with the community ("how are you" / "excuse me" etc.). Each of these functions varies in terms of the importance of the specific content of each message, the importance of the semantics of the message, and whether the communication partner will be familiar or unfamiliar (a mom will be able to "read" a nonverbal child's gestures but a police officer might not). The device and page setups should be designed with these situations in mind.
It's always been my dream to make the XO into a sophisticated communication device. I've seen families spend thousands on devices that do not meet their children's needs and I would love to be involved with the project any way that I can.
Lesley,br. 01:32, 20 April 2008 (EDT)
Additional Enhancements and Uses
- Input devices:
- larger control surface with external USB trackpad / xpad (such as Wacom, <$100)
- touch panel (driver installation procedure under development) ~$140 E08 http://www.irtouchusa.com/e_pro_list.htm
- Johnny Lee's head tracking from $40 Wii Remote http://www.youtube.com/watch?v=Jd3-eiid-Uw
- head or eye motion driven pointing devices - USB? $? http://www.olpcaustria.org/mediawiki/index.php/Headtracker
- two switch step scanning http://alltogether.wordpress.com/2008/04/03/iconspeak-for-the-xo
- Additional languages & culturally-relevant icons
- scalability needed for this, in terms of ontology & GUI
- vectorize icons - consider method used by www.CopyArtwork.com
- Add to & change the vocabulary & icons with photos, utilizing the built-in OLPC XO camera.
- Run on smaller devices, such as mobile phones, music players, and PDAs with adequate speaker output.
- Ability to operate with more grammatical correctness for more formal situations such as public and educational settings.
- Teaching of reading & writing in native language.
- Teaching of second or foreign languages.
- Selectable foreign language or culture for speech output, enabling basic communication across languages or cultures.
- Recording the selections as near-ontological content warrants further discussion.
- could record these in the Journal
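The two-switch step scanning listed among the input devices above could work roughly as follows: one switch moves a highlight to the next icon, the other selects the highlighted icon. This is a minimal sketch of the general technique, not taken from the linked post or the project source. The class name and the choice to return the highlight to the first item after a selection are assumptions.

```python
class StepScanner:
    """Two-switch step scanning: one switch advances the highlight, the other selects."""

    def __init__(self, items):
        if not items:
            raise ValueError("need at least one item to scan")
        self.items = list(items)
        self.index = 0  # position of the currently highlighted item

    @property
    def highlighted(self):
        return self.items[self.index]

    def step(self):
        """First switch: move the highlight to the next item, wrapping at the end."""
        self.index = (self.index + 1) % len(self.items)
        return self.highlighted

    def select(self):
        """Second switch: return the highlighted item and restart from the first item."""
        chosen = self.highlighted
        self.index = 0
        return chosen


scanner = StepScanner(["people", "places", "things"])
scanner.step()           # highlight moves to "places"
scanner.step()           # highlight moves to "things"
print(scanner.select())  # things
```

Each selection would then open the next level of the hierarchy with a new StepScanner over that level's concepts.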
User Interface mock-up, as a slide presentation
Open the slide presentation file: http://wiki.laptop.org/images/e/ec/FreeIconToSpeech_UI_text_demo_02.ppt .
[Work in progress: Icons are not drawn into this diagram yet. So for the moment, imagine that each word in black is replaced by an icon representing that concept.]
Click "people", "mom", "create", "cook", "food", and "beans", imagining the interface zooming in to where your pointer travels, for easier selectability.
Then the computer would consider your selections complete, and speak them.
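The click sequence in this mock-up can be sketched in Python as below. The tiny concept trees, the function names, and the use of the espeak command line program for speech output are all assumptions made for illustration. The real vocabulary and the way the alpha version produces speech may differ.

```python
import subprocess

# Hypothetical fragments of the hierarchy, one tree per navigation area
# (subject, verb, object), each mapping a category to its concepts.
SUBJECTS = {"people": ["mom", "dad"]}
VERBS = {"create": ["cook", "build"]}
OBJECTS = {"food": ["beans", "rice"]}
AREAS = [SUBJECTS, VERBS, OBJECTS]


def sentence_from_clicks(clicks, areas=AREAS):
    """Turn alternating category/concept clicks into the words to speak."""
    if len(clicks) != 2 * len(areas):
        raise ValueError("expected one category and one concept per area")
    words = []
    for position, area in enumerate(areas):
        category, concept = clicks[2 * position], clicks[2 * position + 1]
        if concept not in area.get(category, []):
            raise ValueError(f"{concept!r} is not listed under {category!r}")
        words.append(concept)  # only the final concept in each area is spoken
    return " ".join(words)     # no attempt at grammatical accuracy


def speak(sentence, run=subprocess.run):
    """Speak the sentence, assuming the espeak program is installed."""
    run(["espeak", sentence], check=True)


clicks = ["people", "mom", "create", "cook", "food", "beans"]
print(sentence_from_clicks(clicks))  # mom cook beans
```

Passing the result to speak() would voice "mom cook beans" once the sixth click completes the selection.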
A presentation on an alternate interface: http://wiki.laptop.org/go/Image:FreeIconToSpeech_Alternative_User_Interface.ppt
Thanks for ideas contributed & discussed at PyCon 2008 by Tony Anderson, Lisa Beal, Annie Barkau, Ed Cherlin, & Mel Chua.
- RMattB 2008 03 17
Please add your thoughts. :)
I just checked the license for imaginesymbols.com. It actually says that the images can be used for non-commercial use. Free, open source software is non-commercial, and thus you will not be violating the terms of the licence by building the icons directly into the software. There is no need to have users download the images themselves/create a new set of icons.
Itamblyn 05:00, 25 November 2008 (UTC)
A request to use the icons was sent to imaginesymbols.com a few weeks ago, but they have not responded. It seems that re-distribution would bypass their requirement for email registration before each download. Perhaps it is time to get a legal opinion on this - more comments along those lines are very welcome. Thanks! --RMattB 05:35, 25 November 2008 (UTC)
I think you are misunderstanding what you are agreeing to when you register. The site states: "Imagine Symbols are available for download for non-commercial use. In order to download the symbols please fill out this simple registration." It then asks you to check off the box "I am using the symbols for non-commercial use only". This project is non-commercial. If they had meant "only for personal use" they would have said that.
Itamblyn 04:15, 26 November 2008 (UTC)
Is there a HOWTO detailing the steps involved to get Icon to Speech up and running as best possible on a fresh OLPC laptop? I see where the source code is, but if there's anything nontrivial about getting it working, I'd really appreciate a HOWTO. (An email would be great!) JonathanHayward 00:29, 16 July 2009 (UTC)