Speech synthesis

From OLPC
Revision as of 23:00, 27 September 2007

This article is a stub. You can help the OLPC project by expanding it.


Scope

This article is for collecting ideas and resources for using text-to-speech (TTS) speech synthesis on the XO.

espeak

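eSpeak is currently included on the XO. For example, the command below writes the synthesized speech as WAV data to standard output and plays it through a GStreamer pipeline to ALSA: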

$ espeak --stdout "Hello listening world!" | gst-launch fdsrc fd=0 ! wavparse ! alsasink
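
Since localization matters here, it may help to wrap the pipeline above in a small shell function that also selects a voice. The following is only a sketch: the function name speak and the SPEAK_DRY_RUN variable are hypothetical, and it assumes the same gst-launch pipeline as above is available. The -v option is eSpeak's voice selector.

```shell
#!/bin/sh
# speak TEXT [VOICE]
# Hypothetical helper, not part of eSpeak or the XO software.
# VOICE defaults to "en". If SPEAK_DRY_RUN is non-empty, the espeak
# command is printed instead of being run (useful without audio hardware).
speak() {
    text=$1
    voice=${2:-en}

    if [ -n "${SPEAK_DRY_RUN:-}" ]; then
        printf 'espeak -v %s --stdout "%s"\n' "$voice" "$text"
        return 0
    fi

    if ! command -v espeak >/dev/null 2>&1; then
        echo "speak: espeak not found" >&2
        return 1
    fi

    # Same pipeline as the example above: WAV on stdout, played via ALSA.
    espeak -v "$voice" --stdout "$text" |
        gst-launch fdsrc fd=0 ! wavparse ! alsasink
}
```

For instance, speak "Hola mundo" es would request the Spanish voice, assuming that voice is installed.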


Existing software

There are free and open-source software (FOSS) speech synthesis packages that run on devices comparable to the XO. We are much more concerned with localization than is typical, and dialects can be a political issue. But TTS would help with accessibility, and it could be very cool.

Speech synthesis involves a complex set of trade-offs among synthesizer size, fidelity, and the effort needed to localize to a new language. The Wikipedia speech synthesis article discusses the software that is available, which includes Festival, Flite, and eSpeak.

eSpeak is small enough that we can often bundle it, and it covers quite a few languages: about 10 are currently supported and have been tuned by native speakers. Localization to ten more languages is underway.
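
To see which languages a given eSpeak installation actually covers, the output of espeak --voices can be summarized. The sketch below is an assumption-laden illustration: the function name list_voice_langs is hypothetical, and it assumes the listing has one header line followed by rows whose second column is the language code.

```shell
#!/bin/sh
# list_voice_langs: read an "espeak --voices" listing on stdin and
# print the distinct language codes, one per line.
# Hypothetical helper; assumes a one-line header and that the
# language code is the second whitespace-separated column.
list_voice_langs() {
    awk 'NR > 1 { print $2 }' | sort -u
}
```

It would be used as espeak --voices | list_voice_langs.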

Synthesis is essential for making content accessible to people with vision problems, and it will need to be integrated with the ATK accessibility library. It is also useful for literacy training and for other uses as part of a GUI. Full localization therefore involves selecting a suitable synthesis system, integrating it into the ATK framework, and localizing that system for the particular language involved.

Speech synthesis is usually not a good guide to pronunciation, but it may be better than a poor teacher who has never had the opportunity to learn from a native speaker of that language.

The state of the art

Commercial text-to-speech programs are now very good. The examples at the Digital Future Software Company site are very clear. They use AT&T technology and provide examples of male and female speech in English, French, and Spanish. The XO needs open-source software that can approach this quality in a wide range of languages. --Ricardo 04:07, 17 August 2007 (EDT)

Resources

See also