Speech Server

From OLPC
Revision as of 11:06, 1 February 2008 by Assim.deodia (talk | contribs)

Speech Synthesis Server

Description

An easy-to-use API for speech synthesis, intended for self-voicing activities.

Existing Tools Present

Please note that this is work in progress, and this API is currently NOT STABLE.

Modifiable Speech Parameters

Refer to the espeak library API to understand their usage.

espeakRATE        = 1
espeakVOLUME      = 2
espeakPITCH       = 3
espeakRANGE       = 4
espeakPUNCTUATION = 5
espeakCAPITALS    = 6
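The numeric identifiers above are easy to mix up in client code. A minimal sketch of a name-to-identifier mapping follows. The helper `set_parameter` and the dictionary `ESPEAK_PARAMETERS` are hypothetical names that are not part of the speech server, and `speech` is assumed to be any object exposing the SetParameter() method described later on this page.

```python
# Identifiers copied from the list above; the dictionary itself is illustrative.
ESPEAK_PARAMETERS = {
    "rate": 1,         # espeakRATE
    "volume": 2,       # espeakVOLUME
    "pitch": 3,        # espeakPITCH
    "range": 4,        # espeakRANGE
    "punctuation": 5,  # espeakPUNCTUATION
    "capitals": 6,     # espeakCAPITALS
}


def set_parameter(speech, name, value):
    """Call SetParameter() on `speech` using a readable parameter name.

    `speech` is assumed to be a D-Bus proxy (or any object) with a
    SetParameter(int, int) method. Returns the numeric identifier used.
    """
    try:
        param_id = ESPEAK_PARAMETERS[name]
    except KeyError:
        raise ValueError("unknown speech parameter: %r" % (name,))
    speech.SetParameter(param_id, int(value))
    return param_id
```

With a proxy obtained as in the Python examples on this page, `set_parameter(espeak_object, "rate", 60)` would be equivalent to `espeak_object.SetParameter(1, 60)`.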

Modifiable Voice Parameters

Follow the espeak Voice Description

language   -> language code (in standard notation)
name       -> a given name for this voice. Should be set to NULL by default.
identifier -> the filename for this voice within espeak-data/voices.
              Should be set to NULL by default.
gender     -> voice gender (0 = unknown, 1 = male, 2 = female)
age        -> 0 by default; espeak sets it automatically.
variant    -> selects a variant of the voice. Preferably 0.
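The defaults listed above can be collected in one place. The following is a sketch only: `voice_arguments` and `GENDER_CODES` are hypothetical names, and the empty string stands in for NULL because D-Bus strings cannot be null (the SetVoice() example on this page passes "" in the same way). The argument order follows the SetVoice() signature given later on this page.

```python
# Gender codes as listed above.
GENDER_CODES = {"unknown": 0, "male": 1, "female": 2}


def voice_arguments(language, gender="unknown", name="", identifier="",
                    age=0, variant=0):
    """Return the six SetVoice() arguments with the documented defaults.

    Order: name, languages, identifier, gender, age, variant.
    """
    if gender not in GENDER_CODES:
        raise ValueError("gender must be one of %s" % sorted(GENDER_CODES))
    return (name, language, identifier,
            GENDER_CODES[gender], int(age), int(variant))
```

A caller could then write `espeak_object.SetVoice(*voice_arguments("fr", gender="female"))`, which passes the same values as the French example on this page.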

A speech synthesis dbus daemon service

  • A dbus-glib based speech server written in C.
  • The speech server reuses the libespeak library.
  • Methods are exposed over D-Bus so that all XO activities can use them globally.

Speech Server Dbus Methods

SayText()

This D-Bus method accepts an incoming UTF-8 string and plays it back.

SayText(string text)
  • Speaks the text passed in text.
  • text must be a valid UTF-8 string.
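Because SayText() requires valid UTF-8, a client that reads text from files or other activities may want to check it first. The sketch below is written for Python 3, and `to_utf8_text` is a hypothetical helper that is not part of the speech server.

```python
def to_utf8_text(data):
    """Return `data` as text, rejecting byte strings that are not valid UTF-8."""
    if isinstance(data, bytes):
        try:
            return data.decode("utf-8")
        except UnicodeDecodeError as exc:
            raise ValueError("text is not valid UTF-8: %s" % exc)
    # Already text (or something printable); pass it through as a string.
    return str(data)
```

The result can be handed to the proxy directly, for example `espeak_object.SayText(to_utf8_text(raw_bytes))`, where `raw_bytes` is whatever the activity read.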

Example Python Code:

import dbus
bus = dbus.SessionBus()
espeak_object = bus.get_object('org.laptop.Speech','/org/laptop/Speech')
espeak_object.SayText("Hello world! Hey I can talk to you")

SetVoice()

This D-Bus method sets the voice parameters for espeak and can be used to configure its voice properties. Refer to Modifiable Voice Parameters for details.

SetVoice(String name,
	 String languages,
	 String identifier,
	 int gender,
	 int age,
	 int variant)

Python Example:

import dbus
bus = dbus.SessionBus()
espeak_object = bus.get_object('org.laptop.Speech','/org/laptop/Speech')
#Choose a female voice to speak French text
espeak_object.SetVoice("","fr", "", 2,0,0)
espeak_object.SayText("Je suis une etudiante!")

SetParameter()

This Dbus Method accepts parameters to set the speech parameters for espeak. Refer to the Modifiable Speech Parameters for more details.

SetParameter(int PARAMETER_NAME, int PARAMETER_VALUE)

Python Example:

import dbus
bus = dbus.SessionBus()
espeak_object = bus.get_object('org.laptop.Speech','/org/laptop/Speech')
#Set espeakRATE (parameter 1) to 60 words per minute
espeak_object.SetParameter(1, 60)
espeak_object.SayText("I am a very lazy speaker!")

GetConfiguration()

This D-Bus method returns a dbus.Structure holding the current settings of the espeak service. It is needed to display the present settings in the control panel that will be made available for tuning the espeak parameters.

GetConfiguration()

Python Example:

import dbus
bus = dbus.SessionBus()
espeak_object = bus.get_object('org.laptop.Speech','/org/laptop/Speech')
espeak_object.SetParameter(1, 60)
espeak_object.SetVoice("","en-uk", "", 2,0,0)
val = espeak_object.GetConfiguration()
print(val)

SaveConfiguration()

This Dbus Method allows the user to save the current espeak parameters.

SaveConfiguration()

Python Example:

import dbus
bus = dbus.SessionBus()
espeak_object = bus.get_object('org.laptop.Speech','/org/laptop/Speech')
espeak_object.SetParameter(1, 60)
espeak_object.SetVoice("","en-uk", "", 2,0,0)
espeak_object.SaveConfiguration()
#Will overwrite the existing user settings


LoadConfiguration() and LoadDefaultConfiguration()

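These D-Bus methods set all espeak parameters at once. LoadConfiguration() restores the user's preferences, which must have been saved beforehand with SaveConfiguration(); LoadDefaultConfiguration(), as its name suggests, loads the default settings instead.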

LoadConfiguration()
or 
LoadDefaultConfiguration()

Python Example:

import dbus
bus = dbus.SessionBus()
espeak_object = bus.get_object('org.laptop.Speech','/org/laptop/Speech')
espeak_object.LoadConfiguration()
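A client may want to fall back to the defaults when no saved preferences can be loaded. The sketch below assumes that LoadConfiguration() raises an exception in that case (with dbus-python this would typically surface as a dbus.DBusException); this page does not state how the server reports the failure, so treat that as an assumption. `load_settings` is a hypothetical helper.

```python
def load_settings(speech):
    """Load the saved user settings, falling back to the defaults.

    `speech` is assumed to be a proxy exposing LoadConfiguration() and
    LoadDefaultConfiguration(). Returns "user" or "default" to indicate
    which configuration ended up being loaded.
    """
    try:
        speech.LoadConfiguration()
        return "user"
    except Exception:
        # Assumption: a failed load is reported as an exception.
        # Narrow this to dbus.DBusException in real client code.
        speech.LoadDefaultConfiguration()
        return "default"
```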


Installing speech-dispatcher on the xo

Download the RPM packages named in the commands below to a pen drive, or install them through yum.

The steps below bring up the speech server by installing all packages from the RPMs, assuming the packages are at the root of your pen drive.

Run the following commands as root:

cd /media/[name of pen drive]/
rpm --install dotconf-1.0.13-1.fc5.i386.rpm 
rpm --install --nodeps speech-dispatcher-0.6.1-1.fc5.i386.rpm

At this stage you may get some warnings or errors about festival/flite and python-abi not being present. You can safely ignore them.

Next, modify the configuration files to make speech-dispatcher use eSpeak (already available on the XO):

vi /etc/speech-dispatcher/speechd.conf

Modify the file as follows (uncomment the AddModule "espeak-generic" line and change the DefaultModule line):

#AddModule "flite"        "sd_flite"     "flite.conf"  "/var/log/speech-dispatcher/flite.log"
AddModule "festival"     "sd_festival"  "festival.conf" "/var/log/speech-dispatcher/festival.log"
AddModule "espeak-generic" "sd_generic" "espeak-generic.conf" "/var/log/speech-dispatcher/espeak.log"
#AddModule "epos-generic" "sd_generic"   "epos-generic.conf" "/var/log/speech-dispatcher/epos.log"
#AddModule "dtk-generic"  "sd_generic"   " dtk-generic.conf" "/var/log/speech-dispatcher/dtk-generic.log"
#AddModule "ibmtts"       "sd_ibmtts"    "ibmtts.conf" "/var/log/speech-dispatcher/ibmtts.log"
#AddModule "cicero"        "sd_cicero"     " cicero.conf"  "/var/log/speech-dispatcher/cicero.log"

# The output module testing doesn't actually connect to
# anything. It outputs the requested commands to standard output
# and reads responses from standard input. This way, Speech Dispatcher's
# communication with output modules can be tested easily.

# AddModule "testing"

# DefaultModule selects which output module is the default.
# You must use one of the modules loaded with AddModule.

#DefaultModule flite
DefaultModule espeak-generic
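The same two changes can be scripted instead of made by hand in vi. The sketch below works on the file contents as a string and assumes the stock speechd.conf ships with the "espeak-generic" AddModule line commented out; `use_espeak_generic` is a hypothetical helper, not part of speech-dispatcher. Back up the original file before writing the result to /etc/speech-dispatcher/speechd.conf as root.

```python
import re

# A commented-out AddModule line for espeak-generic.
ADD_ESPEAK = re.compile(r'^\s*#\s*(AddModule\s+"espeak-generic".*)$')
# An active (uncommented) DefaultModule line.
ACTIVE_DEFAULT = re.compile(r'^\s*DefaultModule\s+\S+')


def use_espeak_generic(conf_text):
    """Return conf_text with espeak-generic enabled and set as the default."""
    lines = []
    default_set = False
    for line in conf_text.splitlines():
        match = ADD_ESPEAK.match(line)
        if match:
            line = match.group(1)  # drop the leading '#'
        elif ACTIVE_DEFAULT.match(line):
            line = "DefaultModule espeak-generic"
            default_set = True
        lines.append(line)
    if not default_set:
        # No active DefaultModule line was found, so add one at the end.
        lines.append("DefaultModule espeak-generic")
    return "\n".join(lines) + "\n"
```

Explanatory comments such as "# DefaultModule selects which output module is the default." are left untouched because they start with "#".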

Next, fix the command line that speech-dispatcher uses to invoke espeak.

vi /etc/speech-dispatcher/modules/espeak-generic.conf

Modify the file as follows:

GenericExecuteSynth "espeak --stdout -v $VOICE -s $RATE -a $VOLUME -p $PITCH \"$DATA\"|aplay"
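When the service stays silent, it can help to see what the template above looks like once its variables are filled in, and to run that command by hand. The sketch below is only an illustration using Python's string.Template: speech-dispatcher performs its own substitution, and the sample values for voice, rate, volume and pitch are made up. `expand_synth_command` is a hypothetical helper.

```python
from string import Template

# The GenericExecuteSynth value from above, without the config-file escaping.
SYNTH_TEMPLATE = Template(
    'espeak --stdout -v $VOICE -s $RATE -a $VOLUME -p $PITCH "$DATA"|aplay'
)


def expand_synth_command(voice, rate, volume, pitch, data):
    """Fill in the template variables and return the resulting shell command.

    No shell quoting is attempted: text containing double quotes would
    break the command, so use this for manual debugging only.
    """
    return SYNTH_TEMPLATE.substitute(
        VOICE=voice, RATE=rate, VOLUME=volume, PITCH=pitch, DATA=data
    )
```

For instance, `expand_synth_command("en", 160, 100, 50, "Yes this should work")` yields a command line that can be pasted into a terminal on the XO to check that espeak and aplay work together.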

Now start the speech-dispatcher service and test whether it works correctly:

speech-dispatcher -d
spd-say "Yes this should work"

Voice Files

These are some voice samples which give better voice quality than the default ones. To use these files:

  • Create a new file in the eSpeak\espeak-data\voices folder, named for example testvoice.
  • Copy one of the voice definitions below into that file and save it.
  • Run in a terminal: espeak -vtestvoice "testing new voice"

name english
language en-uk  2
gender male

pitch 82 117
replace 03 I i
replace 03 I2 i
echo 30 30
formant 0 100 100 150
voicing 200

name english
language en-uk  2
gender female

pitch 82 100
echo 10 25
formant 0 100 100 150
voicing 100
roughness 1
flutter 1
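The three steps above can also be scripted. The sketch below writes the first voice definition from this page into a voices directory and returns the espeak command that uses it. `write_voice_file` is a hypothetical helper, and the location of the voices directory depends on how espeak is installed, so it is passed in rather than guessed.

```python
import os

# First voice definition from this page, reproduced unchanged.
SAMPLE_VOICE = """name english
language en-uk  2
gender male

pitch 82 117
replace 03 I i
replace 03 I2 i
echo 30 30
formant 0 100 100 150
voicing 200
"""


def write_voice_file(voices_dir, voice_name, definition=SAMPLE_VOICE):
    """Save a voice definition as voices_dir/voice_name.

    Returns the espeak command (as an argument list) that selects the voice.
    """
    path = os.path.join(voices_dir, voice_name)
    with open(path, "w") as handle:
        handle.write(definition)
    return ["espeak", "-v" + voice_name, "testing new voice"]
```

The returned list can be passed to `subprocess.call()` on a machine where espeak is installed.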

Codebase

The code for the project is available in the Screen Reader git repository.

Team

Core Team :

  • Assim Deodia
  • Cody Lodrige
  • Hemant Goyal

Mentor : Arjun Sarwal