Speech in STD97

This file contains information about the lab in the Speech part of the course Språkteknologiska Delområden VT97. This lab contains two opportunities to test speech software and one assignment.

Speech Recognition Test
Speech Synthesis Test
Speech Synthesis Assignment

The software which you will use in this lab is called UMS (Ultimedia Services). The software and documents for UMS can be found in the directory:

/usr/lpp/UMS

This AIX software is machine-dependent, but it should work on all machines in the lab room. If you discover any problems, please report them to your teacher.

1. Speech Recognition Test

In this part of the lab you will test the speech recognition capabilities of the Ultimedia Services software. You need a microphone for this part. If you are unable to get one, continue with another part of this lab and do the speech recognition part later.

At the back of the computer box under the table you will find four round sockets. Put the microphone plug in the socket that is closest to the fan (the top one). Now start the program:

run_ums DtNavigate

Two windows should appear on the screen: the Speechbar window and the Navigator window. The Speechbar window controls the microphone. The Navigator window shows you the speech commands that you can use to control your window system. It also has a facility for adding new commands.

Now turn the microphone on by choosing Recognizer from the menu bar in the Speechbar window. The background color of the microphone image will change from grey to white. You can now test the speech recognition capabilities of the system by reading some commands from the Global List in the Navigator window. The program will show each command that it has recognized in the Speechbar window. How many commands from the Global List does the program recognize from your voice?

If you want to know more about speech recognition using UMS you can read the help texts that are available from the Navigator window or read the file:

/usr/lpp/UMS/speech/speech_reco/README.NAVIGATION

2. Speech Synthesis Test

In this part of the lab you will test the speech synthesis capabilities of the Ultimedia Services software. Start by entering the command:

speak

Three windows will be opened:

The tnt window (Type aNd Talk): you can enter text in this window and control the pitch and the tempo with which it will be spoken.
The talking head window: the talker in this window will perform as an actor that is reading your text.
The Master Volume window: you can control the volume of the spoken text with the scrollbar in this window.

Now enter a few English words in the tnt window and press the Speak button to have them spoken aloud. Test different settings for volume, pitch and tempo to see which ones you like best. Do you like the performance of this speech synthesis program? How does it perform for Swedish sentences?

If you want to know more about speech synthesis using UMS you can read the help text that is available from the tnt window or read the file:

/usr/lpp/UMS/Demos/tts/README

You can also test a few speech synthesis sites on the World Wide Web. There is a list of them on the speech links page for this course.

3. Speech Synthesis Assignment

In this assignment you will adapt the speech dictionary of the Ultimedia Services (UMS) speech software in such a way that it can pronounce a small piece of Swedish text in a reasonable way. Before you do this assignment you should have done the speech synthesis test. You will write a small report about this assignment. The mark you will receive for the report and the adapted dictionary will be your mark for the speech processing part of this course. You can either do this exercise on your own or work in a pair.

3.1 Introduction

In the speech synthesis test you performed text-to-speech conversion by using the X Windows application speak with the talking head. UMS also has a facility for performing text-to-speech conversion by using simple commands. You can for example type:

echo "i Uppsala" | txt2spch

after which you will hear the prepositional phrase "i Uppsala". You may want to adjust the volume of the speech output. This can be done by starting the Master Volume application:

volume

You will have noticed that the pronunciation of the Swedish preposition "i" is incorrect. The software will pronounce this preposition as the English pronoun "I". Fortunately it is possible to change this. The software makes use of a speech dictionary which can be edited:

run_ums dictedit

This command will start the dictionary editor. You can choose an arbitrary word from the Dictionary Contents field by clicking on the word. It will appear in the top field and a phonetic representation of the word will appear under it. You can make the program pronounce the word or the phonetic representation. You can modify the entry or delete it.

The word "i" does not have an entry in the dictionary so we will add it:

Enter "i" in the English Word field.
Enter "<<~IYIY>>" in the Sounds like field.
Test the pronunciation to verify that this is the sound you are looking for.
Click on Insert Entry to insert the entry in the dictionary.
Click on Save Dictionary to save the dictionary in one of your own directories.
Warning: don't save the dictionary in the default system directory. You will not get any error message and the dictionary will be lost.

You can now make the UMS software use your dictionary by setting the TTSDICTIONARY environment variable to the file you saved:

export TTSDICTIONARY=yourDirectory/yourFile
echo "i Uppsala" | txt2spch

You should now hear a better approximation of the Swedish preposition "i" instead of the English pronoun "I". If you have managed that, you are ready to start with the main task of this assignment. Otherwise, try to find out what went wrong or ask for help.
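
The two commands above can be wrapped in a small shell function that fails early when the dictionary file cannot be read, which is one common reason for still hearing the English pronunciation. This is only a sketch: say_with_dict is a hypothetical helper name, and the only things taken from this lab text are the TTSDICTIONARY variable and the txt2spch command.

```shell
#!/bin/sh
# Sketch: speak a phrase using a personal dictionary.
# say_with_dict is a hypothetical helper, not part of UMS.

say_with_dict() {
    dict=$1    # path to your saved dictionary file
    text=$2    # phrase to be spoken

    # Stop early if the dictionary file is missing or unreadable.
    if [ ! -r "$dict" ]; then
        echo "dictionary not readable: $dict" >&2
        return 1
    fi

    # Stop if the txt2spch command is not available on this machine.
    if ! command -v txt2spch >/dev/null 2>&1; then
        echo "txt2spch not found on PATH" >&2
        return 2
    fi

    # Same two steps as in the lab text: point UMS at the dictionary,
    # then pipe the phrase into txt2spch.
    TTSDICTIONARY=$dict
    export TTSDICTIONARY
    echo "$text" | txt2spch
}
```

A call would look like say_with_dict yourDirectory/yourFile "i Uppsala", with the same placeholder path as above.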

3.2 Main Assignment Task

Your teacher has a series of envelopes, each containing a paper with a small piece of Swedish text. There is one envelope per student. Get one of these. Your task is to adapt the dictionary of the UMS software in such a way that it can pronounce that piece of Swedish text in a reasonable way. You do not need to specify all words of your text in the dictionary, since some words may already be pronounced reasonably. You can do this assignment on your own or work in a pair.

A perfect result cannot be achieved in this assignment because certain Swedish sounds are not available in the UMS software. A reasonable goal is to improve the pronunciation of Swedish words.

3.3 Tips

The sound definition contains five obligatory characters: three prefix characters <<~ and two suffix characters >>.
You cannot use the three vowels å, ä and ö in the dictionary because the speech software does not pronounce them and treats them as word boundaries. You may choose as alternatives: aa for å, ae for ä and oe for ö. If this gets you into trouble, please consult the teacher so we can choose the same solution for everyone.
To do this assignment you will need a list of phonemes that can be used in the dictionary and a list of words that are present in the English dictionary. If you do not know which phoneme to use for a Swedish word, try to think of an English word that has a similar sound and use the phoneme that the dictionary uses for that word.
Save your dictionary often and make a backup for yourself. If the dictedit program crashes, you may lose your dictionary.
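
Three of the tips above can be sketched as small shell helpers. All function names are hypothetical. The capital-letter replacements (Aa, Ae, Oe) are an assumption that goes beyond the lowercase alternatives named above, and the script assumes your text is stored in the same character encoding as the script itself.

```shell
#!/bin/sh
# Sketch of helpers for the tips above; none of these are part of UMS.

# Check that a sound definition has the obligatory <<~ prefix and
# >> suffix with at least one character in between.
is_sound_def() {
    case $1 in
        '<<~'?*'>>') return 0 ;;
        *)           return 1 ;;
    esac
}

# Rewrite å, ä, ö as aa, ae, oe so the text can be typed as dictionary
# words. The capital-letter mappings are an assumption.
to_dict_spelling() {
    sed -e 's/å/aa/g' -e 's/ä/ae/g' -e 's/ö/oe/g' \
        -e 's/Å/Aa/g' -e 's/Ä/Ae/g' -e 's/Ö/Oe/g'
}

# Copy the dictionary to a timestamped backup next to the original
# and print the name of the backup file.
backup_dict() {
    dict=$1
    stamp=$(date +%Y%m%d-%H%M%S)
    cp "$dict" "$dict.$stamp.bak" && echo "$dict.$stamp.bak"
}

# Small demonstration of the spelling helper.
echo "på en ö i Östersjön" | to_dict_spelling
# prints: paa en oe i Oestersjoen
```

Running backup_dict yourDirectory/yourFile before each dictedit session leaves a copy to fall back on if the editor crashes.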

3.4 Report

You will have to write a report of at least two pages (three if you work in a pair) about this assignment. No more than two people can submit the same report. Your report should contain at least the following parts:

Introduction: a description of your task in this assignment.
Speech synthesis in Swedish: here you describe what you have done, what things you couldn't find a satisfactory solution for and the things that could be solved properly.
Concluding remarks: general conclusions about the assignment you have done and your expectations for the development of speech synthesis in the future.

You can add your Swedish dictionary entries to the report in an appendix on a separate page. Note: if you want to display the strings that are in your dictionary then use the command:

tr -s '\000' '\012' < yourDict | tail +1850 | more
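
To see what this pipeline does, the sketch below applies the same tr step to a toy file. The toy file only imitates strings separated by NUL bytes and makes no claim about the real UMS dictionary layout. The name list_dict_strings is hypothetical. The sketch uses tail -n +N, the POSIX spelling of tail +N, for systems that reject the older form.

```shell
#!/bin/sh
# Sketch: list the strings in a NUL-separated file, one per line.
# list_dict_strings is a hypothetical helper, not part of UMS.

list_dict_strings() {
    file=$1
    start=${2:-1}    # first output line to show (default: line 1)
    # tr turns every NUL byte (\000) into a newline (\012); -s squeezes
    # runs of newlines into one. tail -n +N starts printing at line N.
    tr -s '\000' '\012' < "$file" | tail -n +"$start"
}

# Toy stand-in for a dictionary file (illustrative data only).
demo=$(mktemp)
printf 'i\000\000\000<<~IYIY>>\000\000Uppsala\000' > "$demo"

list_dict_strings "$demo"      # all three strings, one per line
list_dict_strings "$demo" 2    # the same listing from line 2 onwards

rm -f "$demo"
```

In the command above, tail +1850 plays the role of the start argument: the listing begins at line 1850 of the converted output.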

You can write your report in English or in Swedish. Your report and your extended dictionary will be graded with a mark between 1 and 10 (inclusive). The deadline for handing in the reports is Sunday May 25, 1997. Reports handed in after that day will receive a penalty of 1 point per extra day.

Last update: May 27, 1997. erikt@stp.ling.uu.se