Maxine Eskenazi Tel. 412 268-3858
GOALS
Research Goals
Understanding the factors that affect the variability of the speech signal. Creating automatic systems (automatic speech recognition and synthesis) that draw on this knowledge and that provide real benefit to end users. This endeavor implies studying groups of speakers, input conditions, and styles of speech, and detecting the acoustic and upper-level indices that are indicative of these variants.
Research Interests
I am interested in the variability of the speech signal – its sources and manifestations, whether they stem from groups of speakers, from variation in the manner in which they speak, or from the conditions in which they find themselves. Non-native speech is one particular interest within this area, as is speaking style.
I am also interested in the manner in which a foreign language can be taught effectively, either by a human or by a computer. This involves the presentation of the information, the choice of which information to present, and the manner in which that information is chosen. One present interest here is in a Gestaltist approach to teaching the new sounds of a second language. Another interest is in teaching culture using “pinpointing” as it was developed in my work on non-native pronunciation error detection, where the specific error is shown in context and corrective help is offered specific to the error. That system to detect and correct foreign speakers’ pronunciation errors in English is called Fluency, and the basic algorithms developed in that project have been spun off into the NativeAccent™ product sold by the company I started, Carnegie Speech™. So, I am very interested in seeing research results in use in real life!
- Projects
Fluency – a project to use automatic speech recognition to detect pronunciation errors and to provide appropriate correction information – contact “max at cs dot cmu dot edu” for more information.
Let’s Go – a project using a spoken dialogue system to expand access to such systems to the elderly and to non-native speakers. http://www.speech.cs.cmu.edu/letsgo/
REAP – a project to retrieve appropriate, individuated texts for students learning to read http://hartford.lti.cs.cmu.edu/Reap/
- Students
PhD – Antoine Raux
MLT – Jonathan Brown
MCALL – Jong Hyun Lee
CS Undergrad – Aleata Hubbard
- Publications
Computer-Assisted Language Learning
* Pronunciation
Eskenazi, M., 1999, Issues in the use of speech recognition for foreign language tutors, invited paper, Language Learning and Technology Journal (online), Vol. 2, No. 2, January 1999, pp. 62-76. http://llt.msu.edu/vol2num2/article3/index.html
Probst, K., Ke, Y., Eskenazi, M., 2002, Enhancing foreign language tutors - in search of the golden speaker, Speech Communication, 37/3-4 pp. 161-173.
Eskenazi, M., Pelton, G., 2002, Pinpointing pronunciation errors in children’s speech: examining the role of the speech recognizer, Proposed to the Pronunciation Modeling and Lexicon Adaptation for Spoken Language Technology Workshop, Sept 2002, Colorado.
Eskenazi, M., Ke, M., Albornoz, J., Probst, K., 2000, Update on the Fluency Pronunciation Trainer, in Proceedings of InSTIL 2000, Dundee.
Mayfield Tomokiyo, L., Wang, L., Eskenazi, M., 2000, An Empirical Study of the Effectiveness of Speech-Recognition-based Pronunciation Training, Proc. ICSLP 2000, Beijing.
Eskenazi, M., Hansma, S., 1998, The Fluency Pronunciation Trainer, Proc. STiLL Workshop on Speech Technology in Language Learning, Marholmen, May.
Eskenazi, M., Hansma, S., Semp, M., Warner, R., 1998, By ear and by eye - adaptive tutoring for foreign language pronunciation training, in Proc. STiLL Workshop on Speech Technology in Language Learning, Marholmen.
* Reading
Brown, J., Eskenazi, M., 2004, Retrieval of Authentic Documents for Reader-Specific Lexical Practice, Proceedings INSTIL 2004, Venice, Italy.
* Non-native speech
Raux, A., Eskenazi, M., 2004, Using Task-Oriented Spoken Dialogue Systems for Language Learning: Potential, Practical Applications and Challenges, Proceedings INSTIL 2004, Venice.
Raux, A., Eskenazi, M., 2004, Non-native users in the Let’s Go!! Spoken Dialogue System: Dealing with Linguistic Mismatch, Proceedings HLT 2004, Boston.
Raux, A., Langner, B., Black, A., Eskenazi, M., 2003, LET’S GO: Improving Spoken Dialog Systems for the Elderly and Non-natives, Proc. Eurospeech 2003, Geneva.
* Elderly speech
Eskenazi, M., Black, A., Simmons, R., 2002, Elderly Perception of Speech from a Computer, Meeting of the Acoustical Society of America, Pittsburgh, June 2002.
Eskenazi, M., Black, A., 2001, A study on speech over the telephone and aging, Proc. Eurospeech01, Aalborg, Denmark, September 2001.
Speaking Styles
Eskenazi, M., 1993, Trends in Speaking Style Research, Keynote speech, Proceedings Eurospeech’93, Berlin.
Eskenazi, M., 1995, Hot Topics in Speaking Style Research, in European Studies in Phonetics and Speech Communication, Bloothooft, Hazan, Huber, Llisterri, eds., OTS Publications, The Netherlands. P. 58 - 62.
Eskenazi, M., Lacheret, A., 1991, Exploration of individual strategies in continuous speech, Speech Communication, vol. 10 no. 3.
Eskenazi, M. 1992. Changing speech styles, speakers’ strategies in read speech and careful and casual spontaneous speech. Proceedings of the International Conference on Spoken Language Processing, Banff.
Spoken Dialogue
Eskenazi, M., 1998, User Come Back, DARPA Communicator Compare and Contrast Meeting, June 16-17, 1998.
Ravishankar, M. and Eskenazi, M., 1997, Automatic Generation of Context-dependent Pronunciations, Proc. Eurospeech ’97, Rhodes, Greece, p. 2467 - 2470.
Placeway, P., Chen, S., Eskenazi, M., Jain, U., Parikh, V., Raj, B., Ravishankar, M., Rosenfeld, R., Seymore, K., Siegler, M., Stern, R., Thayer, E., 1997, The 1996 HUB-4 Sphinx-3 System, Proc. DARPA Speech Recognition Workshop, Chantilly, Virginia, Morgan Kaufmann Publishers.
Seymore, K., Chen, S., Eskenazi, M., Rosenfeld, R., 1997, Language and Pronunciation Modelling in the CMU 1996 HUB-4 Evaluation, Proc. DARPA Speech Recognition Workshop, Chantilly, Virginia, Morgan Kaufmann Publishers.
Data collection and assessment
Eskenazi, M., Rudnicky, A., Gregory, K., Constantinides, P., Brennan, R., Bennett, C., Allen, J., 1998, Data Collection and Processing in the Carnegie Mellon Communicator, in Proc. ESCA Eurospeech 98.
Eskenazi, M., 1996, KIDS: A Database of Children's Speech, in Proc. 3rd joint Meeting: Acoustical Societies of America and Japan, Honolulu.
Eskenazi, M., Hogan, C., Allen, J., Frederking, R., 1998, Issues in database design: Recording and processing speech from new populations, Proc. LREC Assessment and Database Workshop, Granada, Spain.
Lamel, L., Gauvain, JL., Eskenazi, M., 1991, BREF, a Large Vocabulary Spoken Corpus for French, in Proc. EUROSPEECH-91.
AFNOR, 1990, norme experimentale S 31-115, Evaluation de systemes de traitement automatique de la parole Partie 1: Definitions et methode d'evaluation de systemes de reconnaissance automatique de la parole - systemes de reconnaissance globale.
Cochlear Implants
Eskenazi, M., Vormes, E., Monguillot, G., Frachet, B., 1993, A new training and assessment technique for cochlear implants, in Advances in Cochlear Implants, Hochmair-Desoyer and Hochmair eds., International Science Seminars, Vienna, Austria, p. 572-577.
* Speech class: 11-752 Production, Prosody and Synthesis, taught with Alan Black
* Language Technologies: 11-717 Language Technologies for Computer-Assisted Language Learning, taught with Lori Levin and Teruko Mitamura
- Carnegie Speech Company
In 2001, Jaime Carbonell and I started the Carnegie Speech Company. The company produces software for teaching and assessing ESL. It has received funding from Innovation Works and from the state of Pennsylvania, and it has had SBIR grants from the US Department of Education and the National Science Foundation. You can find out all about it at: www.carnegiespeech.com
Here are some things that may be of interest to you.
2. AUTOMATIC SPEECH RECOGNIZER: The first time I used SPHINX II, I was astounded at how robust it could be. It is not perfect – none are, as we all know. But with an understanding of the strong and weak points of the recognizer and some smart engineering, it is possible to modify it to perform nicely in well-defined applications (like Carnegie Speech™’s NativeAccent™). It is also open source software and can be found at: http://www.cmusphinx.org . An important element in getting the recognizer to work well in a new application is to train it with data that is representative of the speakers who will use the application and of the language they will use to express themselves. Carnegie Speech™ sells licenses to YOUTH, a database of children’s speech that we put together during our Department of Education SBIR. This can be used to train the recognizer for applications for kids from about 6 to 11. Several commercial applications successfully use this data in their products.
3. AN AUTOMATIC SPEECH RECOGNITION (ASR) DIALOGUE SYSTEM: One of the precursors of our Let’s Go dialogue system, and one of the best known, is the Galaxy system from MIT. http://www.sls.csail.mit.edu/GALAXY.html
4. NEW FINDINGS IN LANGUAGE LEARNING: One of the most promising directions that I know of for language learning is the one that started with D. Pisoni and R. Yamada (ATR). They create pairs of sounds (R and L for Japanese learners of English) and acoustically “pull them apart” until the student can hear the difference between them. Students for whom this training works can often pronounce a new sound without having had pronunciation training on it. At CMU, J. McClelland in the CNBC is working on this. You can check out: http://www.cnbc.cmu.edu/~jlm/papers/
5. LANGUAGE TECHNOLOGIES FOR LANGUAGE LEARNING CONFERENCE: The most appropriate conference in this area is INSTiL. This conference took place in Venice, Italy in June. Here is where you can find out more: http://project.cgm.unive.it/ You can also look there to join the INSTiL special interest group.
6. AUTHORING NEW TUTORING SYSTEMS SUMMER SCHOOL: The great people who have made some of the most advanced and successful intelligent tutoring systems that exist hold a summer school each year where you can come and use their authoring tools to create your own tutor. You can find it here: http://www.isls.org/icls/summer.html
Awarded an Advanced Technology Program Grant from NIST