SpeakEasy: Browser-based automated multi-language intonation training. (A)

Learning to speak a second language involves a cycle of observation, mimicry, and feedback. A student observes a teacher, attempts to copy the teacher's performance, and then receives feedback from the teacher on that attempt. When students' access to feedback is limited, so is their ability to reinforce their learning. In this work, a web-based application, Speakeasy tools, is introduced to provide remote students with automated visual intonation feedback for multiple languages. For this study, participants are selected from the pool of Speakeasy users, and their interactions with the application are observed over a set period of time. The application presents participants with native speaker examples generated via text-to-speech, in the form of audio samples, fundamental frequency visualizations, and grapheme- and phoneme-level timelines. Participants can record and review an unlimited number of practice attempts, which are processed using the same pipeline as the native speaker examples. Each practice attempt is assigned a score based on the Root-Mean-Square Error (RMSE) between the native and participant fundamental frequencies over time. Practice-attempt scores, tracked against time spent using the application, provide a metric for measuring a participant's progress. [Work supported by RPI Seed Grant and CISL.]
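As an illustration of the scoring step only, the RMSE between two fundamental frequency tracks might be computed as in the sketch below. This is not the application's implementation. The function name `f0_rmse`, the use of Hz values, the assumption that both tracks are already time-aligned and of equal length, and the choice to skip frames that are unvoiced (`None`) in either track are all assumptions made here; the abstract does not state how alignment or unvoiced frames are handled.

```python
import math


def f0_rmse(native_hz, attempt_hz):
    """Return the RMSE (in Hz) between two time-aligned F0 tracks.

    Hypothetical helper: assumes one F0 value per frame, equal-length
    tracks, and None for unvoiced frames. Frames that are unvoiced in
    either track are skipped (an assumption, not taken from the abstract).
    """
    if len(native_hz) != len(attempt_hz):
        raise ValueError("F0 tracks must have the same number of frames")

    # Squared differences over frames that are voiced in both tracks.
    squared_errors = [
        (n - a) ** 2
        for n, a in zip(native_hz, attempt_hz)
        if n is not None and a is not None
    ]
    if not squared_errors:
        raise ValueError("no frames are voiced in both tracks")

    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    native = [100.0, 110.0, None, 120.0]
    attempt = [105.0, 105.0, 118.0, 125.0]
    print(f0_rmse(native, attempt))
```

A lower value indicates an attempt whose pitch track is closer to the native example. How the error is mapped to the score shown to participants is not specified in the abstract.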


"SpeakEasy: Browser-based automated multi-language intonation training. (A),"

The Journal of the Acoustical Society of America 148, no. 4 (2020): 2697.