ABSTRACT
Speech Dasher allows writing using a combination of speech and a zooming interface. Users first speak what they want to write and then they navigate through the space of recognition hypotheses to correct any errors. Speech Dasher's model combines information from a speech recognizer, from the user, and from a letter-based language model. This allows fast writing of anything predicted by the recognizer while also providing seamless fallback to letter-by-letter spelling for words not in the recognizer's predictions. In a formative user study, expert users wrote at 40 (corrected) words per minute. They did this despite a recognition word error rate of 22%. Furthermore, they did this using only speech and the direction of their gaze (obtained via an eye tracker).
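As a rough illustration of how recognizer predictions and a letter-based fallback might be combined, the sketch below mixes a next-letter distribution derived from recognition hypotheses with a letter model. It is not the authors' model: the function names, the `fallback_weight` parameter, the flat list of scored hypotheses, and the uniform letter model are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical symbol set: lowercase letters, space, and apostrophe.
ALPHABET = "abcdefghijklmnopqrstuvwxyz '"


def hypothesis_letter_probs(hypotheses, prefix):
    """Next-letter distribution from the hypotheses that extend `prefix`.

    `hypotheses` is a list of (text, probability) pairs (an assumed format).
    Returns an empty dict when no hypothesis extends the prefix.
    """
    mass = defaultdict(float)
    for text, prob in hypotheses:
        if text.startswith(prefix) and len(text) > len(prefix):
            mass[text[len(prefix)]] += prob
    total = sum(mass.values())
    if total == 0:
        return {}
    return {letter: m / total for letter, m in mass.items()}


def uniform_letter_model(prefix):
    """Stand-in for a real letter-based language model (assumption)."""
    return {letter: 1.0 / len(ALPHABET) for letter in ALPHABET}


def next_letter_probs(hypotheses, prefix,
                      letter_model=uniform_letter_model,
                      fallback_weight=0.1):
    """Interpolate recognizer-derived and letter-model probabilities.

    When the recognizer has nothing to say about `prefix`, the letter
    model is used on its own, so any text can still be spelled.
    """
    speech = hypothesis_letter_probs(hypotheses, prefix)
    letters = letter_model(prefix)
    if not speech:
        return dict(letters)
    symbols = set(letters) | set(speech)
    return {
        s: (1 - fallback_weight) * speech.get(s, 0.0)
           + fallback_weight * letters.get(s, 0.0)
        for s in symbols
    }


hypotheses = [("the cat", 0.6), ("the hat", 0.3), ("a cat", 0.1)]
on_path = next_letter_probs(hypotheses, "the ")
off_path = next_letter_probs(hypotheses, "dog")
```

Here `on_path` gives most of the probability to the letters the recognizer predicted while leaving a small share for every other letter, and `off_path` falls back entirely to the letter model. In a Dasher-style interface, each letter's box is sized in proportion to its probability, so predicted text is quick to select and everything else remains reachable.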