Multimodal Error Recovery Demo Snapshots
We have built a prototype automatic dictation system with multimodal error
recovery capabilities. The underlying recognizers are JANUS, a continuous speech
recognizer for the Wall Street Journal task, and NPen++ and NSpell, isolated-word
handwriting and connected-letter recognizers with 20K vocabularies,
respectively.
In a typical interaction, the user dictates a sentence, using a push-to-talk
button to activate recording. Upon completion of decoding, he corrects errors
one at a time by highlighting each error and then repeating the misrecognized word
using a modality of his choice (continuous speech, handwriting, or spelling).
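As a rough illustration of this one-error-at-a-time interaction, the following minimal sketch models the recognized sentence as a list of words and a repair as the replacement of a highlighted span. It is not the system's actual implementation: the names `Modality`, `Repair`, and `apply_repair` are hypothetical, the recognizers are not modeled (replacement words are supplied directly), and the example sentence is invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Modality(Enum):
    """Repair modalities mentioned on this page (names are hypothetical)."""
    SPEECH = "continuous speech"
    HANDWRITING = "handwriting"
    SPELLING = "spelling"
    GESTURE = "pen gesture"  # used for deletion in the snapshots


@dataclass
class Repair:
    start: int               # index of the first highlighted word
    end: int                 # index one past the last highlighted word
    modality: Modality
    replacement: List[str]   # repair result; empty list means deletion


def apply_repair(words: List[str], repair: Repair) -> List[str]:
    """Return a new word list with the highlighted span replaced."""
    if not (0 <= repair.start <= repair.end <= len(words)):
        raise ValueError("highlighted span is outside the sentence")
    return words[:repair.start] + repair.replacement + words[repair.end:]


# Invented hypothesis with one inserted word and one substituted word.
hypothesis = ["please", "right", "the", "the", "report"]
repairs = [
    Repair(2, 3, Modality.GESTURE, []),             # cross out the extra "the"
    Repair(1, 2, Modality.HANDWRITING, ["write"]),  # rewrite "right"
]
for repair in repairs:
    hypothesis = apply_repair(hypothesis, repair)
print(" ".join(hypothesis))
```

Treating deletion as a repair with an empty replacement keeps a single code path for all modalities; a real system would presumably also keep the original span around so a failed repair can be undone.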
The following series of screen snapshots demonstrates deletion repair by gesture,
and substitution repair by handwriting.
Screen after the user pressed the SPEAK button and said "Republicans send a balanced budget plan to the senate floor". Three errors occurred in continuous speech
decoding (insertion of "the", substitution of "send" by "and", and deletion of "a"):
The user deletes "the" by gesture (crossing out using a pen on the touch sensitive display):
After completion of deletion repair:
The user highlights the error "and", automatically bringing up N-best choices:
Since the N-best list does not contain the correct word, the user repairs by handwriting:
After completion of handwriting repair:
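The three error types named in the walkthrough (insertion, substitution, deletion) can be recovered by aligning the spoken sentence with the recognizer output. The sketch below uses a standard weighted edit-distance alignment; it is an illustration, not the method used by the demo system. The exact recognizer output is not given in the text, so the hypothesis string is an assumption chosen to be consistent with the three listed errors (in particular, the position of the inserted "the" is a guess). The function name `align` and the cost weights are also assumptions.

```python
from typing import List, Tuple

# Assumed weights: substitutions cost slightly more than insertions/deletions.
INS, DEL, SUB = 3, 3, 4


def align(ref: List[str], hyp: List[str]) -> List[Tuple[str, str, str]]:
    """Return (error_type, reference_word, hypothesis_word) for each error."""
    n, m = len(ref), len(hyp)
    # cost[i][j] = minimal cost of aligning ref[:i] with hyp[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * DEL
    for j in range(1, m + 1):
        cost[0][j] = j * INS
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (0 if ref[i - 1] == hyp[j - 1] else SUB)
            cost[i][j] = min(diag, cost[i - 1][j] + DEL, cost[i][j - 1] + INS)

    # Backtrace from the end, preferring matches, then deletions,
    # then substitutions, then insertions when several paths tie.
    errors = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and ref[i - 1] == hyp[j - 1]
                and cost[i][j] == cost[i - 1][j - 1]):
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + DEL:
            errors.append(("deletion", ref[i - 1], ""))
            i -= 1
        elif i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + SUB:
            errors.append(("substitution", ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        else:
            errors.append(("insertion", "", hyp[j - 1]))
            j -= 1
    errors.reverse()
    return errors


spoken = "republicans send a balanced budget plan to the senate floor".split()
# ASSUMED recognizer output; the page does not state it verbatim.
decoded = "the republicans and balanced budget plan to the senate floor".split()

errors = align(spoken, decoded)
for kind, ref_word, hyp_word in errors:
    print(kind, repr(ref_word), repr(hyp_word))
```

Note that alignments are not always unique: here, deleting "send" and substituting "a" by "and" has the same cost as the reading given in the walkthrough, and the tie-breaking order in the backtrace is what selects one of them.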
(Patent Pending)
Maintained by:
bsuhm@cs.cmu.edu
last modified: 03 March 1997