Audio-visual speech recognition
Audio-visual speech recognition (AVSR) is a technique that uses image processing capabilities in lip reading to aid speech recognition systems in recognizing non-deterministic phones or in deciding among candidates with nearly equal probabilities.
The lip reading and speech recognition systems each work separately, and their results are then combined at the feature fusion stage.
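
As an illustration of how a visual result can tip a near-tie between acoustic candidates, the following is a minimal sketch and not a description of any particular AVSR system. It assumes each recognizer outputs a probability per candidate phone and combines them with a weighted log-linear rule. The combination rule, the function names `fuse_scores` and `decide`, the `audio_weight` parameter, and the example probabilities are all assumptions made for illustration.

```python
import math


def fuse_scores(audio_probs, visual_probs, audio_weight=0.7):
    """Combine per-phone probabilities from two independent recognizers.

    Assumed rule (not taken from the article): a weighted sum of
    log-probabilities, renormalized so the fused scores sum to 1.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    shared = audio_probs.keys() & visual_probs.keys()
    if not shared:
        raise ValueError("the recognizers share no candidate phones")

    visual_weight = 1.0 - audio_weight
    eps = 1e-12  # guards against log(0)
    log_scores = {
        phone: audio_weight * math.log(audio_probs[phone] + eps)
        + visual_weight * math.log(visual_probs[phone] + eps)
        for phone in shared
    }

    # Softmax-style renormalization, shifted by the maximum for stability.
    top = max(log_scores.values())
    total = sum(math.exp(s - top) for s in log_scores.values())
    return {p: math.exp(s - top) / total for p, s in log_scores.items()}


def decide(audio_probs, visual_probs, audio_weight=0.7):
    """Return the phone with the highest fused score."""
    fused = fuse_scores(audio_probs, visual_probs, audio_weight)
    return max(fused, key=fused.get)


# Hypothetical scores: the audio recognizer is nearly undecided between
# /m/ and /n/, while the lip reader sees closed lips and favours /m/.
audio = {"m": 0.48, "n": 0.52}
visual = {"m": 0.90, "n": 0.10}

audio_only_choice = max(audio, key=audio.get)
fused_choice = decide(audio, visual)
print(audio_only_choice, fused_choice)
```

With these made-up numbers the audio recognizer alone would pick /n/, while the fused decision is /m/. In this sketch, raising `audio_weight` toward 1 makes the result follow the audio recognizer more closely, which would suit clean audio; lowering it gives the visual channel more influence.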