Lip reading
Lip reading, also known as lipreading or speechreading, is a technique of understanding speech by visually interpreting the movements of the lips, face and tongue, together with information provided by the context, language, and any residual hearing.
Process
People with normal vision, hearing and social skills subconsciously use information from the lips and face to aid aural comprehension in everyday conversation, and most fluent speakers of a language are able to speechread to some extent (see McGurk effect). Each speech sound (phoneme) has a particular facial and mouth position (viseme), although many phonemes share the same viseme and thus are impossible to distinguish from visual information alone. To appreciate how difficult lip reading is, and how much of the articulation of ordinary speech is hidden from an observer who cannot see inside the speaker's head, it helps to watch an MRI video of a person speaking. When a person speaks, the tongue moves in at least three places (tip, middle and back), and the soft palate rises and falls. All of these articulatory gestures are phonetically significant, changing the speech sound produced in important ways, but they are invisible to the lip reader.
Consequently, sounds whose place of articulation is deep inside the mouth or throat, such as glottal consonants, are not detectable. Voiced and unvoiced pairs look identical, such as [p] and [b], [k] and [g], [t] and [d], [f] and [v], and [s] and [z] (American English); nasalisation is likewise not visible. It has been estimated that only 30% to 40% of sounds in the English language are distinguishable from sight alone; the phrase "where there's life, there's hope" looks identical to "where's the lavender soap" in most English dialects. Author Henry Kisor titled his book What's That Pig Outdoors?: A Memoir of Deafness in reference to misreading the question, "What's that big loud noise?" He used this example in the book to discuss the shortcomings of speechreading.
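The many-to-one relation between phonemes and visemes can be sketched as a small lookup table. The Python below is an illustration only, not an established viseme inventory: the class labels (such as "P/B") and the toy word list with its simplified, non-IPA spellings are hypothetical, and the only pairings used are the voiced/unvoiced pairs listed above, plus [m] and [n] as assumed nasal counterparts of [b] and [d].

```python
from collections import defaultdict

# Hypothetical viseme labels. Only the pairings themselves come from the text:
# voiced/unvoiced pairs look identical, and nasalisation is not visible.
PHONEME_TO_VISEME = {
    "p": "P/B", "b": "P/B", "m": "P/B",  # [m] assumed to look like [b]
    "t": "T/D", "d": "T/D", "n": "T/D",  # [n] assumed to look like [d]
    "k": "K/G", "g": "K/G",
    "f": "F/V", "v": "F/V",
    "s": "S/Z", "z": "S/Z",
}

# Toy lexicon with simplified phoneme spellings (an assumption for the demo).
LEXICON = {
    "pat": ["p", "a", "t"],
    "bat": ["b", "a", "t"],
    "mad": ["m", "a", "d"],
    "fan": ["f", "a", "n"],
    "van": ["v", "a", "n"],
    "sip": ["s", "i", "p"],
    "zip": ["z", "i", "p"],
    "cat": ["k", "a", "t"],
}


def viseme_sequence(phonemes):
    """Map phonemes to viseme labels; unlisted phonemes stand for themselves."""
    return tuple(PHONEME_TO_VISEME.get(p, p) for p in phonemes)


def look_identical(a, b):
    """True if two phoneme sequences collapse to the same viseme sequence."""
    return viseme_sequence(a) == viseme_sequence(b)


def confusable_groups(lexicon):
    """Group words that cannot be told apart from their visemes alone."""
    groups = defaultdict(list)
    for word, phonemes in lexicon.items():
        groups[viseme_sequence(phonemes)].append(word)
    return [sorted(words) for words in groups.values() if len(words) > 1]


if __name__ == "__main__":
    for group in confusable_groups(LEXICON):
        print(" / ".join(group))
```

In this toy table, "pat", "bat" and "mad" all collapse to the same viseme sequence, which is why a speechreader needs context to tell them apart.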
Thus a speechreader must use cues from the environment and a knowledge of what is likely to be said. It is much easier to speechread customary phrases such as greetings than utterances that appear in isolation and without supporting information, such as the name of a person never met before. Speechreaders who have grown up deaf may never have heard the spoken language and are unlikely to be fluent users of it, which makes speechreading much more difficult. They must also learn the individual visemes by conscious training in an educational setting. In addition, speechreading demands sustained concentration and can be extremely tiring. For these and other reasons, many deaf people prefer to use other means of communication with non-signers, such as mime and gesture, writing, and sign language interpreters. When conversing with a speechreader, exaggerated mouthing of words is not considered helpful and may in fact obscure useful clues. However, it is possible to learn to emphasize useful clues; this is known as lip speaking.
Other difficult scenarios in which to speechread include:
- Lack of a clear view of the speaker's lips. This includes obstructions such as moustaches or hands in front of the mouth; the speaker's head turned aside or away; and a bright light source, such as a window, behind the speaker.
- Group discussions, especially when multiple people are talking in quick succession.
Speechreading may be combined with cued speech; one of the arguments in favor of cued speech is that it helps develop lip reading skills that may be useful even when cues are absent, i.e., when communicating with people who are neither deaf nor hard of hearing.
In The Listening Eye (1953), Dorothy Clegg wrote: "When you are deaf you live inside a well-corked glass bottle. You see the entrancing outside world, but it does not reach you. After learning to lip read, you are still inside the bottle, but the cork has come out and the outside world slowly but surely comes in to you." This view is relatively controversial within the deaf world; see manualism and oralism for an incomplete history of this debate.
See also
- Audio-visual speech recognition
- Motor theory of speech perception
- Mouthing
- Read My Lips (disambiguation)
- Reading (process)
- Silent speech interface
- Ventriloquism
- Visual capture
External links
- CSAIL: Articulatory Feature Based Visual Speech Recognition: a project to develop a visual speech recognition system that models visual speech in terms of the underlying articulatory processes.