Computational auditory scene analysis
Computational auditory scene analysis (CASA) is the study of auditory scene analysis by computational means. In essence, CASA systems are "machine listening" systems that aim to separate mixtures of sound sources in the same way that human listeners do. CASA differs from the field of blind signal separation in that it is (at least to some extent) based on the mechanisms of the human auditory system, and thus uses no more than two microphone recordings of an acoustic environment. It is related to the cocktail party problem.
Applications
Potentially, CASA technology can be applied in the following areas:
- Robust automatic speech recognition and speaker recognition in noisy environments
- Automatic transcription of musical audio recordings
- Hearing aids
See also
- auditory scene analysis
- machine vision
- blind signal separation
- cocktail party problem