Cepstrum
A cepstrum is the result of taking the Fourier transform (FT) of the logarithm of the spectrum of a signal. There are four variants: the complex cepstrum, the real cepstrum, the power cepstrum, and the phase cepstrum. The power cepstrum in particular finds applications in the analysis of human speech.
The name "cepstrum" was derived by reversing the first four letters of "spectrum". Operations on cepstra are labelled quefrency alanysis (a deliberate anagram of "frequency analysis"), liftering, or cepstral analysis.
Origin and definition
The power cepstrum was defined in a 1963 paper by Bogert et al. It may be defined
- verbally: the power cepstrum (of a signal) is the squared magnitude of the Fourier transform of the logarithm of the squared magnitude of the Fourier transform of a signal
- mathematically: power cepstrum of signal = |FT(log(|FT(the signal)|^2))|^2
- algorithmically: signal → FT → abs → square → log → FT → abs → square → power cepstrum
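The algorithmic chain above can be sketched in Python with NumPy. This is a minimal illustration rather than a reference implementation: the function name `power_cepstrum`, the small constant `eps` that guards against log(0), and the decaying pulse-train test signal are all assumptions introduced here. Following the definition above, the second transform is a forward FFT; because the log power spectrum is real, an implementation that uses an inverse FFT instead would differ only by a constant scale factor.

```python
import numpy as np

def power_cepstrum(x, eps=1e-12):
    """signal -> FT -> abs -> square -> log -> FT -> abs -> square."""
    power_spectrum = np.abs(np.fft.fft(x)) ** 2
    log_power = np.log(power_spectrum + eps)  # eps avoids log(0); an assumption
    return np.abs(np.fft.fft(log_power)) ** 2

# Hypothetical test signal: a decaying pulse train with a 100-sample period.
n = np.arange(4096)
x = np.where(n % 100 == 0, 0.9 ** (n // 100), 0.0)

pc = power_cepstrum(x)
# Search away from quefrency 0, within the first half of the symmetric cepstrum.
peak = 20 + int(np.argmax(pc[20:2048]))
print(peak)  # 100
```

The largest value away from the origin falls at the pulse period of 100 samples, with smaller repeats at its multiples.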
A short-time cepstrum analysis was proposed by Schroeder and Noll for application to pitch determination of human speech.
The complex cepstrum was defined by Oppenheim in his development of homomorphic system theory. It may be defined
- verbally: the complex cepstrum (of a signal) is the Fourier transform of the logarithm (with unwrapped phase) of the Fourier transform (of a signal). Sometimes called the spectrum of a spectrum.
- mathematically: complex cepstrum of signal = FT(log(FT(the signal)) + j2πm) (where m is the integer required to properly unwrap the angle or imaginary part of the complex log function)
- algorithmically: signal → FT → complex log (log of the magnitude plus j times the unwrapped phase) → FT → cepstrum
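A simplified sketch of the complex cepstrum in Python with NumPy follows. It keeps both the log-magnitude and the unwrapped phase of the spectrum, and it uses an inverse FFT for the final transform, which is the convention many texts adopt. The function name `complex_cepstrum` and the test signal are assumptions made for illustration. The sketch does not remove any linear-phase term and will fail for signals whose spectrum has zeros, so it should not be read as a complete implementation.

```python
import numpy as np

def complex_cepstrum(x):
    """Inverse FFT of log|X| + j * (unwrapped phase of X); simplified sketch."""
    spectrum = np.fft.fft(x)
    log_mag = np.log(np.abs(spectrum))
    phase = np.unwrap(np.angle(spectrum))  # removes the 2*pi*m jumps
    # For a real input the log spectrum is conjugate-symmetric, so the
    # result is real up to rounding error.
    return np.fft.ifft(log_mag + 1j * phase).real

# Hypothetical minimum-phase test signal: x[n] = delta[n] + 0.5 * delta[n - 1].
x = np.zeros(64)
x[0], x[1] = 1.0, 0.5

c = complex_cepstrum(x)
print(np.round(c[1:4], 4))
```

For this signal the result can be checked by hand, since log(1 + 0.5 z^-1) expands to the series 0.5 z^-1 - 0.125 z^-2 + 0.0417 z^-3 and so on, and the first cepstral coefficients should match those values closely.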
The real cepstrum uses the logarithm function defined for real values. The real cepstrum is related to the power cepstrum via the relationship 4 * (real cepstrum)^2 = power cepstrum, because log(|X|^2) = 2 log|X|, and is related to the complex cepstrum as real cepstrum = 0.5*(complex cepstrum + time reversal of complex cepstrum).
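The relation between the real and the complex cepstrum stated above can be checked numerically. The sketch below is only an illustration under stated assumptions: both cepstra are computed with an inverse FFT, the helper names are hypothetical, and the test signal is a simple minimum-phase sequence for which phase unwrapping is trivial.

```python
import numpy as np

def complex_cepstrum(x):
    spectrum = np.fft.fft(x)
    log_spectrum = np.log(np.abs(spectrum)) + 1j * np.unwrap(np.angle(spectrum))
    return np.fft.ifft(log_spectrum).real

def real_cepstrum(x):
    return np.fft.ifft(np.log(np.abs(np.fft.fft(x)))).real

# Hypothetical test signal: x[n] = delta[n] + 0.5 * delta[n - 1].
x = np.zeros(64)
x[0], x[1] = 1.0, 0.5

cc = complex_cepstrum(x)
# Circular time reversal: index n maps to (-n) mod N.
cc_reversed = np.roll(cc[::-1], 1)
rc = real_cepstrum(x)

relation_holds = np.allclose(rc, 0.5 * (cc + cc_reversed))
print(relation_holds)  # True
```

In other words, the real cepstrum is the even part of the complex cepstrum; the odd part carries the phase information.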
The complex cepstrum uses the complex logarithm function defined for complex values.
The phase cepstrum is related to the complex cepstrum as phase cepstrum = (complex cepstrum - time reversal of complex cepstrum)^2, with the square taken element by element.
The complex cepstrum holds information about magnitude and phase of the initial spectrum, allowing the reconstruction of the signal. The real cepstrum uses only the information of the magnitude of the spectrum.
Many texts define the process as FT → abs → log → IFT, i.e., that the cepstrum is the "inverse Fourier transform of the log-magnitude Fourier spectrum".
The kepstrum (which stands for "Kolmogorov equation power series time response") is similar to the cepstrum and has the same relation to it as statistical average has to expected value, i.e. cepstrum is the empirically measured quantity while kepstrum is the theoretical quantity.
Applications
The cepstrum can be seen as information about the rate of change in the different spectrum bands. It was originally invented for characterizing the seismic echoes resulting from earthquakes and bomb explosions. It has also been used to determine the fundamental frequency of human speech and to analyze radar signal returns. Cepstrum pitch determination is particularly effective because the effects of the vocal excitation (pitch) and vocal tract (formants) are additive in the logarithm of the power spectrum and thus clearly separate.
The autocepstrum is defined as the cepstrum of the autocorrelation. The autocepstrum is more accurate than the cepstrum in the analysis of data with echoes.
The cepstrum is a representation used in homomorphic signal processing, to convert signals (such as a source and filter) combined by convolution into sums of their cepstra, for linear separation. In particular, the power cepstrum is often used as a feature vector for representing the human voice and musical signals. For these applications, the spectrum is usually first transformed using the mel scale. The result is called the mel-frequency cepstrum or MFC (its coefficients are called mel-frequency cepstral coefficients, or MFCCs). It is used for voice identification, pitch detection and much more. The cepstrum is useful in these applications because the low-frequency periodic excitation from the vocal cords and the formant filtering of the vocal tract, which convolve in the time domain and multiply in the frequency domain, are additive and in different regions in the quefrency domain.
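As a rough illustration of the mel-frequency cepstrum described above, the sketch below computes MFCCs for a single frame using only NumPy. Several choices here are assumptions that the text does not specify: the mel formula 2595 * log10(1 + f/700) is one common variant, the filters are triangular, a Hann window is applied, and the final transform is a DCT-II, which is the conventional choice for MFCCs. All function names and parameter defaults are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, fs):
    """Triangular filters with centres equally spaced on the mel scale."""
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(fs / 2.0), n_filters + 2))
    lo, ce, hi = edges[:-2, None], edges[1:-1, None], edges[2:, None]
    rising = (freqs - lo) / (ce - lo)
    falling = (hi - freqs) / (hi - ce)
    return np.maximum(0.0, np.minimum(rising, falling))

def mfcc(frame, fs, n_filters=26, n_coeffs=13, eps=1e-10):
    n_fft = len(frame)
    power = np.abs(np.fft.rfft(frame * np.hanning(n_fft))) ** 2
    log_mel = np.log(mel_filterbank(n_filters, n_fft, fs) @ power + eps)
    # DCT-II written out explicitly so that only NumPy is needed.
    k = np.arange(n_coeffs)[:, None]
    m = np.arange(n_filters)[None, :]
    dct_basis = np.cos(np.pi * k * (m + 0.5) / n_filters)
    return dct_basis @ log_mel

# Hypothetical test frame: two harmonically related sines at 16 kHz.
fs = 16000
t = np.arange(512) / fs
frame = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 400 * t)

coeffs = mfcc(frame, fs)
print(coeffs.shape)  # (13,)
```

Production systems typically add pre-emphasis, overlapping frames and coefficient normalisation, none of which are shown here.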
Cepstral concepts
The independent variable of a cepstral graph is called the quefrency. The quefrency is a measure of time, though not in the sense of a signal in the time domain. For example, if the sampling rate of an audio signal is 44100 Hz and there is a large peak in the cepstrum whose quefrency is 100 samples, the peak indicates the presence of a pitch that is 44100/100 = 441 Hz. This peak occurs in the cepstrum because the harmonics in the spectrum are periodic, and the period corresponds to the pitch. Note that a pure sine wave should not be used to test the cepstrum for its pitch determination from quefrency, as a pure sine wave does not contain any harmonics. Rather, a test signal containing harmonics should be used (such as the sum of at least two sines where the second sine is some harmonic (multiple) of the first sine).
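The worked example above (44100 Hz sampling rate, peak at 100 samples) can be reproduced with a short Python sketch. The helper names, the search range of 50 Hz to 2000 Hz, and the harmonic-rich test signal (a decaying pulse train with a 100-sample period) are assumptions made for illustration; the real cepstrum is computed with an inverse FFT.

```python
import numpy as np

def real_cepstrum(x, eps=1e-12):
    log_mag = np.log(np.abs(np.fft.fft(x)) + eps)  # eps avoids log(0)
    return np.fft.ifft(log_mag).real

def estimate_pitch(x, fs, fmin=50.0, fmax=2000.0):
    """Pick the largest cepstral peak between the quefrencies of fmax and fmin."""
    c = real_cepstrum(x)
    q_min = int(fs / fmax)  # shortest period considered, in samples
    q_max = int(fs / fmin)  # longest period considered; must stay below len(x) / 2
    q_peak = q_min + int(np.argmax(c[q_min:q_max]))
    return fs / q_peak

fs = 44100
n = np.arange(4096)
# Pulse train with period 100 samples, rich in harmonics of 441 Hz.
x = np.where(n % 100 == 0, 0.9 ** (n // 100), 0.0)

pitch = estimate_pitch(x, fs)
print(pitch)  # 441.0
```

Restricting the search range matters: the region near quefrency zero is dominated by the overall spectral shape, and multiples of the true period also produce smaller peaks.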
Liftering
Playing further on the anagram theme, a filter that operates on a cepstrum might be called a lifter. A low pass lifter is similar to a low pass filter in the frequency domain. It can be implemented by multiplying by a window in the cepstral domain; converting the result back to the time domain then yields a smoother signal.
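A low pass lifter can be sketched as follows. This is an illustrative example under stated assumptions: the window is a plain rectangle, the function name `lowpass_lifter` and its `cutoff` parameter are hypothetical, and the sketch stops at the smoothed log-magnitude spectrum rather than resynthesising a waveform.

```python
import numpy as np

def lowpass_lifter(x, cutoff, eps=1e-12):
    """Return (log-magnitude spectrum, liftered log-magnitude spectrum)."""
    log_mag = np.log(np.abs(np.fft.fft(x)) + eps)
    cepstrum = np.fft.ifft(log_mag).real
    # Rectangular window keeping quefrencies below `cutoff` on the positive
    # side and on the mirrored negative side of the cepstrum.
    window = np.zeros(len(x))
    window[:cutoff] = 1.0
    if cutoff > 1:
        window[-(cutoff - 1):] = 1.0
    smoothed = np.fft.fft(cepstrum * window).real
    return log_mag, smoothed

# Hypothetical test signal: decaying pulse train with a 100-sample period.
n = np.arange(4096)
x = np.where(n % 100 == 0, 0.9 ** (n // 100), 0.0)

log_mag, smoothed = lowpass_lifter(x, cutoff=50)
print(np.std(log_mag), np.std(smoothed))
```

With the cutoff set below the 100-sample period, the harmonic ripple in the log spectrum is removed and only the slowly varying envelope remains, so the second standard deviation is much smaller than the first.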
Convolution
A very important property of the cepstral domain is that the convolution of two signals can be expressed as the addition of their cepstra:
x1 * x2 → x1' + x2'
where x' denotes the cepstrum of x.
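The additive property can be verified numerically. The sketch below uses the real cepstrum computed with an inverse FFT, zero-pads every transform to a common length so that the linear convolution is represented exactly, and uses two hypothetical test signals; these choices are assumptions made for the illustration.

```python
import numpy as np

def real_cepstrum(x, n_fft):
    # np.fft.fft zero-pads x to n_fft samples.
    return np.fft.ifft(np.log(np.abs(np.fft.fft(x, n_fft)))).real

n = np.arange(64)
x1 = 0.9 ** n               # decaying exponential
x2 = np.zeros(64)
x2[0], x2[10] = 1.0, 0.5    # direct path plus one echo
y = np.convolve(x1, x2)     # linear convolution, length 127

n_fft = 128                 # at least len(y), so no circular wrap-around
lhs = real_cepstrum(y, n_fft)
rhs = real_cepstrum(x1, n_fft) + real_cepstrum(x2, n_fft)

additive = np.allclose(lhs, rhs)
print(additive)  # True
```

The property follows from the chain convolution → multiplication of spectra → addition of log spectra, and the final transform is linear.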