Rayleigh quotient
In mathematics, for a given complex Hermitian matrix M and nonzero vector x, the Rayleigh quotient R(M, x) is defined as:

R(M, x) = \frac{x^{*} M x}{x^{*} x}
For real matrices and vectors, the condition of being Hermitian reduces to that of being symmetric, and the conjugate transpose x^{*} to the usual transpose x^{T}. Note that R(M, cx) = R(M, x) for any non-zero real scalar c. Recall that a Hermitian (or real symmetric) matrix has real eigenvalues. It can be shown that, for a given matrix, the Rayleigh quotient reaches its minimum value \lambda_{\min} (the smallest eigenvalue of M) when x is v_{\min} (the corresponding eigenvector). Similarly, R(M, x) \leq \lambda_{\max} and R(M, v_{\max}) = \lambda_{\max}. The Rayleigh quotient is used in the min-max theorem to get exact values of all eigenvalues. It is also used in eigenvalue algorithms to obtain an eigenvalue approximation from an eigenvector approximation. Specifically, this is the basis for Rayleigh quotient iteration.
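As a concrete illustration (a minimal NumPy sketch, not part of the original article; the matrix, the test vector, and the helper name rayleigh_quotient are invented for the example), the quotient can be evaluated directly, it always lies between the extreme eigenvalues, and a few steps of Rayleigh quotient iteration show how eigenvector and eigenvalue estimates refine each other:

```python
import numpy as np

def rayleigh_quotient(M, x):
    """R(M, x) = x* M x / (x* x) for a Hermitian (or real symmetric) M."""
    x = np.asarray(x, dtype=complex)
    return (x.conj().T @ M @ x).real / (x.conj().T @ x).real

# Example symmetric matrix (chosen arbitrarily for illustration).
M = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])

eigvals, eigvecs = np.linalg.eigh(M)
x = np.array([1.0, 1.0, 1.0])

# The quotient of any nonzero x lies between the extreme eigenvalues.
r = rayleigh_quotient(M, x)
assert eigvals[0] - 1e-12 <= r <= eigvals[-1] + 1e-12

# At an eigenvector the quotient equals the corresponding eigenvalue.
print(rayleigh_quotient(M, eigvecs[:, 0]), eigvals[0])

# A few steps of Rayleigh quotient iteration (sketch; in practice one
# stops once the residual ||M v - mu v|| is small).
mu, v = r, x / np.linalg.norm(x)
for _ in range(5):
    v = np.linalg.solve(M - mu * np.eye(3), v)   # inverse iteration step
    v /= np.linalg.norm(v)
    mu = rayleigh_quotient(M, v)                 # refined eigenvalue estimate
print(mu)  # converges rapidly to one of the eigenvalues of M
```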
The range of the Rayleigh quotient is called the numerical range of the matrix.
Special case of covariance matrices

A covariance matrix M can be represented as the product A^{T} A. Its eigenvalues are non-negative:

M v_i = \lambda_i v_i, \qquad \lambda_i = \frac{v_i^{T} M v_i}{v_i^{T} v_i} = \frac{\|A v_i\|^{2}}{\|v_i\|^{2}} \geq 0

The eigenvectors are orthogonal to one another:

\lambda_j\, v_j^{T} v_i = (A^{T} A v_j)^{T} v_i = v_j^{T} (A^{T} A v_i) = \lambda_i\, v_j^{T} v_i \;\Rightarrow\; (\lambda_j - \lambda_i)\, v_j^{T} v_i = 0 \;\Rightarrow\; v_j^{T} v_i = 0

(this holds for distinct eigenvalues; in the case of multiplicity, the basis can be orthogonalized).
The Rayleigh quotient can be expressed as a function of the eigenvalues by decomposing any vector x on the basis of eigenvectors:

x = \sum_{i=1}^{n} \alpha_i v_i, where \alpha_i is the coordinate of x orthogonally projected onto v_i,

so that

R(M, x) = \frac{x^{T} M x}{x^{T} x} = \frac{\left(\sum_{j} \alpha_j v_j\right)^{T} M \left(\sum_{i} \alpha_i v_i\right)}{\left(\sum_{j} \alpha_j v_j\right)^{T} \left(\sum_{i} \alpha_i v_i\right)}

which, by orthogonality of the eigenvectors, becomes:

R(M, x) = \frac{\sum_{i=1}^{n} \alpha_i^{2} \lambda_i}{\sum_{i=1}^{n} \alpha_i^{2}} = \sum_{i=1}^{n} \lambda_i \frac{(x^{T} v_i)^{2}}{(x^{T} x)(v_i^{T} v_i)}

In the last representation we can see that the Rayleigh quotient is the sum of the squared cosines of the angles formed by the vector x and each eigenvector v_i, weighted by the corresponding eigenvalues.
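This identity is easy to check numerically. The following sketch (with an arbitrary matrix of the form A^{T} A and a random vector, all invented for the example) compares R(M, x) with the eigenvalue-weighted sum of squared cosines:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
M = A.T @ A                            # a covariance-like matrix, M = A^T A

eigvals, eigvecs = np.linalg.eigh(M)   # columns of eigvecs are orthonormal v_i
x = rng.standard_normal(3)

r = (x @ M @ x) / (x @ x)

# Squared cosines of the angles between x and the unit eigenvectors v_i.
cos2 = (eigvecs.T @ x) ** 2 / (x @ x)
print(r, np.sum(eigvals * cos2))       # the two values agree
```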
If a vector x maximizes R(M, x), then any non-zero scalar multiple kx (for k \neq 0) also maximizes it, so one can reduce to the Lagrange problem of maximizing \sum_{i=1}^{n} \alpha_i^{2} \lambda_i under the constraint that \sum_{i=1}^{n} \alpha_i^{2} = 1.

Since all the eigenvalues are non-negative, the objective is a convex (indeed linear) function of the squared coordinates \alpha_i^{2}, and the maximum is attained at a vertex of the domain, namely when \alpha_1^{2} = 1 and \alpha_i = 0 for i \neq 1 (when the eigenvalues are ordered in decreasing magnitude).
Alternatively, this result can be arrived at by the method of Lagrange multipliers. The problem is to find the critical points of the function

R(M, x) = x^{T} M x,

subject to the constraint \|x\|^{2} = x^{T} x = 1. That is, one must find the critical points of

\mathcal{L}(x) = x^{T} M x - \lambda \left( x^{T} x - 1 \right),

where \lambda is a Lagrange multiplier. The stationary points of \mathcal{L}(x) occur at

\frac{d\mathcal{L}(x)}{dx} = 2 M x - 2 \lambda x = 0, \qquad \text{i.e.} \quad M x = \lambda x,

and

R(M, x) = \frac{x^{T} M x}{x^{T} x} = \lambda \frac{x^{T} x}{x^{T} x} = \lambda.

Therefore, the eigenvectors x_1, \ldots, x_n of M are the critical points of the Rayleigh quotient and their corresponding eigenvalues \lambda_1, \ldots, \lambda_n are the stationary values of R.
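A quick numerical check of this conclusion (a sketch, not part of the original article; the symmetric matrix is chosen arbitrarily): at each eigenvector the gradient of R vanishes, and the stationary value equals the corresponding eigenvalue.

```python
import numpy as np

M = np.array([[4.0, 1.0],
              [1.0, 3.0]])       # arbitrary symmetric matrix for illustration

def R(x):
    return (x @ M @ x) / (x @ x)

def grad_R(x):
    # Gradient of the Rayleigh quotient: 2 (M x - R(x) x) / (x^T x).
    return 2.0 * (M @ x - R(x) * x) / (x @ x)

eigvals, eigvecs = np.linalg.eigh(M)
for lam, v in zip(eigvals, eigvecs.T):
    print(R(v), lam)                   # stationary value = eigenvalue
    print(np.linalg.norm(grad_R(v)))   # gradient ~ 0 at an eigenvector
```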
This property is the basis for principal components analysis and canonical correlation.
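For instance, the first principal component of a centred data set is the direction maximizing the Rayleigh quotient of the sample covariance matrix. A minimal sketch with synthetic data (the data-generating matrix and random seed are invented for the example):

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic, correlated 2-D data (illustrative only).
X = rng.standard_normal((500, 2)) @ np.array([[3.0, 0.0], [1.5, 0.5]])
X -= X.mean(axis=0)
M = X.T @ X / (len(X) - 1)         # sample covariance matrix

eigvals, eigvecs = np.linalg.eigh(M)
pc1 = eigvecs[:, -1]               # eigenvector with the largest eigenvalue

# The first principal component maximizes R(M, x); random directions do worse.
r_pc1 = pc1 @ M @ pc1              # pc1 has unit norm, so this is R(M, pc1)
for _ in range(5):
    w = rng.standard_normal(2)
    assert (w @ M @ w) / (w @ w) <= r_pc1 + 1e-12
print(r_pc1, eigvals[-1])          # equals the largest eigenvalue
```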
Use in Sturm–Liouville theory

Sturm–Liouville theory concerns the action of the linear operator

L(y) = \frac{1}{w(x)} \left( -\frac{d}{dx}\left[ p(x) \frac{dy}{dx} \right] + q(x) y \right)

on the inner product space defined by

\langle y_1, y_2 \rangle = \int_a^b w(x)\, y_1(x)\, y_2(x)\, dx

of functions satisfying some specified boundary conditions at a and b. In this case the Rayleigh quotient is

\frac{\langle y, L y \rangle}{\langle y, y \rangle} = \frac{\int_a^b y(x) \left( -\frac{d}{dx}\left[ p(x) \frac{dy}{dx} \right] + q(x)\, y(x) \right) dx}{\int_a^b w(x)\, y(x)^{2}\, dx}.

This is sometimes presented in an equivalent form, obtained by separating the integral in the numerator and using integration by parts:

\frac{\langle y, L y \rangle}{\langle y, y \rangle} = \frac{\Big[ -y(x)\, p(x)\, y'(x) \Big]_a^b + \int_a^b p(x)\, y'(x)^{2}\, dx + \int_a^b q(x)\, y(x)^{2}\, dx}{\int_a^b w(x)\, y(x)^{2}\, dx}.
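As an illustration (a sketch; the particular problem and trial function are chosen here and are not taken from the article), consider -y'' = \lambda y on [0, \pi] with y(0) = y(\pi) = 0, so that p = w = 1, q = 0 and the smallest eigenvalue is 1. The Rayleigh quotient of the trial function y = x(\pi - x) gives an upper bound close to it:

```python
import numpy as np

a, b = 0.0, np.pi
x = np.linspace(a, b, 20001)
dx = x[1] - x[0]

def integrate(f):
    # Simple trapezoidal rule on the uniform grid.
    return (f[0] / 2 + f[1:-1].sum() + f[-1] / 2) * dx

# Trial function satisfying y(a) = y(b) = 0, and its derivative.
y = x * (np.pi - x)
dy = np.pi - 2.0 * x

# Rayleigh quotient for -y'' = lambda*y (p = w = 1, q = 0) in the
# integrated-by-parts form: R[y] = int (y')^2 dx / int y^2 dx
# (the boundary term vanishes because y(a) = y(b) = 0).
print(integrate(dy**2) / integrate(y**2))   # ~1.0132, an upper bound on 1
```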
Generalization

For a given pair (A, B) of real symmetric positive-definite matrices and a given non-zero vector x, the generalized Rayleigh quotient is defined as:

R(A, B; x) = \frac{x^{T} A x}{x^{T} B x}.

The generalized Rayleigh quotient can be reduced to the Rayleigh quotient R(D, C^{T} x) through the transformation D = C^{-1} A (C^{T})^{-1}, where C C^{T} is the Cholesky decomposition of the matrix B.
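This reduction is easy to verify numerically. A minimal sketch (the matrices and the helper name spd are invented for the example) uses NumPy's Cholesky factorization, which returns the lower-triangular factor C with B = C C^{T}:

```python
import numpy as np

rng = np.random.default_rng(2)

def spd(n):
    # Random symmetric positive-definite matrix (for illustration only).
    G = rng.standard_normal((n, n))
    return G @ G.T + n * np.eye(n)

A, B = spd(3), spd(3)
x = rng.standard_normal(3)

# Generalized Rayleigh quotient R(A, B; x) = x^T A x / x^T B x.
r_gen = (x @ A @ x) / (x @ B @ x)

# Reduction to an ordinary Rayleigh quotient via B = C C^T (Cholesky).
C = np.linalg.cholesky(B)                       # lower triangular, B = C C^T
D = np.linalg.inv(C) @ A @ np.linalg.inv(C.T)   # D = C^{-1} A C^{-T}
y = C.T @ x
r_ord = (y @ D @ y) / (y @ y)

print(r_gen, r_ord)   # the two quotients agree
```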
Further reading
- Shi Yu, Léon-Charles Tranchevent, Bart Moor, Yves Moreau, Kernel-based Data Fusion for Machine Learning: Methods and Applications in Bioinformatics and Text Mining, Ch. 2, Springer, 2011.