Pseudoinverse - AbsoluteAstronomy.com

Mathematics

Mathematics is the study of quantity, space, structure, and change. Mathematicians seek out patterns and formulate new conjectures. Mathematicians resolve the truth or falsity of conjectures by mathematical proofs, which are arguments sufficient to convince other mathematicians of their validity...

, and in particular linear algebra

Linear algebra

Linear algebra is a branch of mathematics that studies vector spaces, also called linear spaces, along with linear functions that input one vector and output another. Such functions are called linear maps and can be represented by matrices if a basis is given. Thus matrix theory is often...

, a pseudoinverse of a matrix

Matrix (mathematics)

In mathematics, a matrix is a rectangular array of numbers, symbols, or expressions. The individual items in a matrix are called its elements or entries. An example of a matrix with six elements isMatrices of the same size can be added or subtracted element by element...

is a generalization

Generalized inverse

In mathematics, a generalized inverse or pseudoinverse of a matrix A is a matrix that has some properties of the inverse matrix of A but not necessarily all of them...

of the inverse matrix. The most widely known type of matrix pseudoinverse is the Moore–Penrose pseudoinverse, which was independently described by E. H. Moore

E. H. Moore

Eliakim Hastings Moore was an American mathematician.-Life:Moore, the son of a Methodist minister and grandson of US Congressman Eliakim H. Moore, discovered mathematics through a summer job at the Cincinnati Observatory while in high school. He learned mathematics at Yale University, where he was...

in 1920, Arne Bjerhammar

Arne Bjerhammar

Arne Bjerhammar was a Swedish geodesist. He was professor at Royal Institute of Technology in Stockholm. He was born in Båstad, Scania in the south of Sweden.His research covered many fields of geodesy...

in 1951 and Roger Penrose

Roger Penrose

Sir Roger Penrose OM FRS is an English mathematical physicist and Emeritus Rouse Ball Professor of Mathematics at the Mathematical Institute, University of Oxford and Emeritus Fellow of Wadham College...

in 1955. Earlier, Fredholm

Erik Ivar Fredholm

Erik Ivar Fredholm was a Swedish mathematician who established the modern theory of integral equations. His 1903 paper in Acta Mathematica is considered to be one of the major landmarks in the establishment of operator theory.The lunar crater Fredholm is named after him.-List of publications:* E.I...

had introduced the concept of a pseudoinverse of integral operators in 1903. When referred to a matrix, the term pseudoinverse, without further specification, is often used to indicate the Moore–Penrose pseudoinverse. The term generalized inverse

Generalized inverse

In mathematics, a generalized inverse or pseudoinverse of a matrix A is a matrix that has some properties of the inverse matrix of A but not necessarily all of them...

is sometimes used as a synonym for pseudoinverse.

A common use of the Moore–Penrose pseudoinverse (hereafter, just pseudoinverse) is to compute a 'best fit' (least squares

Linear least squares

In statistics and mathematics, linear least squares is an approach to fitting a mathematical or statistical model to data in cases where the idealized value provided by the model for any data point is expressed linearly in terms of the unknown parameters of the model...

) solution to a system of linear equations that lacks a unique solution (see below under Applications).
Another use is to find the minimum (Euclidean

Euclidean

Euclidean relates to Euclid , a town or others. It may refer to:Geometry...

) norm solution to a system of linear equations with multiple solutions.

The pseudoinverse is defined and unique for all matrices whose entries are real

Real number

In mathematics, a real number is a value that represents a quantity along a continuum, such as -5 , 4/3 , 8.6 , √2 and π...

or complex

Complex number

A complex number is a number consisting of a real part and an imaginary part. Complex numbers extend the idea of the one-dimensional number line to the two-dimensional complex plane by using the number line for the real part and adding a vertical axis to plot the imaginary part...

numbers. It can be computed using the singular value decomposition

Singular value decomposition

In linear algebra, the singular value decomposition is a factorization of a real or complex matrix, with many useful applications in signal processing and statistics....

Notation

In the following discussion, the following conventions are adopted.

will denote one of the fields of real or complex numbers, denoted , respectively. The ring of matrices over is denoted by .
For , and denote the transpose and Hermitian transpose (also called conjugate transpose
Conjugate transpose
In mathematics, the conjugate transpose, Hermitian transpose, Hermitian conjugate, or adjoint matrix of an m-by-n matrix A with complex entries is the n-by-m matrix A* obtained from A by taking the transpose and then taking the complex conjugate of each entry...

) respectively. If , then .
For , then denotes the range
Range (mathematics)
In mathematics, the range of a function refers to either the codomain or the image of the function, depending upon usage. This ambiguity is illustrated by the function f that maps real numbers to real numbers with f = x^2. Some books say that range of this function is its codomain, the set of all...

(image) of (the space spanned by the column vectors of ) and denotes the kernel (null space) of .
Finally, for any positive integer , denotes the identity matrix
Identity matrix
In linear algebra, the identity matrix or unit matrix of size n is the n×n square matrix with ones on the main diagonal and zeros elsewhere. It is denoted by In, or simply by I if the size is immaterial or can be trivially determined by the context...

.

Definition

For

,
a Moore–Penrose pseudoinverse (hereafter, just pseudoinverse) of

is defined as a matrix

satisfying all of the following four criteria:

( need not be the general identity matrix, but it maps all column vectors of to themselves);
( is a weak inverse for the multiplicative semigroup
Semigroup
In mathematics, a semigroup is an algebraic structure consisting of a set together with an associative binary operation. A semigroup generalizes a monoid in that there might not exist an identity element...

);
( is Hermitian); and
( is also Hermitian).

Properties

Proofs for some of these facts may be found on a separate page here.

Existence and uniqueness

The Moore–Penrose pseudoinverse exists and is unique: for any matrix , there is precisely one matrix , that satisfies the four properties of the definition.

A matrix satisfying the first two conditions of the definition is known as a generalized inverse

Generalized inverse

In mathematics, a generalized inverse or pseudoinverse of a matrix A is a matrix that has some properties of the inverse matrix of A but not necessarily all of them...

. Generalized inverses always exist but are not in general unique. Uniqueness is a consequence of the last two conditions.

Basic properties

If has real entries, then so does .
If is invertible, the pseudoinverse and the inverse coincide: .
The pseudoinverse of a zero matrix is its transpose.
The pseudoinverse of the pseudoinverse is the original matrix: .
Pseudoinversion commutes with transposition, conjugation, and taking the conjugate transpose:

The pseudoinverse of a scalar multiple of is the reciprocal multiple of :

for

Identities

The following identities can be used to cancel certain subexpressions or expand expressions involving pseudoinverses. Proofs for these properties can be found in the proofs subpage.

Reduction to Hermitian case

Products

and either,

has orthonormal columns (i.e. ) or,
has orthonormal rows (i.e. ) or,
has all columns linearly independent (full column rank) and has all rows linearly independent (full row rank),

then

Projectors

and

are
orthogonal projection operators

Projection (linear algebra)

In linear algebra and functional analysis, a projection is a linear transformation P from a vector space to itself such that P2 = P. It leaves its image unchanged....

i.e.

and

i.e. they are Hermitian and idempotent.

Then the following hold:

and
is the orthogonal projector onto the range
Range (mathematics)
In mathematics, the range of a function refers to either the codomain or the image of the function, depending upon usage. This ambiguity is illustrated by the function f that maps real numbers to real numbers with f = x^2. Some books say that range of this function is its codomain, the set of all...

of (which equals the orthogonal complement of the kernel of ).
is the orthogonal projector onto the range
Range (mathematics)
In mathematics, the range of a function refers to either the codomain or the image of the function, depending upon usage. This ambiguity is illustrated by the function f that maps real numbers to real numbers with f = x^2. Some books say that range of this function is its codomain, the set of all...

of (which equals the orthogonal complement of the kernel of ).
is the orthogonal projector onto the kernel of .
is the orthogonal projector onto the kernel of .

Subspaces

Limit relations

The pseudoinverse can also be computed via limiting processes:. These limits exist even if or do not exist.

Continuity
- In contrast to ordinary matrix inversion, the process of taking pseudoinverses is not continuous
  Continuous function
  In mathematics, a continuous function is a function for which, intuitively, "small" changes in the input result in "small" changes in the output. Otherwise, a function is said to be "discontinuous". A continuous function with a continuous inverse function is called "bicontinuous".Continuity of...
  
  : if the sequence converges to the matrix (in the maximum norm or Frobenius norm, say), then need not converge to . However, if all the matrices have the same rank, will converge to .
Scalars
It is also possible to define a pseudoinverse for scalars and vectors. This amounts to treating these as matrices. The pseudoinverse of a scalar is zero if is zero and the reciprocal of otherwise:

Vectors
The pseudoinverse of the null (all zero) vector is the transposed null vector. The pseudoinverse of a non-null vector is the conjugate transposed vector divided by its squared magnitude:

Linearly independent columns
If the columns of are linearly independent
Linear independence
In linear algebra, a family of vectors is linearly independent if none of them can be written as a linear combination of finitely many other vectors in the collection. A family of vectors which is not linearly independent is called linearly dependent...

(so that ), then
is invertible. In this case, an explicit formula is:.
It follows that is then a left inverse of
: .

Linearly independent rows
If the rows of are linearly independent (so that ), then
is invertible. In this case, an explicit formula is:.
It follows that is a right inverse of
: .

Orthonormal columns or rows
This is a special case of either full column rank or full row rank (treated above).
If has orthonormal columns ()
or orthonormal rows (),
then .

Circulant matrices
For a Circulant matrix
Circulant matrix
In linear algebra, a circulant matrix is a special kind of Toeplitz matrix where each row vector is rotated one element to the right relative to the preceding row vector. In numerical analysis, circulant matrices are important because they are diagonalized by a discrete Fourier transform, and hence...

, the singular value decomposition is given by the Fourier transform
Fourier transform
In mathematics, Fourier analysis is a subject area which grew from the study of Fourier series. The subject began with the study of the way general functions may be represented by sums of simpler trigonometric functions...

,
that is the singular values are the Fourier coefficients.
Let be the Discrete Fourier Transform (DFT) matrix
DFT matrix
A DFT matrix is an expression of a discrete Fourier transform as a matrix multiplication.-Definition:An N-point DFT is expressed as an N-by-N matrix multiplication as X = W x, where x is the original input signal, and X is the DFT of the signal.The transformation W of size N\times N can be defined...

, then

Rank decomposition
Let denote the rank of
. Then can be (rank) decomposed
Rank factorization
Given an m \times n matrix A of rank r, a rank decomposition or rank factorization of A is a product A=CF, where C is an m \times r matrix and F is an r \times n matrix....

as
where
and
are of rank .
Then .

The QR method
For or
computing the product or and their inverses explicitly is often a source of numerical rounding errors and computational cost in practice.
An alternative approach using the QR decomposition
QR decomposition
In linear algebra, a QR decomposition of a matrix is a decomposition of a matrix A into a product A=QR of an orthogonal matrix Q and an upper triangular matrix R...

of may be used instead.

Considering the case when is of full column rank, so that
. then the Cholesky decomposition
Cholesky decomposition
In linear algebra, the Cholesky decomposition or Cholesky triangle is a decomposition of a Hermitian, positive-definite matrix into the product of a lower triangular matrix and its conjugate transpose. It was discovered by André-Louis Cholesky for real matrices...

,
where is an upper triangular matrix, may be used.
Multiplication by the inverse is then done easily by solving a system with multiple right-hand-sides,
which may be solved by forward substitution followed by back substitution.

The Cholesky decomposition may be computed without forming explicitly, by alternatively using the QR decomposition
QR decomposition
In linear algebra, a QR decomposition of a matrix is a decomposition of a matrix A into a product A=QR of an orthogonal matrix Q and an upper triangular matrix R...

of ,
where has orthonormal columns, , and
is upper triangular. Then,
so is the Cholesky factor of .

The case of full row rank is treated similarly by using the formula
and using a similar argument, swapping the roles of and
.

Singular value decomposition (SVD)
A computationally simple and accurate way to compute the pseudoinverse is by using the singular value decomposition
Singular value decomposition
In linear algebra, the singular value decomposition is a factorization of a real or complex matrix, with many useful applications in signal processing and statistics....

. If is the singular value decomposition of , then . For a diagonal matrix
Diagonal matrix
In linear algebra, a diagonal matrix is a matrix in which the entries outside the main diagonal are all zero. The diagonal entries themselves may or may not be zero...

such as , we get the pseudoinverse by taking the reciprocal of each non-zero element on the diagonal, leaving the zeros in place, and transposing the resulting matrix. In numerical computation, only elements larger than some small tolerance are taken to be nonzero, and the others are replaced by zeros. For example, in the MATLAB
MATLAB
MATLAB is a numerical computing environment and fourth-generation programming language. Developed by MathWorks, MATLAB allows matrix manipulations, plotting of functions and data, implementation of algorithms, creation of user interfaces, and interfacing with programs written in other languages,...

or NumPy function pinv, the tolerance is taken to be , where ε is the machine epsilon
Machine epsilon
Machine epsilon gives an upper bound on the relative error due to rounding in floating point arithmetic. This value characterizes computer arithmetic in the field of numerical analysis, and by extension in the subject of computational science...

.

The computational cost of this method is dominated by the cost of computing the SVD, which is several times higher than matrix-matrix multiplication, even if a state-of-the art implementation (such as that of LAPACK
LAPACK
-External links:* : a modern replacement for PLAPACK and ScaLAPACK* on Netlib.org* * * : a modern replacement for LAPACK that is MultiGPU ready* on Sourceforge.net* * optimized LAPACK for Solaris OS on SPARC/x86/x64 and Linux* * *...

) is used.

The above procedure shows why taking the pseudoinverse is not a continuous operation: if the original matrix has a singular value 0 (a diagonal entry of the matrix above), then modifying slightly may turn this zero into a tiny positive number, thereby affecting the pseudoinverse dramatically as we now have to take the reciprocal of a tiny number.

Block matrices
Optimized approaches
Block matrix pseudoinverse
In mathematics, block matrix pseudoinverse is a formula of pseudoinverse of a partitioned matrix. This is useful for decomposing or approximating many algorithms updating parameters in signal processing, which are based on least squares method.- Derivation :...

exist for calculating the pseudoinverse of block structured matrices.

The iterative method of Ben-Israel and Cohen
Another method for computing the pseudoinverse uses the recursion
which is sometimes referred to as hyper-power sequence. This recursion produces a sequence converging quadratically to the pseudoinverse of if it is started with an appropriate satisfying . The choice (where , with denoting the largest singular value of ) has been argued not to be competitive to the method using the SVD mentioned above, because even for moderately ill-conditioned matrices it takes a long time before enters the region of quadratic convergence. However, if started with already close to the Moore–Penrose pseudoinverse and , for example , convergence is fast (quadratic).

Updating the pseudoinverse
For the cases where has full row or column rank, and the inverse of the correlation matrix ( for with full row rank or for full column rank) is already known, the pseudoinverse for matrices related to can be computed by applying the Sherman–Morrison–Woodbury formula to update the inverse of the correlation matrix, which may need less work. In particular, if the related matrix differs from the original one by only a changed, added or deleted row or column, incremental algorithms exist that exploit the relationship.

Similarly, it is possible to update the Cholesky factor when a row or column is added, without creating the inverse of the correlation matrix explicitly. However, updating the pseudoinverse in the general rank-deficient case is much more complicated.

Software libraries
The package NumPy provides a pseudo-inverse calculation through its functions matrix.I and linalg.pinv. High quality implementations of SVD, QR, and back substitution are available in standard libraries, such as LAPACK
LAPACK
-External links:* : a modern replacement for PLAPACK and ScaLAPACK* on Netlib.org* * * : a modern replacement for LAPACK that is MultiGPU ready* on Sourceforge.net* * optimized LAPACK for Solaris OS on SPARC/x86/x64 and Linux* * *...

. Writing one's own implementation of SVD is a major programming project that requires a significant numerical expertise. In special circumstances, such as parallel computing
Parallel computing
Parallel computing is a form of computation in which many calculations are carried out simultaneously, operating on the principle that large problems can often be divided into smaller ones, which are then solved concurrently . There are several different forms of parallel computing: bit-level,...

or embedded computing, however, alternative implementations by QR or even the use of an explicit inverse might be preferable, and custom implementations may be unavoidable.

Linear least-squares
The pseudoinverse provides a least squares
Linear least squares
In statistics and mathematics, linear least squares is an approach to fitting a mathematical or statistical model to data in cases where the idealized value provided by the model for any data point is expressed linearly in terms of the unknown parameters of the model...

solution to a system of linear equations.
For , given a system of linear equations,

in general, a vector which solves the system may not exist, or if one exists, it may not be unique. The pseudoinverse solves the "least-squares" problem as follows:
- , we have where and denotes the Euclidean norm. This weak inequality holds with equality if and only if for any vector w; this provides an infinitude of minimizing solutions unless A has full column rank, in which case is a zero matrix.
This result is easily extended to systems with multiple right-hand sides, when the Euclidean norm is replaced by
the Frobenius norm. Let .
- , we have where and denotes the Frobenius norm.
Obtaining all solutions of a linear system
If the linear system

has any solutions, they are all given by

for arbitrary vector w. Solution(s) exist if and only if . If the latter holds, then the solution is unique if and only if A has full column rank, in which case is a zero matrix.

Minimum-norm solution to a linear system
For linear systems with non-unique solutions (such as under-determined systems), the pseudoinverse may be used to construct the solution of minimum Euclidean norm
among all solutions.
- If is satisfiable, the vector is a solution, and satisfies for all solutions.
This result is easily extended to systems with multiple right-hand sides, when the Euclidean norm is replaced by
the Frobenius norm. Let .
- If is satisfiable, the matrix is a solution, and satisfies for all solutions.
Geometric construction
This description suggests the following geometric construction for the result of applying the pseudoinverse of an × matrix to a vector. To find for given in , first project orthogonally onto the range of , finding a point in the range. Then form , i.e. find those vectors in that sends to . This will be an affine subspace of parallel to the kernel of . The element of this subspace that has the smallest length (i.e. is closest to the origin) is the answer we are looking for. It can be found by taking an arbitrary member of and projecting it orthogonally onto the orthogonal complement of the kernel of .

Condition number
Using the pseudoinverse and a matrix norm, one can define a condition number
Condition number
In the field of numerical analysis, the condition number of a function with respect to an argument measures the asymptotically worst case of how much the function can change in proportion to small changes in the argument...

for any matrix:
A large condition number implies that the problem of finding least-squares solutions to the corresponding system of linear equations is ill-conditioned in the sense that small errors in the entries of can lead to huge errors in the entries of the solution.

Generalizations
In order to solve more general least-squares problems, one could try to define Moore–Penrose pseudoinverses for all continuous linear operators between two Hilbert space
Hilbert space
The mathematical concept of a Hilbert space, named after David Hilbert, generalizes the notion of Euclidean space. It extends the methods of vector algebra and calculus from the two-dimensional Euclidean plane and three-dimensional space to spaces with any finite or infinite number of dimensions...

s and , using the same four conditions as in our definition above. It turns out that not every continuous linear operator has a continuous linear pseudo-inverse in this sense. Those that do are precisely the ones whose range is closed
Closed set
In geometry, topology, and related branches of mathematics, a closed set is a set whose complement is an open set. In a topological space, a closed set can be defined as a set which contains all its limit points...

in .

In abstract algebra
Abstract algebra
Abstract algebra is the subject area of mathematics that studies algebraic structures, such as groups, rings, fields, modules, vector spaces, and algebras...

, a Moore–Penrose pseudoinverse may be defined on a *-regular semigroup. This abstract definition coincides with the one in linear algebra.

See also
- Proofs involving the Moore–Penrose pseudoinverse
- Drazin inverse
  Drazin inverse
  In mathematics, the Drazin inverse, named after Michael P. Drazin, is a kind of generalized inverse of a matrix.Let A be a square matrix. The index of A is the least nonnegative integer k such that rank = rank...
- Hat matrix
  Hat matrix
  In statistics, the hat matrix, H, maps the vector of observed values to the vector of fitted values. It describes the influence each observed value has on each fitted value...
- Inverse element
  Inverse element
  In abstract algebra, the idea of an inverse element generalises the concept of a negation, in relation to addition, and a reciprocal, in relation to multiplication. The intuition is of an element that can 'undo' the effect of combination with another given element...
- Linear least squares
  Linear least squares
  In statistics and mathematics, linear least squares is an approach to fitting a mathematical or statistical model to data in cases where the idealized value provided by the model for any data point is expressed linearly in terms of the unknown parameters of the model...
- Pseudo-determinant
  Pseudo-determinant
  In linear algebra and statistics, the pseudo-determinant is the product of all non-zero eigenvalues of a square matrix. It coincides with the regular determinant when the matrix is non-singular.- Definition :...
- Von Neumann regular ring
  Von Neumann regular ring
  In mathematics, a von Neumann regular ring is a ring R such that for every a in R there exists an x in R withOne may think of x as a "weak inverse" of a...
External links
- Pseudoinverse on PlanetMath
- Interactive program & tutorial of Moore-Penrose Pseudoinverse
The source of this article is wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.

Notation

Definition

Properties

Existence and uniqueness

Basic properties

Identities

Reduction to Hermitian case

Products

Projectors

Subspaces

Limit relations

Continuity

Scalars

Vectors

Linearly independent columns

Linearly independent rows

Orthonormal columns or rows

Circulant matrices

Rank decomposition

The QR method

Singular value decomposition (SVD)

Block matrices

The iterative method of Ben-Israel and Cohen

Updating the pseudoinverse

Software libraries

Linear least-squares

Obtaining all solutions of a linear system

Minimum-norm solution to a linear system

Geometric construction

Condition number

Generalizations

See also

External links