Cauchy-Binet formula
Encyclopedia
In linear algebra
, the Cauchy–Binet formula, named after Augustin-Louis Cauchy and Jacques Philippe Marie Binet
, is an identity for the determinant
of the product
of two rectangular matrices
of transpose shapes (so that the product is well defined and square). It generalizes the statement that the determinant of a product of square matrices is equal to the product of their determinants. The formula is valid for matrices with entries from any commutative ring
.
s of [n] (i.e., subsets of size m; there are
of them). For , write A[m],S for the m×m matrix whose columns are the columns of A at indices from S, and BS,[m] for the m×m matrix whose rows are the rows of B at indices from S. The Cauchy–Binet formula then states
Example: taking m = 2 and n = 3, and matrices
and , the Cauchy–Binet formula gives the determinant:
Indeed , and its determinant is −28, which is also the value given by the right hand side of the formula
); indeed in this case the rank
of the m×m matrix AB is at most n, which implies that its determinant is zero. If n = m, the case where A and B are square matrices, (a singleton set), so the sum only involves S = [n], and the formula states that det(AB) = det(A)det(B).
For m = 0, A and B are empty matrices (but of different shapes if n > 0), as is their product AB; the summation involves a single term S = Ø, and the formula states 1 = 1, with both sides given by the determinant of the 0×0 matrix. For m = 1, the summation ranges over the collection of the n different singletons taken from [n], and both sides of the formula give , the dot product
of the pair of vector
s represented by the matrices. The smallest value of m for which the formula states a non-trivial equality is m = 2; it is discussed in the article on the Binet–Cauchy identity.
The formula can be proved in two steps:
For step 1, observe that for each row of A or column of B, and for each m-combination S, the values of det(AB) and det(A[m],S)det(BS,[m]) indeed depend linearly on the row or column. For the latter this is immediate from the multilinear property of the determinant; for the former one must in addition check that taking a linear combination for the row of A or column of B while leaving the rest unchanged only affects the corresponding row or column of the product AB, and by the same linear combination. Thus one can work out both sides of the Cauchy−Binet formula by linearity for every row of A and then also every column of B, writing each of the rows and columns as a linear combination of standard basis vectors. The resulting multiple summations are huge, but they have the same form for both sides: corresponding terms involve the same scalar factor (each is a product of entries of A and of B), and these terms only differ by involving two different expressions in terms of constant matrices of the kind described above, which expressions should be equal according to the Cauchy−Binet formula. This achieves the reduction of the first step.
Concretely, the multiple summations can be grouped into two summations, one over all functions f:[m] → [n] that for each row index of A gives a corresponding column index, and one over all functions g:[m] → [n] that for each column index of B gives a corresponding row index. The matrices associated to f and g are
where "" is the Kronecker delta, and the Cauchy−Binet formula to prove has been rewritten as
where p(f,g) denotes the scalar factor . It remains to prove the Cauchy−Binet formula for A = Lf and B = Rg, for all f,g:[m] → [n].
For this step 2, if f fails to be injective then Lf and LfRg both have two identical rows, and if g fails to be injective then Rg and LfRg both have two identical columns; in either case both sides of the identity are zero. Supposing now that both f and g are injective maps [m] → [n], the factor on the right is zero unless S = f([m]), while the factor is zero unless S = g([m]). So
if the images of f and g are different, the right hand side has only null terms, and the left hand side is zero as well since LfRg has a null row (for i with ). In the remaining case where the images of f and g are the same, say f([m]) = S = g([m]), we need to prove that
Let h be the unique increasing bijection [m] → S, and π,σ the permutations of [m] such that and ; then is the permutation matrix
for π, is the permutation matrix for σ, and LfRg is the permutation matrix for , and since the determinant of a permutation matrix equals the signature of the permutation, the identity follows from the fact that signatures are multiplicative.
Using multi-linearity with respect to both the rows of A and the columns of B in the proof is not necessary; one could use just one of them, say the former, and use that a matrix product LfB either consists of a permutation of the rows of Bf([m]),[m] (if f is injective), or has at least two equal rows.
spanned in Rn by the m rows of A. Binet's formula states that this is equal to the sum of the squares of the volumes that arise if the parallelepiped is orthogonally projected onto the m-dimensional coordinate planes (of which there are ).
In the case m = 1 the parallelepiped is reduced to a single vector and its volume its length. The above statement then states that the square of the length of a vector is the sum of the squares of its coordinates; this is indeed the case by the definition
of that length, which is based on the Pythagorean theorem
.
of the product of two matrices. That formula is given in the article on minors
.
Linear algebra
Linear algebra is a branch of mathematics that studies vector spaces, also called linear spaces, along with linear functions that input one vector and output another. Such functions are called linear maps and can be represented by matrices if a basis is given. Thus matrix theory is often...
, the Cauchy–Binet formula, named after Augustin-Louis Cauchy and Jacques Philippe Marie Binet
Jacques Philippe Marie Binet
Jacques Philippe Marie Binet was a French mathematician, physicist and astronomer born in Rennes; he died in Paris, France, in 1856. He made significant contributions to number theory, and the mathematical foundations of matrix algebra which would later lead to important contributions by Cayley...
, is an identity for the determinant
Determinant
In linear algebra, the determinant is a value associated with a square matrix. It can be computed from the entries of the matrix by a specific arithmetic expression, while other ways to determine its value exist as well...
of the product
Matrix multiplication
In mathematics, matrix multiplication is a binary operation that takes a pair of matrices, and produces another matrix. If A is an n-by-m matrix and B is an m-by-p matrix, the result AB of their multiplication is an n-by-p matrix defined only if the number of columns m of the left matrix A is the...
of two rectangular matrices
Matrix (mathematics)
In mathematics, a matrix is a rectangular array of numbers, symbols, or expressions. The individual items in a matrix are called its elements or entries. An example of a matrix with six elements isMatrices of the same size can be added or subtracted element by element...
of transpose shapes (so that the product is well defined and square). It generalizes the statement that the determinant of a product of square matrices is equal to the product of their determinants. The formula is valid for matrices with entries from any commutative ring
Commutative ring
In ring theory, a branch of abstract algebra, a commutative ring is a ring in which the multiplication operation is commutative. The study of commutative rings is called commutative algebra....
.
Statement
Let A be an m×n matrix and B an n×m matrix. Write [n] for the set { 1, ..., n }, and for the set of m-combinationCombination
In mathematics a combination is a way of selecting several things out of a larger group, where order does not matter. In smaller cases it is possible to count the number of combinations...
s of [n] (i.e., subsets of size m; there are
Binomial coefficient
In mathematics, binomial coefficients are a family of positive integers that occur as coefficients in the binomial theorem. They are indexed by two nonnegative integers; the binomial coefficient indexed by n and k is usually written \tbinom nk , and it is the coefficient of the x k term in...
of them). For , write A[m],S for the m×m matrix whose columns are the columns of A at indices from S, and BS,[m] for the m×m matrix whose rows are the rows of B at indices from S. The Cauchy–Binet formula then states
Example: taking m = 2 and n = 3, and matrices
and , the Cauchy–Binet formula gives the determinant:
Indeed , and its determinant is −28, which is also the value given by the right hand side of the formula
Special cases
If n < m then is the empty set, and the formula says that det(AB) = 0 (its right hand side is an empty sumEmpty sum
In mathematics, an empty sum, or nullary sum, is a summation involving no terms at all. The value of any empty sum of numbers is conventionally taken to be zero...
); indeed in this case the rank
Rank (linear algebra)
The column rank of a matrix A is the maximum number of linearly independent column vectors of A. The row rank of a matrix A is the maximum number of linearly independent row vectors of A...
of the m×m matrix AB is at most n, which implies that its determinant is zero. If n = m, the case where A and B are square matrices, (a singleton set), so the sum only involves S = [n], and the formula states that det(AB) = det(A)det(B).
For m = 0, A and B are empty matrices (but of different shapes if n > 0), as is their product AB; the summation involves a single term S = Ø, and the formula states 1 = 1, with both sides given by the determinant of the 0×0 matrix. For m = 1, the summation ranges over the collection of the n different singletons taken from [n], and both sides of the formula give , the dot product
Dot product
In mathematics, the dot product or scalar product is an algebraic operation that takes two equal-length sequences of numbers and returns a single number obtained by multiplying corresponding entries and then summing those products...
of the pair of vector
Tuple
In mathematics and computer science, a tuple is an ordered list of elements. In set theory, an n-tuple is a sequence of n elements, where n is a positive integer. There is also one 0-tuple, an empty sequence. An n-tuple is defined inductively using the construction of an ordered pair...
s represented by the matrices. The smallest value of m for which the formula states a non-trivial equality is m = 2; it is discussed in the article on the Binet–Cauchy identity.
Proof
There are various kinds of proofs that can be given for the Cauchy−Binet formula. The proof below is based on formal manipulations only, and avoids using any particular interpretation of determinants, which may be taken to be defined by the Leibniz formula. Only their multilinearity with respect to rows and columns, and their alternating property (vanishing in the presence of equal rows or columns) are used; in particular the multiplicative property of determinants for square matrices is not used, but is rather established (the case n = m). The proof is valid for arbitrary commutative coefficient rings.The formula can be proved in two steps:
- use the fact that both sides are multilinear (more precisely 2m-linear) in the rows of A and the columns of B, to reduce to the case that each row of A and each column of B has only one non-zero entry, which is 1.
- handle that case using the functions [m]→[n] that map respectively the row numbers of A to the column number of their nonzero entry, and the column numbers of B to the row number of their nonzero entry.
For step 1, observe that for each row of A or column of B, and for each m-combination S, the values of det(AB) and det(A[m],S)det(BS,[m]) indeed depend linearly on the row or column. For the latter this is immediate from the multilinear property of the determinant; for the former one must in addition check that taking a linear combination for the row of A or column of B while leaving the rest unchanged only affects the corresponding row or column of the product AB, and by the same linear combination. Thus one can work out both sides of the Cauchy−Binet formula by linearity for every row of A and then also every column of B, writing each of the rows and columns as a linear combination of standard basis vectors. The resulting multiple summations are huge, but they have the same form for both sides: corresponding terms involve the same scalar factor (each is a product of entries of A and of B), and these terms only differ by involving two different expressions in terms of constant matrices of the kind described above, which expressions should be equal according to the Cauchy−Binet formula. This achieves the reduction of the first step.
Concretely, the multiple summations can be grouped into two summations, one over all functions f:[m] → [n] that for each row index of A gives a corresponding column index, and one over all functions g:[m] → [n] that for each column index of B gives a corresponding row index. The matrices associated to f and g are
where "" is the Kronecker delta, and the Cauchy−Binet formula to prove has been rewritten as
where p(f,g) denotes the scalar factor . It remains to prove the Cauchy−Binet formula for A = Lf and B = Rg, for all f,g:[m] → [n].
For this step 2, if f fails to be injective then Lf and LfRg both have two identical rows, and if g fails to be injective then Rg and LfRg both have two identical columns; in either case both sides of the identity are zero. Supposing now that both f and g are injective maps [m] → [n], the factor on the right is zero unless S = f([m]), while the factor is zero unless S = g([m]). So
if the images of f and g are different, the right hand side has only null terms, and the left hand side is zero as well since LfRg has a null row (for i with ). In the remaining case where the images of f and g are the same, say f([m]) = S = g([m]), we need to prove that
Let h be the unique increasing bijection [m] → S, and π,σ the permutations of [m] such that and ; then is the permutation matrix
Permutation matrix
In mathematics, in matrix theory, a permutation matrix is a square binary matrix that has exactly one entry 1 in each row and each column and 0s elsewhere...
for π, is the permutation matrix for σ, and LfRg is the permutation matrix for , and since the determinant of a permutation matrix equals the signature of the permutation, the identity follows from the fact that signatures are multiplicative.
Using multi-linearity with respect to both the rows of A and the columns of B in the proof is not necessary; one could use just one of them, say the former, and use that a matrix product LfB either consists of a permutation of the rows of Bf([m]),[m] (if f is injective), or has at least two equal rows.
Geometric interpretations
If A is a real m×n matrix, then det(A AT) is equal to the square of the m-dimensional volume of the parallelepipedParallelepiped
In geometry, a parallelepiped is a three-dimensional figure formed by six parallelograms. By analogy, it relates to a parallelogram just as a cube relates to a square. In Euclidean geometry, its definition encompasses all four concepts...
spanned in Rn by the m rows of A. Binet's formula states that this is equal to the sum of the squares of the volumes that arise if the parallelepiped is orthogonally projected onto the m-dimensional coordinate planes (of which there are ).
In the case m = 1 the parallelepiped is reduced to a single vector and its volume its length. The above statement then states that the square of the length of a vector is the sum of the squares of its coordinates; this is indeed the case by the definition
Euclidean distance
In mathematics, the Euclidean distance or Euclidean metric is the "ordinary" distance between two points that one would measure with a ruler, and is given by the Pythagorean formula. By using this formula as distance, Euclidean space becomes a metric space...
of that length, which is based on the Pythagorean theorem
Pythagorean theorem
In mathematics, the Pythagorean theorem or Pythagoras' theorem is a relation in Euclidean geometry among the three sides of a right triangle...
.
Generalization
The Cauchy–Binet formula can be extended in a straightforward way to a general formula for the minorsMinor (linear algebra)
In linear algebra, a minor of a matrix A is the determinant of some smaller square matrix, cut down from A by removing one or more of its rows or columns...
of the product of two matrices. That formula is given in the article on minors
Minor (linear algebra)
In linear algebra, a minor of a matrix A is the determinant of some smaller square matrix, cut down from A by removing one or more of its rows or columns...
.