Covariance
In probability theory and statistics, covariance is a measure of how much two variables change together. Variance is a special case of the covariance when the two variables are identical.
Definition
The covariance between two real-valued random variables X and Y with finite second moments is
Cov(X, Y) = E[(X − E[X]) (Y − E[Y])]
where E[X] is the expected value of X. By using some properties of expectations, this can be simplified to
Cov(X, Y) = E[XY] − E[X] E[Y]
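To make the two forms concrete, here is a minimal numerical sketch (assuming NumPy; the data and the linear relationship are invented for illustration) that evaluates both expressions on the same sample:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = 2 * x + rng.normal(size=100_000)  # Y depends linearly on X; true Cov(X, Y) = 2

# Definition: Cov(X, Y) = E[(X - E[X]) (Y - E[Y])]
cov_def = np.mean((x - x.mean()) * (y - y.mean()))

# Simplified form: Cov(X, Y) = E[XY] - E[X] E[Y]
cov_simple = np.mean(x * y) - x.mean() * y.mean()

print(cov_def, cov_simple)  # both estimates agree and are close to 2
```

The two expressions agree on any sample up to floating-point error, since the simplification uses only linearity of expectation.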
For random vectors X and Y (of dimension m and n respectively) the m×n covariance matrix is equal to
Cov(X, Y) = E[(X − E[X]) (Y − E[Y])^T] = E[X Y^T] − E[X] E[Y]^T
where M^T is the transpose of a matrix (or vector) M.
The (i, j)-th element of this matrix is equal to the covariance Cov(Xi, Yj) between the i-th scalar component of X and the j-th scalar component of Y. In particular, Cov(Y, X) is the transpose of Cov(X, Y).
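A short sketch of the vector case (again assuming NumPy; the dimensions m = 3 and n = 2 and the data-generating process are arbitrary choices) estimates Cov(X, Y) from samples and checks the transpose relation:

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples = 200_000
X = rng.normal(size=(n_samples, 3))                                # random vector X, dimension m = 3
Y = X @ rng.normal(size=(3, 2)) + rng.normal(size=(n_samples, 2))  # random vector Y, dimension n = 2

def cross_cov(A, B):
    # Sample estimate of E[(A - E[A]) (B - E[B])^T], one observation per row
    Ac = A - A.mean(axis=0)
    Bc = B - B.mean(axis=0)
    return Ac.T @ Bc / len(A)

C_xy = cross_cov(X, Y)   # m-by-n; entry (i, j) estimates Cov(Xi, Yj)
C_yx = cross_cov(Y, X)   # n-by-m
print(np.allclose(C_xy, C_yx.T))  # True: Cov(Y, X) is the transpose of Cov(X, Y)
```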
Random variables whose covariance is zero are called uncorrelated.
The units of measurement of the covariance Cov(X, Y) are those of X times those of Y. By contrast, correlation, which depends on the covariance, is a dimensionless measure of linear dependence.
Properties
If X, Y, W, and V are real-valued random variables and a, b, c, d are constants ("constant" in this context means non-random), then the following facts are a consequence of the definition of covariance:

Var(X) = Cov(X, X)
Cov(X, Y) = Cov(Y, X)
Cov(aX, bY) = ab Cov(X, Y)
Cov(X + a, Y + b) = Cov(X, Y)
Cov(aX + bY, cW + dV) = ac Cov(X, W) + ad Cov(X, V) + bc Cov(Y, W) + bd Cov(Y, V)
For sequences X1, ..., Xn and Y1, ..., Ym of random variables, we have

Cov(X1 + ... + Xn, Y1 + ... + Ym) = Σ_{i=1}^{n} Σ_{j=1}^{m} Cov(Xi, Yj)
For a sequence X1, ..., Xn of random variables, and constants a1, ..., an, we have

Var(a1 X1 + ... + an Xn) = Σ_{i=1}^{n} ai^2 Var(Xi) + 2 Σ_{i<j} ai aj Cov(Xi, Xj)
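The identities above are straightforward to verify numerically. In the following sketch (assuming NumPy; the variables and constants are arbitrary choices), the expansion of Cov(aX + bY, cW + dV) is checked against the sample covariance, for which it holds exactly because the sample covariance is itself bilinear:

```python
import numpy as np

rng = np.random.default_rng(2)
X, Y, W, V = rng.normal(size=(4, 10_000))
a, b, c, d = 2.0, -1.0, 0.5, 3.0

def cov(u, v):
    # Sample version of Cov(U, V) = E[(U - E[U]) (V - E[V])]
    return np.mean((u - u.mean()) * (v - v.mean()))

lhs = cov(a * X + b * Y, c * W + d * V)
rhs = (a * c * cov(X, W) + a * d * cov(X, V)
       + b * c * cov(Y, W) + b * d * cov(Y, V))
print(np.isclose(lhs, rhs))  # True: the expansion holds term by term
```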
Uncorrelatedness and independence
If X and Y are independent, then their covariance is zero. This follows because under independence E[XY] = E[X] E[Y], so that

Cov(X, Y) = E[XY] − E[X] E[Y] = 0.
The converse, however, is generally not true: Some pairs of random variables have covariance zero although they are not independent.
To see how this can happen, consider a random variable X with E[X] = 0 and E[X^3] = 0, and let Y = X^2. Then X and Y are obviously not independently distributed, yet

Cov(X, Y) = E[X · X^2] − E[X] E[X^2] = E[X^3] − E[X] E[X^2] = 0.
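A numerical sketch of this example (assuming NumPy; a standard normal X is one convenient choice satisfying E[X] = 0 and E[X^3] = 0):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=500_000)  # symmetric about 0, so E[X] = E[X^3] = 0
y = x**2                      # Y is a deterministic function of X, hence dependent on X

cov_xy = np.mean(x * y) - x.mean() * y.mean()
print(cov_xy)  # close to 0, even though X and Y are far from independent
```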
Relationship to inner products
Many of the properties of covariance can be extracted elegantly by observing that it satisfies similar properties to those of an inner product:
- bilinear: for constants a and b and random variables X, Y, and U, Cov(aX + bY, U) = a Cov(X, U) + b Cov(Y, U)
- symmetric: Cov(X, Y) = Cov(Y, X)
- positive semi-definite: Var(X) = Cov(X, X) ≥ 0, and Cov(X, X) = 0 implies that X is constant almost surely.
In fact these properties imply that the covariance defines an inner product over the quotient vector space obtained by taking the subspace of random variables with finite second moment and identifying any two that differ by a constant. (This identification turns the positive semi-definiteness above into positive definiteness.) That quotient vector space is isomorphic to the subspace of random variables with finite second moment and mean zero; on that subspace, the covariance is exactly the L2 inner product of real-valued functions on the sample space.
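Two consequences of this viewpoint are easy to check numerically (a sketch assuming NumPy; the distributions are arbitrary): covariance is unchanged when a variable is shifted by a constant, since shifted variables are identified in the quotient space, and on mean-zero variables it reduces to the plain L2 inner product:

```python
import numpy as np

rng = np.random.default_rng(6)
x = rng.normal(size=100_000) + 5.0   # deliberately nonzero mean
y = rng.exponential(size=100_000)

def cov(u, v):
    return np.mean((u - u.mean()) * (v - v.mean()))

# Shifting by constants does not change covariance (quotient identification).
print(np.isclose(cov(x, y), cov(x + 3.0, y - 7.0)))

# On the mean-zero representatives, covariance is the L2 inner product E[XY].
xc, yc = x - x.mean(), y - y.mean()
print(np.isclose(cov(x, y), np.mean(xc * yc)))
```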
As a result, for random variables with finite variance the following inequality holds via the Cauchy–Schwarz inequality:

|Cov(X, Y)| ≤ √(Var(X) Var(Y))
Proof: If Var(Y) = 0, then the inequality holds trivially. Otherwise, let the random variable

Z = X − (Cov(X, Y) / Var(Y)) Y.
Then we have:

0 ≤ Var(Z)
  = Var(X) − 2 (Cov(X, Y) / Var(Y)) Cov(X, Y) + (Cov(X, Y) / Var(Y))^2 Var(Y)
  = Var(X) − Cov(X, Y)^2 / Var(Y)

Rearranging gives Cov(X, Y)^2 ≤ Var(X) Var(Y). QED.
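The inequality can be confirmed on data as well; in the following sketch (assuming NumPy; the linear relationship is an arbitrary choice) it holds exactly for the sample quantities, since the sample covariance is an inner product on the centered samples:

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=100_000)
y = 0.7 * x + rng.normal(size=100_000)

cov_xy = np.mean(x * y) - x.mean() * y.mean()
bound = np.sqrt(x.var() * y.var())  # sqrt(Var(X) Var(Y)), 1/N normalization throughout
print(abs(cov_xy) <= bound)         # True: |Cov(X, Y)| <= sqrt(Var(X) Var(Y))
```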
Calculating the sample covariance
The sample covariance of N observations of K variables is the K-by-K matrix with the entries given by

q_{jk} = (1 / (N − 1)) Σ_{i=1}^{N} (x_{ij} − x̄_j)(x_{ik} − x̄_k)

where x_{ij} is the i-th observation of the j-th variable and x̄_j is the sample mean of the j-th variable.
The sample mean and the sample covariance matrix are unbiased estimates of the mean and the covariance matrix of the random vector X = (X1, ..., XK), a row vector whose j-th element (j = 1, ..., K) is one of the random variables. The reason the sample covariance matrix has N − 1 in the denominator rather than N is essentially that the population mean E[X] is not known and is replaced by the sample mean x̄. If the population mean E[X] is known, the analogous unbiased estimate is given by

q_{jk} = (1 / N) Σ_{i=1}^{N} (x_{ij} − E[Xj])(x_{ik} − E[Xk])
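A direct implementation of the unbiased estimator (a sketch assuming NumPy; the helper name sample_cov is ours), cross-checked against NumPy's built-in estimator, which uses the same N − 1 normalization by default:

```python
import numpy as np

rng = np.random.default_rng(5)
N, K = 1_000, 4
data = rng.normal(size=(N, K))   # N observations of K variables, one observation per row

def sample_cov(data):
    # K-by-K sample covariance matrix with the unbiased 1/(N - 1) normalization
    centered = data - data.mean(axis=0)
    return centered.T @ centered / (len(data) - 1)

Q = sample_cov(data)
print(np.allclose(Q, np.cov(data, rowvar=False)))  # True: matches np.cov
```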
Comments
The covariance is sometimes called a measure of "linear dependence" between the two random variables. That does not mean the same thing as in the context of linear algebra (see linear dependence). When the covariance is normalized, one obtains the correlation matrix. From it, one can obtain the Pearson coefficient, which gives the goodness of fit for the best possible linear function describing the relation between the variables. In this sense covariance is a linear gauge of dependence.
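For instance, normalizing the covariance by the two standard deviations yields the Pearson coefficient, a dimensionless number in [−1, 1] (a sketch assuming NumPy; the slope −3 is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.normal(size=100_000)
y = -3.0 * x + rng.normal(size=100_000)

cov_xy = np.mean(x * y) - x.mean() * y.mean()
pearson = cov_xy / (x.std() * y.std())    # normalized covariance: dimensionless, in [-1, 1]
print(pearson, np.corrcoef(x, y)[0, 1])   # both near -0.95 for this strongly linear relation
```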
See also
- Covariance function
- Covariance matrix
- Covariance operator
- Correlation
- Eddy covariance
- Law of total covariance
- Autocovariance
- Analysis of covariance
- Algorithms for calculating variance#Covariance