Quantum relative entropy
In quantum information theory, quantum relative entropy is a measure of distinguishability between two quantum states. It is the quantum mechanical analog of relative entropy.
Motivation
For simplicity, it will be assumed that all objects in the article are finite dimensional.
We first discuss the classical case. Suppose the probabilities of a finite sequence of events are given by the probability distribution $P = \{p_1, \ldots, p_n\}$, but somehow we mistakenly assumed it to be $Q = \{q_1, \ldots, q_n\}$. For instance, we can mistake an unfair coin for a fair one. According to this erroneous assumption, our uncertainty about the j-th event, or equivalently, the amount of information provided after observing the j-th event, is
$$-\log q_j.$$
The (assumed) average uncertainty of all possible events is then
$$-\sum_j p_j \log q_j.$$
On the other hand, the Shannon entropy of the probability distribution P, defined by
$$-\sum_j p_j \log p_j,$$
is the real amount of uncertainty before observation. Therefore the difference between these two quantities
$$-\sum_j p_j \log q_j - \Big(-\sum_j p_j \log p_j\Big) = \sum_j p_j \log p_j - \sum_j p_j \log q_j$$
is a measure of the distinguishability of the two probability distributions P and Q. This is precisely the classical relative entropy, or Kullback–Leibler divergence:
$$D_{\mathrm{KL}}(P \| Q) = \sum_j p_j \log \frac{p_j}{q_j}.$$
Note
- In the definitions above, the convention 0·log 0 = 0 is assumed, since $\lim_{x \to 0} x \log x = 0$. Intuitively, one would expect an event of zero probability to contribute nothing towards entropy.
- The relative entropy is not a metric. For example, it is not symmetric. The uncertainty discrepancy in mistaking a fair coin to be unfair is not the same as the opposite situation.
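The asymmetry can be checked numerically. The following is a minimal sketch, assuming natural logarithms and a hypothetical biased coin with probabilities 0.9 and 0.1; the function name `kl_divergence` is chosen here for illustration only. It evaluates $\sum_j p_j \log(p_j/q_j)$ in both directions.

```python
import math

def kl_divergence(p, q):
    """Classical relative entropy D(P||Q) in nats, using 0*log 0 = 0."""
    total = 0.0
    for pj, qj in zip(p, q):
        if pj == 0.0:
            continue  # convention: a zero-probability event contributes nothing
        if qj == 0.0:
            return math.inf  # an event assumed impossible actually occurs
        total += pj * math.log(pj / qj)
    return total

fair = [0.5, 0.5]
biased = [0.9, 0.1]  # hypothetical unfair coin, chosen for illustration

d_fair_biased = kl_divergence(fair, biased)  # true coin fair, assumed biased
d_biased_fair = kl_divergence(biased, fair)  # true coin biased, assumed fair
print(d_fair_biased, d_biased_fair)
```

The two printed values differ (roughly 0.51 and 0.37 nats), which is the sense in which the relative entropy fails to be symmetric.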
Definition
As with many other objects in quantum information theory, quantum relative entropy is defined by extending the classical definition from probability distributions to density matrices. Let ρ be a density matrix. The von Neumann entropy of ρ, which is the quantum mechanical analog of the Shannon entropy, is given by
$$S(\rho) = -\operatorname{Tr} \rho \log \rho.$$
For two density matrices ρ and σ, the quantum relative entropy of ρ with respect to σ is defined by
$$S(\rho \| \sigma) = \operatorname{Tr} \rho \log \rho - \operatorname{Tr} \rho \log \sigma = \operatorname{Tr} \rho (\log \rho - \log \sigma).$$
We see that, when the states are classical, i.e. ρσ = σρ, the two matrices can be diagonalized in a common basis, and the definition coincides with the classical case applied to their eigenvalues.
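The definition $S(\rho \| \sigma) = \operatorname{Tr} \rho (\log \rho - \log \sigma)$ can be evaluated through the eigendecompositions of the two states. The sketch below is restricted, as a simplifying assumption, to real symmetric 2×2 density matrices (a qubit with real entries), so that only the standard library is needed. The helper names `eig_sym2` and `quantum_relative_entropy` are hypothetical, and natural logarithms are used.

```python
import math

def eig_sym2(m):
    """Eigenpairs of a real symmetric 2x2 matrix via a plane rotation."""
    (a, b), (_, d) = m
    theta = 0.5 * math.atan2(2.0 * b, a - d)  # angle that zeroes the off-diagonal
    c, s = math.cos(theta), math.sin(theta)
    lam1 = a * c * c + 2.0 * b * c * s + d * s * s
    lam2 = a * s * s - 2.0 * b * c * s + d * c * c
    return [(lam1, (c, s)), (lam2, (-s, c))]

def quantum_relative_entropy(rho, sigma, tol=1e-12):
    """S(rho||sigma) = sum_i p_i log p_i - sum_ij p_i |<v_i|w_j>|^2 log q_j."""
    sigma_eigs = eig_sym2(sigma)
    total = 0.0
    for p, v in eig_sym2(rho):
        if p <= tol:
            continue  # convention: 0 * log 0 = 0
        total += p * math.log(p)
        for q, w in sigma_eigs:
            overlap = (v[0] * w[0] + v[1] * w[1]) ** 2
            if overlap <= tol:
                continue
            if q <= tol:
                return math.inf  # rho has weight where sigma has none
            total -= p * overlap * math.log(q)
    return total

rho_diag = [[0.9, 0.0], [0.0, 0.1]]
sigma_mixed = [[0.5, 0.0], [0.0, 0.5]]
plus = [[0.5, 0.5], [0.5, 0.5]]          # projector onto (|0> + |1>)/sqrt(2)
sigma_diag = [[0.75, 0.0], [0.0, 0.25]]

commuting = quantum_relative_entropy(rho_diag, sigma_mixed)
non_commuting = quantum_relative_entropy(plus, sigma_diag)
print(commuting, non_commuting)
```

For the commuting pair the result agrees with the classical relative entropy of the distributions (0.9, 0.1) and (0.5, 0.5). The second pair does not commute, so the overlaps between the two eigenbases enter the sum.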
Non-finite relative entropy
In general, the support of a matrix M, denoted by supp(M), is the orthogonal complement of its kernel, i.e. supp(M) = ker(M)⊥. When considering the quantum relative entropy, we assume the convention that −s·log 0 = ∞ for any s > 0. This leads to the definition that
$$S(\rho \| \sigma) = \infty$$
when
$$\operatorname{supp}(\rho) \cap \ker(\sigma) \neq \{0\}.$$
More generally, the relative entropy is infinite whenever supp(ρ) is not contained in supp(σ).
This makes physical sense. Informally, the quantum relative entropy is a measure of our ability to distinguish two quantum states. But orthogonal quantum states can always be distinguished, via projective measurement. In the present context, this is reflected by non-finite quantum relative entropy.
In the interpretation given in the Motivation section, if we erroneously assume the state ρ has support in supp(ρ)⊥, this is an error impossible to recover from.
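As a worked illustration (an added example, with the qubit states chosen here as assumptions), take the maximally mixed state I/2 and the pure state |0⟩⟨0|. Using $S(\rho \| \sigma) = \operatorname{Tr} \rho \log \rho - \operatorname{Tr} \rho \log \sigma$ together with the conventions 0·log 0 = 0 and −s·log 0 = ∞, the two orderings behave very differently:

```latex
\begin{align*}
S\!\left(\tfrac{I}{2} \,\middle\|\, |0\rangle\langle 0|\right)
  &= \operatorname{Tr}\tfrac{I}{2}\log\tfrac{I}{2}
     - \operatorname{Tr}\tfrac{I}{2}\log |0\rangle\langle 0| \\
  &= -\log 2 - \tfrac12 \log 1 - \tfrac12 \log 0 = \infty, \\
S\!\left(|0\rangle\langle 0| \,\middle\|\, \tfrac{I}{2}\right)
  &= 1 \cdot \log 1 - \langle 0|\log\tfrac{I}{2}|0\rangle = \log 2 .
\end{align*}
```

In the first case the support of I/2 meets the kernel of |0⟩⟨0| (spanned by |1⟩), so the relative entropy diverges. In the second case the support of |0⟩⟨0| lies inside the support of I/2 and the value is finite.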
Corresponding classical statement
For the classical Kullback–Leibler divergence, it can be shown that
$$D_{\mathrm{KL}}(P \| Q) = \sum_j p_j \log \frac{p_j}{q_j} \geq 0,$$
and equality holds if and only if P = Q. Colloquially, this means that the uncertainty calculated using erroneous assumptions is never smaller than the real amount of uncertainty.
To show the inequality, we rewrite
$$D_{\mathrm{KL}}(P \| Q) = \sum_j p_j \log \frac{p_j}{q_j} = \sum_j \Big(-\log \frac{q_j}{p_j}\Big) p_j.$$
Notice that log is a concave function. Therefore −log is convex. Applying Jensen's inequality to −log gives
$$D_{\mathrm{KL}}(P \| Q) = \sum_j \Big(-\log \frac{q_j}{p_j}\Big) p_j \geq -\log \Big(\sum_j \frac{q_j}{p_j} p_j\Big) = -\log \Big(\sum_j q_j\Big) = 0.$$
Jensen's inequality also states that equality holds if and only if, for all i, $q_i = (\sum_j q_j)\, p_i$, i.e. P = Q.
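The Jensen step can be spot-checked numerically. This is only a sanity check under stated assumptions, not a proof: it draws strictly positive random distributions with a fixed seed and compares $\sum_j (-\log(q_j/p_j))\, p_j$ with $-\log \sum_j (q_j/p_j)\, p_j$. The helper names are hypothetical.

```python
import math
import random

def random_distribution(n, rng):
    """Strictly positive probability vector of length n."""
    weights = [rng.random() + 1e-9 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

def jensen_sides(p, q):
    """Return (sum_j p_j * -log(q_j/p_j), -log(sum_j p_j * q_j/p_j))."""
    lhs = sum(pj * -math.log(qj / pj) for pj, qj in zip(p, q))
    rhs = -math.log(sum(pj * (qj / pj) for pj, qj in zip(p, q)))
    return lhs, rhs

rng = random.Random(0)  # fixed seed so the check is repeatable
results = [
    jensen_sides(random_distribution(4, rng), random_distribution(4, rng))
    for _ in range(1000)
]
all_hold = all(lhs >= rhs - 1e-12 for lhs, rhs in results)
print(all_hold)
```

The right-hand side is zero up to rounding because the q_j sum to one, so the check amounts to non-negativity of the relative entropy on the sampled pairs.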
The result
Klein's inequality states that the quantum relative entropy
$$S(\rho \| \sigma) = \operatorname{Tr} \rho (\log \rho - \log \sigma)$$
is non-negative in general. It is zero if and only if ρ = σ.
Proof
Let ρ and σ have spectral decompositions
$$\rho = \sum_i p_i v_i v_i^*, \qquad \sigma = \sum_i q_i w_i w_i^*.$$
So
$$\log \rho = \sum_i (\log p_i)\, v_i v_i^*, \qquad \log \sigma = \sum_i (\log q_i)\, w_i w_i^*.$$
Direct calculation gives
$$S(\rho \| \sigma) = \sum_k p_k \log p_k - \sum_{i,j} (p_i \log q_j)\, |v_i^* w_j|^2 = \sum_i p_i \Big(\log p_i - \sum_j (\log q_j)\, P_{ij}\Big),$$
where $P_{ij} = |v_i^* w_j|^2$.
Since the matrix $(P_{ij})_{ij}$ is a doubly stochastic matrix and −log is a convex function, the above expression is
$$\geq \sum_i p_i \Big(\log p_i - \log \Big(\sum_j q_j P_{ij}\Big)\Big).$$
Define $r_i = \sum_j q_j P_{ij}$. Then $\{r_i\}$ is a probability distribution. From the non-negativity of classical relative entropy, we have
$$S(\rho \| \sigma) \geq \sum_i p_i \log \frac{p_i}{r_i} \geq 0.$$
The second part of the claim follows from the fact that, since −log is strictly convex, equality is achieved in
$$\sum_i p_i \Big(\log p_i - \sum_j (\log q_j)\, P_{ij}\Big) \geq \sum_i p_i \Big(\log p_i - \log \Big(\sum_j q_j P_{ij}\Big)\Big)$$
if and only if $(P_{ij})$ is a permutation matrix, which implies ρ = σ, after a suitable labeling of the eigenvectors $\{v_i\}$ and $\{w_i\}$.
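The chain of inequalities in the proof can be traced on a small example. The sketch below assumes a qubit whose eigenbasis {v_i} for ρ is the standard basis and whose eigenbasis {w_j} for σ is the standard basis rotated by an angle theta, so that P_ij = |v_i* w_j|² has entries cos²(theta) and sin²(theta). The eigenvalues, the angle, and the function name `klein_chain` are illustrative assumptions, and the eigenvalues are taken strictly positive.

```python
import math

def klein_chain(p, q, theta):
    """Return (S, middle, P, r) for the qubit setup described above.

    S      = sum_i p_i (log p_i - sum_j P_ij log q_j)   (quantum relative entropy)
    middle = sum_i p_i log(p_i / r_i), r_i = sum_j q_j P_ij
    """
    c2, s2 = math.cos(theta) ** 2, math.sin(theta) ** 2
    P = [[c2, s2], [s2, c2]]  # P[i][j] = |<v_i|w_j>|^2, doubly stochastic
    S = sum(
        p[i] * (math.log(p[i]) - sum(P[i][j] * math.log(q[j]) for j in range(2)))
        for i in range(2)
    )
    r = [sum(q[j] * P[i][j] for j in range(2)) for i in range(2)]
    middle = sum(p[i] * math.log(p[i] / r[i]) for i in range(2))
    return S, middle, P, r

S, middle, P, r = klein_chain([0.7, 0.3], [0.4, 0.6], theta=0.3)
print(S, middle, r)
```

Here `S >= middle >= 0` reproduces the two inequalities of the proof, and `r` sums to one because the columns of `P` do. Setting `theta = 0` with equal eigenvalue lists makes `P` the identity permutation and gives `S = 0`.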
An entanglement measure
Let a composite quantum system have state space
$$H = \bigotimes_k H_k$$
and ρ be a density matrix acting on H.
The relative entropy of entanglement of ρ is defined by
$$D_{\mathrm{REE}}(\rho) = \min_\sigma S(\rho \| \sigma),$$
where the minimum is taken over the family of separable states. A physical interpretation of the quantity is the optimal distinguishability of the state ρ from separable states.
Clearly, when ρ is not entangled, σ = ρ is an admissible choice in the minimum, so
$$D_{\mathrm{REE}}(\rho) = 0$$
by Klein's inequality.
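As a worked illustration (an added example, where the two-qubit Bell state and the particular separable state σ are assumptions chosen here), one can evaluate the relative entropy between $|\Phi^+\rangle = (|00\rangle + |11\rangle)/\sqrt{2}$ and a classically correlated state. Any separable σ gives an upper bound on the minimum over separable states.

```latex
\begin{align*}
\rho &= |\Phi^+\rangle\langle\Phi^+|, \qquad
\sigma = \tfrac12 \bigl( |00\rangle\langle 00| + |11\rangle\langle 11| \bigr), \\
\operatorname{Tr} \rho \log \rho &= 1 \cdot \log 1 = 0, \\
\log\sigma \, |\Phi^+\rangle &= \bigl(\log \tfrac12\bigr) |\Phi^+\rangle
  \quad \text{since } |\Phi^+\rangle \in \operatorname{span}\{|00\rangle, |11\rangle\},
  \text{ where } \sigma \text{ acts as } \tfrac12, \\
S(\rho \| \sigma) &= 0 - \langle\Phi^+| \log\sigma |\Phi^+\rangle = \log 2 .
\end{align*}
```

This shows that the relative entropy of entanglement of the Bell state is at most log 2. For pure bipartite states the quantity is known to coincide with the entropy of entanglement, which suggests the bound is attained here.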
Relation to other quantum information quantities
One reason the quantum relative entropy is useful is that several other important quantum information quantities are special cases of it. Often, theorems are stated in terms of the quantum relative entropy, which lead to immediate corollaries concerning the other quantities. Below, we list some of these relations.
Let $\rho_{AB}$ be the joint state of a bipartite system with subsystem A of dimension $n_A$ and B of dimension $n_B$. Let $\rho_A$, $\rho_B$ be the respective reduced states, and $I_A$, $I_B$ the respective identities. The maximally mixed states are $I_A/n_A$ and $I_B/n_B$. Then it is possible to show with direct computation that
$$S(\rho_A \| I_A/n_A) = \log(n_A) - S(\rho_A),$$
$$S(\rho_{AB} \| \rho_A \otimes \rho_B) = S(\rho_A) + S(\rho_B) - S(\rho_{AB}) = I(A{:}B),$$
$$S(\rho_{AB} \| \rho_A \otimes I_B/n_B) = \log(n_B) + S(\rho_A) - S(\rho_{AB}) = \log(n_B) - S(B|A),$$
where I(A:B) is the quantum mutual information and S(B|A) is the quantum conditional entropy.
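The direct computation behind the first two relations can be sketched as follows. This is an added derivation that uses only $S(\rho \| \sigma) = \operatorname{Tr} \rho \log \rho - \operatorname{Tr} \rho \log \sigma$, $S(\rho) = -\operatorname{Tr} \rho \log \rho$, and the identity $\log(X \otimes Y) = \log X \otimes I + I \otimes \log Y$ for positive definite X and Y (full-rank reduced states are assumed for simplicity).

```latex
\begin{align*}
S(\rho_A \,\|\, I_A/n_A)
  &= \operatorname{Tr} \rho_A \log \rho_A - \operatorname{Tr} \rho_A \log (I_A/n_A) \\
  &= -S(\rho_A) + \log(n_A) \operatorname{Tr} \rho_A
   = \log(n_A) - S(\rho_A), \\[4pt]
S(\rho_{AB} \,\|\, \rho_A \otimes \rho_B)
  &= \operatorname{Tr} \rho_{AB} \log \rho_{AB}
     - \operatorname{Tr} \rho_{AB} \bigl( \log \rho_A \otimes I_B + I_A \otimes \log \rho_B \bigr) \\
  &= -S(\rho_{AB}) - \operatorname{Tr} \rho_A \log \rho_A - \operatorname{Tr} \rho_B \log \rho_B \\
  &= S(\rho_A) + S(\rho_B) - S(\rho_{AB}).
\end{align*}
```

The second step of the mutual information line uses the fact that tracing $\rho_{AB}$ against an operator acting on A alone gives the same result as tracing the reduced state $\rho_A$ against it. The third relation follows in the same way with $\rho_B$ replaced by $I_B/n_B$.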