Earth Mover's Distance
Encyclopedia
In computer science
, the earth mover's distance (EMD) is a measure of the distance
between two probability distribution
s over a region D. In mathematics, this is known as the Wasserstein metric
. Informally, if the distributions are interpreted as two different ways of piling up a certain amount of dirt over the region D, the EMD is the minimum cost of turning one pile into the other; where the cost is assumed to be amount of dirt moved times the distance
by which it is moved .
The above definition is valid only if the two distributions have the same integral (informally, if the two piles have the same amount of dirt), as in normalized histogram
s or probability density function
s. In that case, the EMD is equivalent to the 1st Mallows distance or 1st Wasserstein distance
between the two distributions .
. In particular, if D is a one-dimensional array of "bins" the EMD can be efficiently computed by scanning the array and keeping track of how much dirt needs to be transported between consecutive bins.
images that may differ due to dithering, blurring, or local deformations . In this case, the region is the image's domain, and the total amount of light (or ink) is the "dirt" to be rearranged.
The EMD is widely used in content-based image retrieval
to compute distances between the color histogram
s of two digital image
s. In this case, the region is the RGB color cube
, and each image pixel is a parcel of "dirt". The same technique can be used for any other quantitative pixel
attribute, such as luminance
, gradient
, apparent motion
in a video frame, etc..
More generally, the EMD is used in pattern recognition
to compare generic summaries or surrogates of data records called signatures. A typical signature consists of list of pairs ((x1,m1), … (xn,mn)), where each xi is a certain "feature" (e.g., color in an image, letter in a text, etc.), and mi is "mass" (how many times that feature occurs the record). Alternatively, xi may be the centroid
of a data cluster, and mi the number of entities in that cluster. To compare two such signatures with the EMD, one must define a distance between features, which is interpreted as the cost of turning a unit mass of one feature into a unit mass of the other. The EMD between two signatures is then the minimum cost of turning one of them into the other.
in 1781 . The use of the EMD as a distance measure for monochromatic images was described in 1989 by S. Peleg, M. Werman and H. Rom . The name "earth movers' distance" was proposed by J. Stolfi
in 1994 , and was used in print in 1998 by Y. Rubner, C. Tomasi and L. G. Guibas
.
Computer science
Computer science or computing science is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems...
, the earth mover's distance (EMD) is a measure of the distance
Metric (mathematics)
In mathematics, a metric or distance function is a function which defines a distance between elements of a set. A set with a metric is called a metric space. A metric induces a topology on a set but not all topologies can be generated by a metric...
between two probability distribution
Probability distribution
In probability theory, a probability mass, probability density, or probability distribution is a function that describes the probability of a random variable taking certain values....
s over a region D. In mathematics, this is known as the Wasserstein metric
Wasserstein metric
In mathematics, the Wasserstein metric is a distance function defined between probability distributions on a given metric space M....
. Informally, if the distributions are interpreted as two different ways of piling up a certain amount of dirt over the region D, the EMD is the minimum cost of turning one pile into the other; where the cost is assumed to be amount of dirt moved times the distance
Distance
Distance is a numerical description of how far apart objects are. In physics or everyday discussion, distance may refer to a physical length, or an estimation based on other criteria . In mathematics, a distance function or metric is a generalization of the concept of physical distance...
by which it is moved .
The above definition is valid only if the two distributions have the same integral (informally, if the two piles have the same amount of dirt), as in normalized histogram
Histogram
In statistics, a histogram is a graphical representation showing a visual impression of the distribution of data. It is an estimate of the probability distribution of a continuous variable and was first introduced by Karl Pearson...
s or probability density function
Probability density function
In probability theory, a probability density function , or density of a continuous random variable is a function that describes the relative likelihood for this random variable to occur at a given point. The probability for the random variable to fall within a particular region is given by the...
s. In that case, the EMD is equivalent to the 1st Mallows distance or 1st Wasserstein distance
Wasserstein metric
In mathematics, the Wasserstein metric is a distance function defined between probability distributions on a given metric space M....
between the two distributions .
Extensions
Some applications may require the comparison of distributions with different total masses. One approach is to allow for a partial match, where dirt from the most massive distribution is rearranged to make the least massive, and any leftover "dirt" is discarded at no cost. Under this approach, the EMD is no longer a true distance between distributions. Another approach is to allow for mass to be created or destroyed, on a global and/or local level, as an alternative to transportation, but with a cost penalty. In that case one must specify a real parameter σ, the ratio between the cost of creating or destroying one unit of "dirt", and the cost of transporting it by a unit distance. This is equivalent to minimizing the sum of the earth moving cost plus σ times the L1 distance between the rearranged pile and the second distribution.Computing the EMD
If the domain D is discrete, the EMD can be computed by solving an instance transportation problem, which can be solved by the so-called Hungarian algorithmHungarian algorithm
The Hungarian method is a combinatorial optimization algorithm which solves the assignment problem in polynomial time and which anticipated later primal-dual methods...
. In particular, if D is a one-dimensional array of "bins" the EMD can be efficiently computed by scanning the array and keeping track of how much dirt needs to be transported between consecutive bins.
Applications
An early application of the EMD in computer science was to compare two grayscaleGrayscale
In photography and computing, a grayscale or greyscale digital image is an image in which the value of each pixel is a single sample, that is, it carries only intensity information...
images that may differ due to dithering, blurring, or local deformations . In this case, the region is the image's domain, and the total amount of light (or ink) is the "dirt" to be rearranged.
The EMD is widely used in content-based image retrieval
Content-based image retrieval
Content-based image retrieval , also known as query by image content and content-based visual information retrieval is the application of computer vision techniques to the image retrieval problem, that is, the problem of searching for digital images in large databases....
to compute distances between the color histogram
Color histogram
In image processing and photography, a color histogram is a representation of the distribution of colors in an image. For digital images, a color histogram represents the number of pixels that have colors in each of a fixed list of color ranges, that span the image's color space, the set of all...
s of two digital image
Digital image
A digital image is a numeric representation of a two-dimensional image. Depending on whether or not the image resolution is fixed, it may be of vector or raster type...
s. In this case, the region is the RGB color cube
RGB color model
The RGB color model is an additive color model in which red, green, and blue light is added together in various ways to reproduce a broad array of colors...
, and each image pixel is a parcel of "dirt". The same technique can be used for any other quantitative pixel
Pixel
In digital imaging, a pixel, or pel, is a single point in a raster image, or the smallest addressable screen element in a display device; it is the smallest unit of picture that can be represented or controlled....
attribute, such as luminance
Luminance
Luminance is a photometric measure of the luminous intensity per unit area of light travelling in a given direction. It describes the amount of light that passes through or is emitted from a particular area, and falls within a given solid angle. The SI unit for luminance is candela per square...
, gradient
Image gradient
An image gradient is a directional change in the intensity or color in an image. Image gradients may be used to extract information from images....
, apparent motion
Optical flow
Optical flow or optic flow is the pattern of apparent motion of objects, surfaces, and edges in a visual scene caused by the relative motion between an observer and the scene. The concept of optical flow was first studied in the 1940s and ultimately published by American psychologist James J....
in a video frame, etc..
More generally, the EMD is used in pattern recognition
Pattern recognition
In machine learning, pattern recognition is the assignment of some sort of output value to a given input value , according to some specific algorithm. An example of pattern recognition is classification, which attempts to assign each input value to one of a given set of classes...
to compare generic summaries or surrogates of data records called signatures. A typical signature consists of list of pairs ((x1,m1), … (xn,mn)), where each xi is a certain "feature" (e.g., color in an image, letter in a text, etc.), and mi is "mass" (how many times that feature occurs the record). Alternatively, xi may be the centroid
Centroid
In geometry, the centroid, geometric center, or barycenter of a plane figure or two-dimensional shape X is the intersection of all straight lines that divide X into two parts of equal moment about the line. Informally, it is the "average" of all points of X...
of a data cluster, and mi the number of entities in that cluster. To compare two such signatures with the EMD, one must define a distance between features, which is interpreted as the cost of turning a unit mass of one feature into a unit mass of the other. The EMD between two signatures is then the minimum cost of turning one of them into the other.
History
The concept was first introduced by Gaspard MongeGaspard Monge
Gaspard Monge, Comte de Péluse was a French mathematician, revolutionary, and was inventor of descriptive geometry. During the French Revolution, he was involved in the complete reorganization of the educational system, founding the École Polytechnique...
in 1781 . The use of the EMD as a distance measure for monochromatic images was described in 1989 by S. Peleg, M. Werman and H. Rom . The name "earth movers' distance" was proposed by J. Stolfi
Jorge Stolfi
Jorge Stolfi is a full professor of computer science at the State University of Campinas, working in computer vision, image processing, splines and other function approximation methods, graph theory, computational geometry, and several other fields...
in 1994 , and was used in print in 1998 by Y. Rubner, C. Tomasi and L. G. Guibas
Leonidas J. Guibas
Leonidas John Guibas is a professor of computer science at Stanford University, where he heads the geometric computation group and is a member of the computer graphics and artificial intelligence laboratories. Guibas was a student of Donald Knuth at Stanford, where he received his Ph.D. in 1976...
.