Independent and identically distributed random variables
In probability theory and statistics, a sequence or other collection of random variables is independent and identically distributed (i.i.d.) if each random variable has the same probability distribution as the others and all are mutually independent.

The abbreviation i.i.d. is particularly common in statistics (often as iid, sometimes written IID), where observations in a sample are often assumed to be (more or less) i.i.d. for the purposes of statistical inference. The assumption (or requirement) that observations be i.i.d. tends to simplify the underlying mathematics of many statistical methods: see mathematical statistics and statistical theory. However, in practical applications of statistical modeling the assumption may or may not be realistic. The generalization to exchangeable random variables is often sufficient and more easily met.

The assumption is important in the classical form of the central limit theorem, which states that the probability distribution of the sum (or average) of i.i.d. variables with finite variance approaches a normal distribution.
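As a rough illustration of the central limit theorem, the following sketch (standard Python only; the sample sizes are arbitrary choices for illustration) standardizes sums of i.i.d. uniform variables and checks that they look approximately standard normal:

```python
import random
import statistics

random.seed(0)

# Draw many sums of n i.i.d. Uniform(0, 1) variables; by the classical
# CLT their standardized distribution approaches the standard normal.
n, trials = 100, 5000
mean_sum = n * 0.5              # E[sum] = n * E[X], E[Uniform(0,1)] = 1/2
sd_sum = (n * (1 / 12)) ** 0.5  # Var[sum] = n * Var[X], Var[Uniform(0,1)] = 1/12

standardized = [
    (sum(random.random() for _ in range(n)) - mean_sum) / sd_sum
    for _ in range(trials)
]

# The sample mean is near 0 and the sample standard deviation near 1,
# as expected for an approximately standard normal distribution.
print(round(statistics.mean(standardized), 2))
print(round(statistics.stdev(standardized), 2))
```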
Uses in modeling
The following are examples or applications of independent and identically distributed (i.i.d.) random variables:
- A sequence of outcomes of spins of a fair roulette wheel is i.i.d. This means that if the roulette ball lands on "red", for example, 20 times in a row, the next spin is no more or less likely to be "black" than on any other spin (see the gambler's fallacy).
- A sequence of fair dice rolls is i.i.d.
- A sequence of fair coin flips is i.i.d.
- In signal processing and image processing, the notion of transformation to IID implies two specifications, the "ID" (identically distributed) part and the "I" (independent) part:
 - (ID) the signal level must be balanced on the time axis;
 - (I) the signal spectrum must be flattened, i.e. transformed by filtering (such as deconvolution) to a white signal (one where all frequencies are equally present).
Uses in inference
- One of the simplest statistical tests, the z-test, is used to test hypotheses about means of random variables. When using the z-test, one assumes (requires) that all observations are i.i.d. in order to satisfy the conditions of the central limit theorem.
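A minimal one-sample z-test can be sketched as follows. The known population standard deviation `sigma` and null-hypothesis mean `mu0` are illustrative assumptions, and the data are simulated i.i.d. draws under the null:

```python
import math
import random
import statistics

random.seed(1)

# One-sample z-test sketch: observations are assumed i.i.d. with a known
# population standard deviation sigma (an assumption for illustration).
sigma = 2.0
mu0 = 5.0                                             # null-hypothesis mean
data = [random.gauss(5.0, sigma) for _ in range(50)]  # i.i.d. draws

# Test statistic: standardized deviation of the sample mean from mu0.
z = (statistics.mean(data) - mu0) / (sigma / math.sqrt(len(data)))

# Two-sided p-value from the standard normal CDF, via math.erf:
# Phi(x) = 0.5 * (1 + erf(x / sqrt(2))).
p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
print(round(z, 3), round(p_value, 3))
```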
Generalizations
Many results that are initially stated for i.i.d. variables are true more generally.
Exchangeable random variables
The most general notion that shares the main properties of i.i.d. variables is that of exchangeable random variables, introduced by Bruno de Finetti. Exchangeability means that while variables may not be independent or identically distributed, future ones behave like past ones: formally, any value of a finite sequence is as likely as any permutation of those values, i.e. the joint probability distribution is invariant under the symmetric group.
This provides a useful generalization (for example, sampling without replacement is not independent, but is exchangeable) and is widely used in Bayesian statistics.
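A small worked example of exchangeability without independence, using the urn model implicit in sampling without replacement (the urn composition, two red balls and one black, is chosen only for illustration):

```python
from fractions import Fraction
from itertools import permutations

# Urn with 2 red (R) and 1 black (B) ball, drawn without replacement.
# The three draws are exchangeable but not independent.

def seq_prob(seq):
    """Probability of observing this exact ordered sequence of colors."""
    urn = {"R": 2, "B": 1}
    p = Fraction(1)
    for color in seq:
        total = sum(urn.values())
        p *= Fraction(urn[color], total)
        urn[color] -= 1
    return p

# Exchangeability: every permutation of (R, R, B) has the same probability.
probs = {seq_prob(s) for s in set(permutations(("R", "R", "B")))}
print(probs)  # a single value: all orderings are equally likely

# Not independent: P(2nd draw is R | 1st draw is R) != P(2nd draw is R).
p_first_R = Fraction(2, 3)
p_second_R_given_first_R = Fraction(1, 2)
print(p_second_R_given_first_R != p_first_R)  # True
```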
Lévy process
In stochastic calculus, i.i.d. variables are thought of as a discrete-time Lévy process: each variable gives how much one changes from one time to another. For example, a sequence of Bernoulli trials is interpreted as the Bernoulli process.
One may generalize this to include continuous-time Lévy processes, and many Lévy processes can be seen as limits of i.i.d. variables; for instance, the Wiener process is the limit of the Bernoulli process.
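A minimal sketch of this viewpoint: the i.i.d. Bernoulli steps are the increments of a partial-sum process, which is a Bernoulli process (a discrete-time Lévy process). The step count is arbitrary:

```python
import random

random.seed(2)

# A Bernoulli(1/2) process viewed as a discrete-time Lévy process:
# the i.i.d. steps are the increments, and their partial sums form
# the process itself.
steps = [random.choice([0, 1]) for _ in range(10)]
process = [0]  # a Lévy process starts at 0
for s in steps:
    process.append(process[-1] + s)

# Increments over unit intervals are independent and stationary
# (the defining Lévy-process properties); each increment recovers a step.
increments = [process[i + 1] - process[i] for i in range(len(steps))]
print(increments == steps)  # True
```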
See also
- De Finetti's theorem (exchangeability)
- Exchangeable random variables
- Lévy process