Pseudo-random number sampling
Encyclopedia
Pseudo-random number sampling or non-uniform pseudo-random variate generation is the numerical
practice of generating pseudo-random numbers that are distributed according to a given probability distribution
.
Methods of sampling a non-uniform distribution
are typically based on the availability of a pseudo-random number generator producing numbers X that are uniformly distributed. Computational algorithms are then used to manipulate a single random variate
, X, or often several such variates, into a new random variate Y such that these values have the required distribution.
Historically, basic methods of pseudo-random number sampling were developed for Monte-Carlo simulations in the Manhattan project
; they were first published by John von Neumann
in the early 1950s.
f takes non-zero values, the basic sampling algorithm is straightforward. The interval[ 0, 1) is divided in n intervals [0, f(1)) , [f(1), f(1) + f(2)), ... The width of interval i equals the probability f(i).
One draws a uniformly distributed pseudo-random number X, and searches for the index i of the corresponding interval. The so determined i will have the distribution f(i).
Formalizing this idea becomes easier by using the cumulative distribution function
It is convenient to set F(0) = 0. The n intervals are then simply [F(0), F(1)), [F(1), F(2)), ..., [F(n − 1), F(n)). The main computational task is then to determine i for which F(i − 1) ≤ X < F(i).
This can be done by different algorithms:
For generating a normal distribution:
For generating a Poisson distribution
:
Numerical analysis
Numerical analysis is the study of algorithms that use numerical approximation for the problems of mathematical analysis ....
practice of generating pseudo-random numbers that are distributed according to a given probability distribution
Probability distribution
In probability theory, a probability mass, probability density, or probability distribution is a function that describes the probability of a random variable taking certain values....
.
Methods of sampling a non-uniform distribution
Uniform distribution (continuous)
In probability theory and statistics, the continuous uniform distribution or rectangular distribution is a family of probability distributions such that for each member of the family, all intervals of the same length on the distribution's support are equally probable. The support is defined by...
are typically based on the availability of a pseudo-random number generator producing numbers X that are uniformly distributed. Computational algorithms are then used to manipulate a single random variate
Random variate
A random variate is a particular outcome of a random variable: the random variates which are other outcomes of the same random variable would have different values. Random variates are used when simulating processes driven by random influences...
, X, or often several such variates, into a new random variate Y such that these values have the required distribution.
Historically, basic methods of pseudo-random number sampling were developed for Monte-Carlo simulations in the Manhattan project
Manhattan Project
The Manhattan Project was a research and development program, led by the United States with participation from the United Kingdom and Canada, that produced the first atomic bomb during World War II. From 1942 to 1946, the project was under the direction of Major General Leslie Groves of the US Army...
; they were first published by John von Neumann
John von Neumann
John von Neumann was a Hungarian-American mathematician and polymath who made major contributions to a vast number of fields, including set theory, functional analysis, quantum mechanics, ergodic theory, geometry, fluid dynamics, economics and game theory, computer science, numerical analysis,...
in the early 1950s.
Finite discrete distributions
For a discrete probability distribution with a finite number n of indices at which the probability mass functionProbability mass function
In probability theory and statistics, a probability mass function is a function that gives the probability that a discrete random variable is exactly equal to some value...
f takes non-zero values, the basic sampling algorithm is straightforward. The interval
One draws a uniformly distributed pseudo-random number X, and searches for the index i of the corresponding interval. The so determined i will have the distribution f(i).
Formalizing this idea becomes easier by using the cumulative distribution function
It is convenient to set F(0) = 0. The n intervals are then simply [F(0), F(1)), [F(1), F(2)), ..., [F(n − 1), F(n)). The main computational task is then to determine i for which F(i − 1) ≤ X < F(i).
This can be done by different algorithms:
- Linear searchLinear searchIn computer science, linear search or sequential search is a method for finding a particular value in a list, that consists of checking every one of its elements, one at a time and in sequence, until the desired one is found....
, computational time linear in n. - Binary search, computational time goes with log n.
- Indexed searchIndexed searchIndexed search, also called the cutpoint method, is an algorithm for discrete-distribution pseudo-random number sampling, invented by Chen and Asau in 1974.- Literature :...
, also called the cutpoint method. - Alias method, computational time is constant, using some pre-computed tables.
- There are other methods that cost constant time.
Continuous distributions
Generic methods:- Rejection samplingRejection samplingIn mathematics, rejection sampling is a basic pseudo-random number sampling technique used to generate observations from a distribution. It is also commonly called the acceptance-rejection method or "accept-reject algorithm"....
- Inverse transform sampling
- Slice samplingSlice samplingSlice sampling is a type of Markov chain Monte Carlo algorithm for pseudo-random number sampling, i.e. for drawing random samples from a statistical distribution...
- Ziggurat algorithmZiggurat algorithmThe ziggurat algorithm is an algorithm for pseudo-random number sampling. Belonging to the class of rejection sampling algorithms, it relies on an underlying source of uniformly-distributed random numbers, typically from a pseudo-random number generator, as well as precomputed tables. The...
, for monotonously decreasing density functions - Convolution random number generator, not a sampling method in itself: it describes the use of arithmetics on top of one ore more existing sampling methods to generate more involved distributions.
For generating a normal distribution:
- Box–Muller transform
- Marsaglia polar methodMarsaglia polar methodThe polar method is a pseudo-random number sampling method for generating a pair of independent standard normal random variables...
For generating a Poisson distribution
Poisson distribution
In probability theory and statistics, the Poisson distribution is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time and/or space if these events occur with a known average rate and independently of the time since...
:
- See Poisson distribution#Generating Poisson-distributed random variables
Literature
- Devroye, L. (1986) Non-Uniform Random Variate Generation. New York: Springer
- Fishman, G.S. (1996) Monte Carlo. Concepts, Algorithms, and Applications. New York: Springer
- Hörmann, W.; J Leydold, G Derflinger (2004)Automatic Nonuniform Random Variate Generation. Berlin: Springer.
- Knuth, D.E.Donald KnuthDonald Ervin Knuth is a computer scientist and Professor Emeritus at Stanford University.He is the author of the seminal multi-volume work The Art of Computer Programming. Knuth has been called the "father" of the analysis of algorithms...
(1997) The Art of Computer ProgrammingThe Art of Computer ProgrammingThe Art of Computer Programming is a comprehensive monograph written by Donald Knuth that covers many kinds of programming algorithms and their analysis....
, Vol. 2 Seminumerical Algorithms, Chapter 3.4.1 (3rd edition). - Ripley, B.D. (1987) Stochastic Simulation. Wiley.