Pseudorandom generator theorem
In computational complexity theory and cryptography, the existence of pseudorandom generators is related to the existence of one-way functions through a number of theorems, collectively referred to as the pseudorandom generator theorem.
Pseudorandomness
A distribution is considered pseudorandom if no efficient computation can distinguish it from the true uniform distribution by a non-negligible advantage. Formally, a family of distributions Dn is pseudorandom if for any polynomial size circuit C, and any ε inversely polynomial in n,
- |Probx∈U [C(x)=1] − Probx∈D [C(x)=1] | ≤ ε.
Pseudorandom generators
A function Gl: {0,1}l → {0,1}m, where l < m, is a pseudorandom generator if:
- Gl can be computed in time polynomial in l
- Gl(x) is pseudorandom when x is uniformly random.
One additional pseudorandom bit implies polynomially more pseudorandom bits
It can be shown that if there is a pseudorandom generator Gl: {0,1}l → {0,1}l+1, i.e. a generator that adds only one pseudorandom bit, then for any m = poly(l) there is a pseudorandom generator G'l: {0,1}l → {0,1}m.
The idea of the proof is as follows: first, a seed s of l bits is drawn from the uniform distribution Ul and used as the seed to the first instance of Gl, which is known to be a pseudorandom generator. Next, the output of the first instance of Gl is divided into two parts: the first l bits are fed into the second instance of Gl as a seed, while the last bit becomes the first bit of the output. Repeating this process m times yields an output of m pseudorandom bits.
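As an illustration of this wiring only, here is a minimal Python sketch. The helper stretch and the placeholder toy_g are introduced for this example and are not part of the theorem; toy_g is certainly not a pseudorandom generator.

def stretch(g, seed, m):
    # Produce m output bits from an l-bit seed, given a generator g that adds one bit.
    output = []
    state = list(seed)                # the current l-bit seed
    for _ in range(m):
        expanded = g(state)           # l+1 bits
        state = expanded[:-1]         # the first l bits seed the next instance of Gl
        output.append(expanded[-1])   # the last bit becomes the next output bit
    return output

def toy_g(bits):
    # Toy stand-in for Gl (NOT pseudorandom): rotate the seed and append its parity.
    return bits[1:] + bits[:1] + [sum(bits) % 2]

print(stretch(toy_g, [1, 0, 1, 1], 10))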
It can be shown that such a G'l, consisting of m chained instances of Gl, is indeed a pseudorandom generator, using a hybrid argument and a proof by contradiction, as follows:
Let's consider m+1 intermediate distributions Hi, 0 ≤ i ≤ m, where the first i bits are chosen from the uniform distribution and the last m − i bits are chosen from the output of G'l. Thus, H0 is the full output of G'l and Hm is the true uniform distribution Um. Distributions Hi and Hi+1 differ in only one bit (bit number i+1).
Now, assume that G'l is not a pseudorandom generator; that is, there exists some circuit C that can distinguish between G'l and Um with an advantage ε = 1/poly(l). In other words, this circuit can distinguish between H0 and Hm. Therefore, since the advantages of C over the m consecutive pairs of hybrids sum to at least ε, there exists some i such that the circuit C can distinguish between Hi and Hi+1 with advantage at least ε / m. Note that since m is polynomial in l, ε / m is also inversely polynomial in l and is still a non-negligible advantage.
Now, assume we are given l+1 bits that are either the output of Gl or drawn from the uniform distribution. Let's reuse the approach of building larger pseudorandom generators out of instances of Gl and construct a string of m−i−1 pseudorandom bits in the same way G'l was constructed above, using the first l given bits as the seed. Then, let's create a string consisting of i bits drawn from uniform, concatenated with the last one of the given bits, followed by the created m−i−1 bits. The resulting output is distributed as either Hi or Hi+1, since the (i+1)-th bit is either drawn from Gl or from uniform. Since by assumption we can distinguish between Hi and Hi+1 with non-negligible advantage, we can distinguish between Ul+1 and Gl, which implies that Gl is not a pseudorandom generator, a contradiction to the hypothesis. Q.E.D.
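The reduction in this last step can be sketched in the same style, reusing the stretch helper from the earlier sketch; here w is the (l+1)-bit challenge, either Gl(s) for a uniform seed s or a uniform string, and g again stands for Gl:

import random

def hybrid_sample(w, i, m, g):
    # Build a sample of Hi (if w came from Gl) or Hi+1 (if w is uniform).
    prefix = [random.randint(0, 1) for _ in range(i)]   # i truly uniform bits
    rest = stretch(g, list(w[:-1]), m - i - 1)          # continue the construction from the first l given bits
    return prefix + [w[-1]] + rest                      # the (i+1)-th bit is the challenge's last bit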
Now let's show that the circuit for distinguishing between Gl and Ul+1, if it exists, does not have to toss random coins. As shown above, if there exists a circuit C for distinguishing between G'l and Um, where m = poly(l), then there exists a circuit C' for distinguishing between Gl and Ul+1 that uses i random bits. For this circuit C':
| Probu,s [C'(u1,...,ui,Gl(s)) = 1] − Probu,y [C'(u1,...,ui,y) = 1] | ≥ ε / m,
where u is a string of i uniformly random bits, s is a string of l uniformly random bits, and y is a string of l+1 uniformly random bits.
Then,
Eu [ | Probs [C'(u1,...,ui,Gl(s)) = 1] − Proby [C'(u1,...,ui,y) = 1] | ] ≥ ε / m,
which means, by an averaging argument, that there exists some fixed string u of i bits that can be used as "advice" to the circuit C' for distinguishing between Gl and Ul+1.
Existence of pseudorandom generators
The existence of pseudorandom generators is related to the existence of one-way functions and hard-core predicates. Formally,
pseudorandom generators exist if and only if one-way functions exist, or
PRG ↔ OWF
One-way functions
Intuitively, one-way functions are functions that are easy to compute and hard to invert. In other words, the complexity (or circuit size) of the function is much smaller than that of its inverse. Formally: a function ƒ: {0,1}n → {0,1}n is (S,ε) one-way if for any circuit C of size ≤ S,
Prob[ƒ(C(ƒ(x))) = ƒ(x)] ≤ ε.
Moreover, ƒ is a one-way function if
- ƒ can be computed in polynomial time
- ƒ is (poly(n), 1/poly(n)) one-way
Hard-core predicate
A function B: {0,1}n → {0,1} is a hard-core predicate for a function ƒ if
- B can be computed in polynomial time
- for any polynomial size circuit C and any non-negligible ε = 1/poly(n), Probx~U [C(ƒ(x)) = B(x)] ≤ 1/2+ε
In other words, it is hard to predict B(x) given ƒ(x).
PRG → OWF
Consider a pseudorandom generator Gl: {0,1}l → {0,1}2l. Let's create the following one-way function ƒ: {0,1}2l → {0,1}2l that applies Gl to the first half of its input and ignores the second half. Formally,
ƒ(x,y) = Gl(x)
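A minimal sketch of this construction, with a toy stand-in for Gl introduced only for illustration (it is not actually pseudorandom):

def toy_prg(x):
    # Stand-in for Gl: {0,1}^l -> {0,1}^2l (NOT a real pseudorandom generator).
    return x + [1 - b for b in x]

def f(xy):
    # f: {0,1}^2l -> {0,1}^2l; the first half of the input seeds the PRG, the second half is ignored.
    l = len(xy) // 2
    return toy_prg(xy[:l])

print(f([0, 1, 1, 0]))   # the output depends only on the first two input bits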
A key observation that justifies this choice is that the image of the function has size at most 2^l, which is a negligible fraction of the universe of 2l-bit strings, of size 2^2l.
To prove that ƒ is indeed a one-way function, let's construct an argument by contradiction. Assume there exists a circuit C that inverts ƒ with non-negligible probability ε:
Prob[ƒ(C(ƒ(x,y))) = ƒ(x,y)] > ε
Then we can create the following algorithm that distinguishes Gl from uniform, which contradicts the hypothesis. The algorithm takes an input z of 2l bits and computes (x,y) = C(z). If Gl(x) = z the algorithm accepts, otherwise it rejects.
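A sketch of this algorithm, where invert stands for the hypothetical inverting circuit C (its existence is exactly the assumption being contradicted):

def distinguish(z, invert, prg):
    # Accept the 2l-bit string z iff it is reproduced as an output of the PRG.
    xy = invert(z)          # candidate preimage (x, y) proposed by the inverter
    l = len(z) // 2
    x = xy[:l]
    return prg(x) == z      # accept iff Gl(x) = z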
Now, if z is drawn from the uniform distribution, the probability that the above algorithm accepts is ≤ 1/2^l, since the image of Gl covers at most a 1/2^l fraction of the 2l-bit strings. However, if z was drawn from the output of Gl, then the probability of acceptance is > ε by the assumption that circuit C exists. Therefore, the advantage this algorithm has in distinguishing between the uniform U and the output of Gl is > ε − 1/2^l, which is non-negligible and thus contradicts our assumption of Gl being a pseudorandom generator. Q.E.D.
OWF → PRG
For this case we prove a weaker version of the theorem:
One-way permutation → pseudorandom generator
A one-way permutation is a one-way function that is also a permutation, i.e. a bijection from {0,1}n to itself. A pseudorandom generator can be constructed from a one-way permutation ƒ as follows:
Gl: {0,1}l → {0,1}l+1 with Gl(x) = ƒ(x).B(x), where B is a hard-core predicate of ƒ and "." is the concatenation operator. Note that, by the theorem proven above, it suffices to show the existence of a generator that adds just one pseudorandom bit.
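A minimal sketch of how the generator is assembled; owp and hardcore are toy placeholders (a cyclic rotation is a permutation but certainly not one-way, and parity is not a hard-core predicate for it), included only to show the wiring:

def owp(x):
    # Toy permutation of {0,1}^l: cyclic rotation (NOT one-way).
    return x[1:] + x[:1]

def hardcore(x):
    # Toy predicate: parity of the seed (NOT hard-core for the rotation above).
    return sum(x) % 2

def prg(x):
    # Gl: {0,1}^l -> {0,1}^(l+1), appending the hard-core bit to f(x).
    return owp(x) + [hardcore(x)]

print(prg([1, 0, 1, 1]))   # 5 bits from a 4-bit seed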
Hard-core predicate → PRG
First, let's show that if B is a hard-core predicate for ƒ then Gl is indeed pseudorandom. Again, we'll use an argument by contradiction.
Assume that Gl is not a pseudorandom generator; that is, there exists a circuit C of polynomial size that distinguishes Gl(x) = ƒ(x).B(x) from Ul+1 with advantage ≥ ε, where ε is non-negligible. Note that since ƒ is a permutation, if x is drawn from the uniform distribution then so is ƒ(x). Therefore, Ul+1 is equivalent to ƒ(x).b, where b is a bit drawn independently from the uniform distribution. Formally,
Probx~U [C(Gl(x))=1] − Probx~U,b~U [C(ƒ(x).b)=1] ≥ ε
Let's construct the following algorithm C':
1. Given z = ƒ(x), guess a bit b uniformly at random
2. Run C on z.b
3. IF C(z.b) = 1
4. output b
5. ELSE
6. output 1 − b
Given the output of ƒ, the algorithm first guesses the bit b by tossing a random coin, i.e. Prob[b=0] = Prob[b=1] = 0.5. Then the distinguisher C is run on ƒ(x).b; if the result is 1 then b is output, otherwise the complement 1 − b is returned.
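The same algorithm as a small Python sketch; distinguisher stands for the assumed circuit C, and bit strings are represented as lists of 0/1:

import random

def predictor(z, distinguisher):
    # Given z = f(x), guess the hard-core bit B(x).
    b = random.randint(0, 1)           # toss a random coin for b
    if distinguisher(z + [b]) == 1:    # if C accepts z.b, keep the guess
        return b
    return 1 - b                       # otherwise output the complement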
Then the probability of C' guessing B(x) correctly is:
Probx~U [C'(z)=B(x)] =
Prob[b=B(x) ∧ C(z.b)=1] +
Prob[b≠B(x) ∧ C(z.b)=0] =
Prob[b=B(x)]⋅Prob[C(z.b)=1 | b=B(x)] +
Prob[b≠B(x)]⋅Prob[C(z.b)=0 | b≠B(x)] =
1/2⋅Prob[C(z.b)=1 | b=B(x)] +
1/2⋅Prob[C(z.b)=0 | b≠B(x)] =
(1−1/2)⋅Prob[C(z.b)=1 | b=B(x)] +
1/2⋅(1−Prob[C(z.b)=1 | b≠B(x)]) =
1/2+Probz.b~G(x) [C(z.b)=1] −
1/2⋅(Prob[C(z.b)=1 | b=B(x)]+Prob[C(z.b)=1 | b≠B(x)]) =
1/2+Probz.b~G(x) [C(z.b)=1] −
Probz.b~U [C(z.b)=1] ≥ 1/2+ε
This implies that the circuit C' can predict B(x) with probability at least 1/2 + ε, which means that B cannot be a hard-core predicate for ƒ, and the hypothesis is contradicted. Q.E.D.
OWP → hard-core predicate
The outline of the proof is as follows:
If ƒ: {0,1}n → {0,1}n is a one-way permutation, then so is ƒ': {0,1}2n → {0,1}2n, where ƒ'(x,y) = ƒ(x).y by definition. Then B(x,y) = x⋅y is a hard-core predicate for ƒ', where ⋅ denotes the inner product of the bit vectors modulo 2.
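A minimal sketch of these two objects; owp stands for any permutation of the bit strings and is a parameter here, not an implementation of an actual one-way permutation:

def f_prime(x, y, owp):
    # f': {0,1}^2n -> {0,1}^2n, applying the permutation to x and copying y.
    return owp(x) + y

def B(x, y):
    # Hard-core predicate candidate: inner product of x and y modulo 2.
    return sum(xi * yi for xi, yi in zip(x, y)) % 2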
To prove that B is indeed hard-core, let's assume otherwise and show a contradiction with the hypothesis of ƒ being one-way. If B is not a hard-core predicate, then there exists a circuit C that predicts it, so
Probx,y[C(ƒ(x),y) = x⋅y] ≥ 1/2 + ε.
This fact can be used to recover x by cleverly choosing the values of y so as to isolate bits of x. In fact, for a constant fraction of the x's, there exists a polynomial time algorithm that outputs a list of O(1/ε^2) candidates that includes all valid x. Thus, an algorithm can invert ƒ(x) in polynomial time for a non-negligible fraction of x, which contradicts the hypothesis.
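As a warm-up only (not the listing algorithm referred to above): if the predictor were always correct, x could be read off directly by querying it on the unit vectors, since x⋅ei = xi. The actual Goldreich-Levin argument must instead handle a predictor that is right only with probability 1/2 + ε, using majority votes over carefully correlated queries.

def recover_x(fx, n, predict):
    # Recover x from fx = f(x), assuming a perfect predictor for x.y mod 2.
    x = []
    for i in range(n):
        e_i = [1 if j == i else 0 for j in range(n)]   # the i-th unit vector
        x.append(predict(fx, e_i))                     # equals x_i when the predictor is always right
    return x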