Pollard's rho algorithm
Encyclopedia
Pollard's rho algorithm is a special-purpose integer factorization
algorithm
. It was invented by John Pollard
in 1975. It is particularly effective at splitting composite numbers with small factors.
and on the observation that (as in the birthday problem) two numbers x and y are congruent modulo p with probability 0.5 after numbers have been randomly chosen. If p is a factor of n, the integer we are aiming to factor, then since p divides both and .
The rho algorithm therefore uses a function modulo n as a generator of a pseudo-random sequence. It runs one sequence twice as "fast" as the other; i.e. for every iteration made by one copy of the sequence, the other copy makes two iterations. Let x be the current state of one sequence and y be the current state of the other. The GCD
of |x − y| and n is taken at each step. If this GCD ever comes to n, then the algorithm terminates with failure, since this means x = y and therefore, by Floyd's cycle-finding algorithm, the sequence has cycled and continuing any further would only be repeating previous work.
Output: a non-trivial factor of n, or failure.
Note that this algorithm may not find the factors and will return failure for composite n. In that case, use a different f(x) and try again.
Note, as well, that this algorithm does not work when n is a prime number, since, in this case, d will be always 1.
published a faster variant of the rho algorithm. He used the same core ideas as Pollard but a different method of cycle detection, replacing Floyd's cycle-finding algorithm
with the related Brent's cycle finding method.
A further improvement was made by Pollard and Brent. They observed that if , then also for any positive integer b. In particular, instead of computing at every step, it suffices to define z as the product of 100 consecutive terms modulo n, and then compute a single . A major speed up results as 100 gcd steps are replaced with 99 multiplications modulo n and a single gcd. Occasionally it may cause the algorithm to fail by introducing a repeated factor, for instance when n is a square. But it then suffices to go back to the previous gcd term, where , and use the regular Rho algorithm from there.
of the same size (10023859281455311421), the same workstation using the original rho algorithm took 109 milliseconds to find a factor; the Richard Brent variant took 31 milliseconds.
For f, we choose a polynomial with integer coefficients. The most common ones are of the form:
The rho algorithm's most remarkable success has been the factorization of the eighth Fermat number by Pollard and Brent. They used Brent's variant of the algorithm, which found a previously unknown prime factor. The complete factorization of F8 took, in total, 2 hours on a UNIVAC
1100/42
.
97 is a non-trivial factor of 8051. Other values of c may give the cofactor (83) instead of 97.
If n is a product of two distinct primes of equal length, running the algorithm for O
(n1/4 polylog(n)) steps yields a factor with probability roughly half. (Note that this is a heuristic claim, and rigorous analysis of the algorithm remains open.)
It gets the ρ name because the iteration of the elements looks like a circle with a tail when sketched.
Integer factorization
In number theory, integer factorization or prime factorization is the decomposition of a composite number into smaller non-trivial divisors, which when multiplied together equal the original integer....
algorithm
Algorithm
In mathematics and computer science, an algorithm is an effective method expressed as a finite list of well-defined instructions for calculating a function. Algorithms are used for calculation, data processing, and automated reasoning...
. It was invented by John Pollard
John Pollard (mathematician)
John M. Pollard is a British mathematician who has invented algorithms for the factorization of large numbers and for the calculation of discrete logarithms....
in 1975. It is particularly effective at splitting composite numbers with small factors.
Core ideas
The rho algorithm is based on Floyd's cycle-finding algorithmFloyd's cycle-finding algorithm
In computer science, cycle detection is the algorithmic problem of finding a cycle in a sequence of iterated function values.For any function ƒ that maps a finite set S to itself, and any initial value x0 in S, the sequence of iterated function values x_0,\ x_1=f,\ x_2=f,\ \dots,\ x_i=f,\ \dotsmust...
and on the observation that (as in the birthday problem) two numbers x and y are congruent modulo p with probability 0.5 after numbers have been randomly chosen. If p is a factor of n, the integer we are aiming to factor, then since p divides both and .
The rho algorithm therefore uses a function modulo n as a generator of a pseudo-random sequence. It runs one sequence twice as "fast" as the other; i.e. for every iteration made by one copy of the sequence, the other copy makes two iterations. Let x be the current state of one sequence and y be the current state of the other. The GCD
Greatest common divisor
In mathematics, the greatest common divisor , also known as the greatest common factor , or highest common factor , of two or more non-zero integers, is the largest positive integer that divides the numbers without a remainder.For example, the GCD of 8 and 12 is 4.This notion can be extended to...
of |x − y| and n is taken at each step. If this GCD ever comes to n, then the algorithm terminates with failure, since this means x = y and therefore, by Floyd's cycle-finding algorithm, the sequence has cycled and continuing any further would only be repeating previous work.
Algorithm
Inputs: n, the integer to be factored; and f(x), a pseudo-random function modulo nOutput: a non-trivial factor of n, or failure.
- x ← 2, y ← 2; d ← 1
- While d = 1:
- x ← f(x)
- y ← f(f(y))
- d ← GCD(|x − y|, n)
- If d = n, return failure.
- Else, return d.
Note that this algorithm may not find the factors and will return failure for composite n. In that case, use a different f(x) and try again.
Note, as well, that this algorithm does not work when n is a prime number, since, in this case, d will be always 1.
Variants
In 1980, Richard BrentRichard Brent (scientist)
Richard Peirce Brent is an Australian mathematician and computer scientist, born in 1946. He holds the position of Distinguished Professor of Mathematics and Computer Science with a joint appointment in the Mathematical Sciences Institute and the College of Engineering and Computer Science at...
published a faster variant of the rho algorithm. He used the same core ideas as Pollard but a different method of cycle detection, replacing Floyd's cycle-finding algorithm
Floyd's cycle-finding algorithm
In computer science, cycle detection is the algorithmic problem of finding a cycle in a sequence of iterated function values.For any function ƒ that maps a finite set S to itself, and any initial value x0 in S, the sequence of iterated function values x_0,\ x_1=f,\ x_2=f,\ \dots,\ x_i=f,\ \dotsmust...
with the related Brent's cycle finding method.
A further improvement was made by Pollard and Brent. They observed that if , then also for any positive integer b. In particular, instead of computing at every step, it suffices to define z as the product of 100 consecutive terms modulo n, and then compute a single . A major speed up results as 100 gcd steps are replaced with 99 multiplications modulo n and a single gcd. Occasionally it may cause the algorithm to fail by introducing a repeated factor, for instance when n is a square. But it then suffices to go back to the previous gcd term, where , and use the regular Rho algorithm from there.
Application
The algorithm is very fast for numbers with small factors. For example, on a 3 GHz workstation, the original rho algorithm found the factor 274177 of the sixth Fermat number (18446744073709551617) in 26 milliseconds; the Richard Brent variant found the same factor in 5 milliseconds. However, for a semiprimeSemiprime
In mathematics, a semiprime is a natural number that is the product of two prime numbers. The first few semiprimes are 4, 6, 9, 10, 14, 15, 21, 22, 25, 26, ... ....
of the same size (10023859281455311421), the same workstation using the original rho algorithm took 109 milliseconds to find a factor; the Richard Brent variant took 31 milliseconds.
For f, we choose a polynomial with integer coefficients. The most common ones are of the form:
The rho algorithm's most remarkable success has been the factorization of the eighth Fermat number by Pollard and Brent. They used Brent's variant of the algorithm, which found a previously unknown prime factor. The complete factorization of F8 took, in total, 2 hours on a UNIVAC
UNIVAC
UNIVAC is the name of a business unit and division of the Remington Rand company formed by the 1950 purchase of the Eckert-Mauchly Computer Corporation, founded four years earlier by ENIAC inventors J. Presper Eckert and John Mauchly, and the associated line of computers which continues to this day...
1100/42
UNIVAC 1110
The UNIVAC 1110 was the fourth member of Sperry Rand's UNIVAC 1100 series of computers, introduced in 1972.The UNIVAC 1110 had enhanced multiprocessing support: sixteen-way memory access allowed up to six CAUs and four IOAUs The UNIVAC 1110 was the fourth member of Sperry Rand's UNIVAC 1100 series...
.
Example factorization
Let n = 8051 and f(x) = (x2 + 1 ) mod 8051.i | xi | yi | GCD(|xi − yi|, 8051) |
---|---|---|---|
1 | 5 | 26 | 1 |
2 | 26 | 7474 | 1 |
3 | 677 | 871 | 97 |
97 is a non-trivial factor of 8051. Other values of c may give the cofactor (83) instead of 97.
Complexity
The algorithm offers a trade-off between its running time and the probability that it finds a factor.If n is a product of two distinct primes of equal length, running the algorithm for O
Big O notation
In mathematics, big O notation is used to describe the limiting behavior of a function when the argument tends towards a particular value or infinity, usually in terms of simpler functions. It is a member of a larger family of notations that is called Landau notation, Bachmann-Landau notation, or...
(n1/4 polylog(n)) steps yields a factor with probability roughly half. (Note that this is a heuristic claim, and rigorous analysis of the algorithm remains open.)
It gets the ρ name because the iteration of the elements looks like a circle with a tail when sketched.