Cryptosystems like Rivest–Shamir–Adleman (RSA) use large primes to construct public/private key pairs. The security of RSA relies on the fact that, in general, it is computationally expensive to identify the prime factors of a number. But if it is so hard to find prime factors, how can it be easy to find prime numbers in general? We cannot simply choose these primes from a long list of known primes. For RSA to be secure there cannot be a predictable pattern in the primes we use. Our primes must come from randomly generated numbers. We need a way to quickly decide if a number is prime.
First we will discuss the probability that a random number is prime. Then we consider ways to check if a number is prime. The obvious approach of just checking for prime factors is much too slow. We will use Fermat’s Little Theorem to quickly test if a number is prime to a very high likelihood. But there is a class of composite numbers, Carmichael numbers, that are excellent at pretending to be prime. The Miller–Rabin primality test is quite good at correctly identifying these imposters by showing that they lead to more square roots of 1 than is allowed mod n if n were prime.
How often is a random number prime?
If you pick a random number between 1 and 100 you have about a 25% chance of choosing a prime. Between 1 and 1000 only about 18% of numbers are prime. For a large number x the proportion of primes between 1 and x can be approximated by
So a random number between 1 and 1 billion has about a 5% chance of being prime. Not bad. But modern cryptosystems like RSA require choosing ridiculously large primes — about 150 digits long. If you pick a random number that is 150 digits long, you have about a 1 in 300 chance of hitting a prime. On average it will take about 180 tries to get a prime 150 digits long.
To establish a single RSA public/private key pair we have to be able to check hundreds of numbers, each at least 150 digits long, to decide if they are prime or not. For the internet to work, this task has to be completed in just seconds. This happens on almost every computer around the world.
We need a computationally efficient way to verify if a number is prime.
Searching for factors is too slow
We have a number n and we want to know if it is prime. One sure way to decide if it’s prime is to search for factors. If we don’t find any then n must be prime. How far do we have to search?.
Supposing n is not prime, let’s have p stand for the smallest prime factor of n. Ether n = p² or n has a larger prime factor q. Therefore, p² is less than or equal to n.
So, to find a factor of the number 136,373, you only need to search up to 369. If you don’t find a factor by that point, then the number must be prime. In fact 136,373 is prime.
Despite the fact that we only need to search up to the square root of a number, using this method to decide if a number is prime takes a tremendous amount of time as the number of digits increases. This is how long it takes to do it in python.
A number with k digits has magnitude around 10 to the power of k. So the algorithm runs in exponential time with respect to the number of digits.
There are better algorithms for finding prime factors but no known algorithm that works in polynomial time. So these types of algorithms are not good for deciding if a number is prime.
The Fermat Primality Test
Our task is the same. We want to decide if n it is prime. The idea of the Fermat Primality Test is to test a set of properties that all primes share but very few composite numbers have. In that way you can accumulate evidence for a number’s primality. You can stop once you have decided that n is almost certainly prime.
This test is based on Fermat’s Little Theorem (FLT) which says, if n is prime, and a is positive less than n, then:
For example, for n =7 and a = 4,
What we can do is attempt to use FLT the other way around — if n satisfies the congruence for a particular a then that makes n a probable prime. We can then check n against other values of a to gather more positive evidence or, if n fails for any value of a, it is not prime.
Let’s get a sense of how well this test works for primes under 100,000. There are 9669 numbers less than 100,000 that satisfy FLT with a = 2. Of these, 9591 are prime. So for numbers less than 100,000, there is less than 1% chance that a number satisfies FLT and is not prime.
Similarly for a = 3, there is less than 1% chance that a number less than 100,000 will satisfy FLT and still not be prime. Combining these results shows there are only 23 non-prime numbers less than 100,000 that satisfy FLT for both a=2 and a=3. That’s 99.8% chance that a number under 100,000 satisfying both conditions is prime.
Unfortunately, the Fermat test is not good enough. There are some composite numbers, although rare, that satisfy Fermat Primality Test for all values of a that they do not share factors with. They are called Carmichael numbers. The number 561 is the first example of such a number. Its prime factors are 3, 11, and 17. It will satisfy FLT for any value of a that doesn’t share any of those factors. This presents a big problem. If you stumble on a Carmichael number you will almost certainly not test enough values of a for the Fermat Primality Test to distinguish it from a prime. The Miller–Rabin Primality Test was designed to identify this class of numbers with much greater frequency.
Miller–Rabin Primality Test
Fact: If n is a prime then the only numbers that are square roots of 1 mod n are +1 or -1.
Let’s see how our Carmichael number 561 handles this criteria with a = 5.
When we take the square root,
Since 67 is not equal to 1 or -1 mod 561, we conclude that 561 is not prime. The Miller–Rabin Primality Test tries to detect extra roots like this one. Here’s how it works:
- Separate the powers of 2 from n-1. Find unique numbers k and m where m is odd.
2. Choose a random base 0 < a < n.
Then n is a probable prime and we stop here.
4. Otherwise, if
Then we keep squaring b until we find an r ≤ k-1 with
Then n is a probable prime and we stop here. On the other hand, if we don’t find such an r, then we are sure that n is not prime.
The Miller–Rabin Primality Test is harder to fool than the Fermat test. Let’s take a closer look at how n=561 fails the test with a=5. Note, 561 =2⁴ (35). Since
step 3 is not satisfied and we move to step 4. We see that none of the squares, 23², 23⁴,23⁸ equal to -1 mod 561. So 561 is composite.
There are still composite numbers are misclassified as probable primes under the Miller–Rabin Primality Test for some values of a. But there are no classes of numbers like Carmichael numbers that are misclassified as probable primes for almost all choices of a. Moreover the test can be done efficiently. It has a time complexity of
Using this algorithm we can find two 150 digit prime numbers by just checking random numbers. It takes about a second.