Math: Factoring Details |
Square | Remove top bit | Optional mul by 2 | mod 47 |
1*1 = 1 | 1 0111 | 1*2 = 2 | 2 |
2*2 = 4 | 0 111 | no | 4 |
4*4 = 16 | 1 11 | 16*2 = 32 | 32 |
32*32 = 1024 | 1 1 | 1024*2 = 2048 | 27 |
27*27 = 729 | 1 | 729*2 = 1458 | 1 |
Thus, 223 = 1 mod 47. Subtract 1 from both sides. 223-1 = 0 mod 47. Since we've shown that 47 is a factor, 223-1 is not prime.
One very nice property of Mersenne numbers is that any factor q of 2P-1 must be of the form 2kp+1. Furthermore, q must be 1 or 7 mod 8. A proof is available. Finally, an efficient program can take advantage of the fact that any potential factor q must be prime.
The GIMPS factoring code creates a modified sieve of Eratosthenes with each bit representing a potential 2kp+1 factor. The sieve then eliminates any potential factors that are divisible by prime numbers below 40,000 or so. Also, bits representing potential factors of 3 or 5 mod 8 are cleared. This process eliminates roughly 95% of potential factors. The remaining potential factors are tested using the powering algorithm above.
Now the only question remaining is how much trial factoring should be done? The answer depends on three variables: the cost of factoring, the chance of finding a factor, and the cost of a primality test. The formula used is:
factoring_cost < chance_of_finding_factor * 2 * primality_test_cost
That is, the time spent factoring must be less than the expected time saved. If a factor is found we can avoid running both the first-time and double-check primality tests.
Looking at past factoring data we see that the chance of finding a factor between 2X and 2X+1 is about 1/x. The factoring cost and primality test costs are computed by timing the program. At present, the program trial factors to these limits:
Exponents up to | Trial factored to |
3960000 | 260 |
5160000 | 261 |
6515000 | 262 |
8250000 | 263 |
13380000 | 264 |
17850000 | 265 |
21590000 | 266 |
28130000 | 267 |
35100000 | 268 |
44150000 | 269 |
57020000 | 270 |
71000000 | 271 |
79300000 | 272 |
This method when adapted to Mersenne numbers is even more effective. Remember, that the factor q is of the form 2kp+1. It is easy to modify the P-1 method such that it will find the factor q whenever k is highly composite.
The P-1 method is quite simple. First, pick a bound B1. P-1 will find the factor q as long as all factors of k are less than B1 (k is called B1-smooth). Second, compute E - the product of all primes less than B1. Third, compute x = 3E*2*P. Finally, check the GCD (x-1, 2P-1) to see if a factor was found.
There is an enhancement to Pollard's algorithm that uses a second bound B2 and will find the factor q if k has just one factor between B1 and B2 and all remaining factors are below B1. GIMPS has used this method to find some impressive factors. For example:
22944999-1 is divisible by 314584703073057080643101377. 314584703073057080643101377 is 2 * 53409984701702289312 * 2944999 + 1. The value k, 53409984701702289312, is very smooth: 53409984701702289312 = 2^5 * 3 * 19 * 947 * 7187 * 62297 * 69061
So how does GIMPS intelligently choose B1 and B2? We use a variation of the formula used in trial factoring. We must maximize:
chance_of_finding_factor * 2 * primality_test_cost - factoring_cost
The chance of finding a factor and the factoring cost both vary with different B1 and B2 values. Dickman's function (see Knuth's Art of Computer Programming vol. 2) is used to determine the probability of finding a factor that is k is B1-smooth or B1-smooth with just one factor between B1 and B2. The program tries many values of B1 and if there is sufficient available memory several values of B2, selecting the B1 and B2 values that maximize the formula above.