Asymptote Considered Dangerous
When N is in a practical range , we should of
course be careful not to take such asymptotic
estimates too seriously. - Prof. Donald E. Knuth
In his well-regarded and popular text ,
"Prime Numbers and Computer Methods for Factorization" ,
Hans Riesel includes a section titled
"The Number of Prime Factors of Large Numbers" ,
where /Note 1/ , citing /Note 2/ , he states some work from Kac ,
Hardy and Ramanujan , which is the focus of this article :
[ all emphasis original ]
the number of prime factors of
almost all integers about N is between
-------------------
(1-eps)ln ln N and (1+eps) ln ln N,
for every eps > 0. And
almost all integers means
-------------------
a fraction of all integers
as close to 1 as we wish.
------------------------
In the vicinity where loge loge N is about 1 , we have
14 = 2 * 7
15 = 3 * 5
16 = 2^4
This does not seem to hold a high density of numbers in the range
claimed by the statement , i.e. , with exactly 1 prime factor .
So , let's look where loge loge N is about 2 .
omega OMEGA N
3 4 1602 = 2 * 3^2 * 89
2 2 1603 = 7 * 229
2 3 1604 = 2^2 * 401
3 3 1605 = 3 * 5 * 107
3 3 1606 = 2 * 11 * 73
1 1 1607 = 1607
3 5 1608 = 2^3 * 3 * 67
1 1 1609 = 1609
4 4 1610 = 2 * 5 * 7 * 23
2 3 1611 = 3^2 * 179
3 4 1612 = 2^2 * 13 * 31
1 1 1613 = 1613
3 3 1614 = 2 * 3 * 269
3 3 1615 = 5 * 17 * 19
2 5 1616 = 2^4 * 101
3 4 1617 = 3 * 7^2 * 11
2 2 1618 = 2 * 809
1 1 1619 = 1619
3 7 1620 = 2^2 * 3^4 * 5
1 1 1621 = 1621
2 2 1622 = 2 * 811
2 2 1623 = 3 * 541
3 5 1624 = 2^3 * 7 * 29
2 4 1625 = 5^3 * 13
3 3 1626 = 2 * 3 * 271
1 1 1627 = 1627
3 4 1628 = 2^2 * 11 * 37
2 3 1629 = 3^2 * 181
3 3 1630 = 2 * 5 * 163
2 2 1631 = 7 * 233
3 7 1632 = 2^5 * 3 * 17
2 2 1633 = 23 * 71
3 3 1634 = 2 * 19 * 43
Here only 11 of the 33 listed have exactly 2 factors .
Hardly "almost all" .
So , let's look where loge loge N is about 3 .
omega OMEGA N
6 6 528491310 = 2 * 3 * 5 * 67 * 241 * 1091
3 3 528491311 = 73 * 691 * 10477
5 8 528491312 = 2^4 * 41 * 47 * 61 * 281
4 8 528491313 = 3^2 * 7^4 * 37 * 661
Here only 1 of the 4 listed has exactly 3 factors , but let's look at a
larger sample .
For the 2000 integers nearest e ^ e^3 , here is the distribution of
omegas :
omega count
1 113
2 387
3 661
4 536
5 248
6 49
7 6
The central tendency is clear ,
but 661 out of 2000 is hardly "almost all" .
So , let's look where loge loge N is about 3.5 .
omega OMEGA N
2 2 240911788282549 = 284423 * 847019363
6 7 240911788282550 = 2 * 5^2 * 7 * 11 * 114473 * 546631
4 5 240911788282551 = 3^2 * 449 * 617 * 96623783
4 6 240911788282552 = 2^3 * 47 * 139 * 4609516843
None of these 4 listed have nearly 3.5 factors .
For the 8000 integers nearest e ^ e^3.5 , here is the distribution of
omegas :
omega count
1 243
2 1095
3 2142
4 2281
5 1442
6 597
7 167
8 30
9 3
An extreme case is where loge loge N is equal to an integer plus .5 .
This is the case where the interval about the normal value determined by
eps must be large enough to overlap the two adjacent integers , for
there to be even a theoretical possibility for the statement to be true .
How large must N then be ?
If frac loge loge N = .5 ,
then 1 < ( 1+eps ) loge loge N - ( 1-eps ) loge loge N
and 1 < 2 * eps * loge loge N
and loge loge N > .5 / eps .
For eps = .01 , loge loge N > 50 and
N > e ^ 5184705528587072464087.45 ~= 10 ^ 2251689001358648043629.43
For any reasonable eps , for the statement to be true , N must be
enormous .
Let alone "for every eps > 0" .
For numbers small enough to be written down in a book or stored in a
computer , the statement is not true .
Asymptotic analysis is very useful because the asymptotic behavior of a
function can provide penetrating insight into the behavior of anything
which it describes .
There is a realm within which the statement is true and
there is another realm within which the statement is not true .
The former realm is large , but hopelessly impractical .
The latter realm is small , but practical .
When we consider Euclid's theorem /Note 3/ ,
"There are an infinite number of primes" , we do not imagine that it
applies to any , necessarily finite , list we might construct .
One of the major applications of asymptotic analysis is in evaluating
the feasibility or scaling of algorithms .
Factorization , a major focus of Riesel's text /Note 1/ , makes an
illustrative case study .
For a large N , about which nothing is known of its factors , the first
line of attack is with trial division .
To find if small factors exist , trial division is highly efficient ,
far faster than any other known method .
But asymptotically , trial division is much slower than other known
algorithms .
Thus , asymptotic analysis suggests that trial division should be
limited .
Pollard's rho and p-1 may also be useful to find any somewhat larger ,
but still relatively small , factors .
The asymptotically faster ECM is a present-day workhorse to find still
larger factors .
Still faster , asymptotically , than ECM is QS , also widely used on
larger numbers .
If N is not too large , e.g. , less than about 200 digits , the
asymptotically fastest practical method known , NFS , may produce a
complete factorization , as is the case with QS , even if both or all
three factors are near N ^ .5 or N ^ 1/3 , and may do so faster than
ECM or QS .
For very large N , exhausting multiple iterations of ECM , in the hope
of finding relatively small factors up to about 70 digits , may be the
only practical approach .
Various strategies have been proposed to optimize switching between
these methods and deciding when to test for primality of the remaining
co-factor .
Asymptotic analysis powerfully informs these strategies , but provides
no insight into the relative usefulness of the various algorithms .
Analysis will be informed by asymptotes , but will not be captive to
them .
In practical realms , sometimes they may safely be ignored altogether .
---
I thank Prof. Edsger W. Dijkstra for inspiring the title of this
article /Note 4/.
Herein omega is a substitute for the lower-case Greek letter .
OMEGA is a substitute for the upper-case Greek letter .
eps is a substitute for the lower-case Greek letter , epsilon .
/1/ "Prime Numbers and Computer Methods for Factorization" ,
by Hans Riesel , 2e , Birkhauser , 1994 , pp 157 , 159 .
/2/ "Statistical Independence in Probability Analysis and Number
Theory" , by Mark Kac , 1959 , pp 71-74 , credits Hardy and Ramanujan
with the original statement and its characterization .
Erd"os and Kac proved a related theorem , stated on p. 75 .
/3/ "Elements" , by Euclid , c. 300 B.C.E. , Book IX , Proposition 20 .
/4/ "Go-To Statement Considered Harmful" , by E. W. Dijkstra , Letter to
the Editor , "Communications of the ACM" , 1968 March .
/5/ "Seminumerical Algorithms" , by D. E. Knuth , 3e , Addison-Wesley ,
1998 , p. 399 .
Walter Nissen
2007-10-21