### Asymptote Considered Dangerous

```
When N is in a practical range , we should of
course be careful not to take such asymptotic
estimates too seriously. - Prof. Donald E. Knuth

In his well-regarded and popular text ,
"Prime Numbers and Computer Methods for Factorization" ,
Hans Riesel includes a section titled
"The Number of Prime Factors of Large Numbers" ,
where /Note 1/ , citing /Note 2/ , he states some work from Kac ,
Hardy and Ramanujan , which is the focus of this article :
[ all emphasis original ]
the number of prime factors of
almost all integers about N is between
-------------------
(1-eps)ln ln N and (1+eps) ln ln N,
for every eps > 0.  And
almost all integers means
-------------------
a fraction of all integers
as close to 1 as we wish.
------------------------

In the vicinity where  loge loge N  is about 1 , we have
14 = 2 * 7
15 = 3 * 5
16 = 2^4
This does not seem to hold a high density of numbers in the range
claimed by the statement , i.e. , with exactly 1 prime factor .

So , let's look where  loge loge N  is about 2 .
omega  OMEGA                  N
3      4               1602  =  2 * 3^2 * 89
2      2               1603  =  7 * 229
2      3               1604  =  2^2 * 401
3      3               1605  =  3 * 5 * 107
3      3               1606  =  2 * 11 * 73
1      1               1607  =  1607
3      5               1608  =  2^3 * 3 * 67
1      1               1609  =  1609
4      4               1610  =  2 * 5 * 7 * 23
2      3               1611  =  3^2 * 179
3      4               1612  =  2^2 * 13 * 31
1      1               1613  =  1613
3      3               1614  =  2 * 3 * 269
3      3               1615  =  5 * 17 * 19
2      5               1616  =  2^4 * 101
3      4               1617  =  3 * 7^2 * 11
2      2               1618  =  2 * 809
1      1               1619  =  1619
3      7               1620  =  2^2 * 3^4 * 5
1      1               1621  =  1621
2      2               1622  =  2 * 811
2      2               1623  =  3 * 541
3      5               1624  =  2^3 * 7 * 29
2      4               1625  =  5^3 * 13
3      3               1626  =  2 * 3 * 271
1      1               1627  =  1627
3      4               1628  =  2^2 * 11 * 37
2      3               1629  =  3^2 * 181
3      3               1630  =  2 * 5 * 163
2      2               1631  =  7 * 233
3      7               1632  =  2^5 * 3 * 17
2      2               1633  =  23 * 71
3      3               1634  =  2 * 19 * 43
Here only 11 of the 33 listed have exactly 2 factors .
Hardly "almost all" .

So , let's look where  loge loge N  is about 3 .
omega  OMEGA                  N
6      6          528491310  =  2 * 3 * 5 * 67 * 241 * 1091
3      3          528491311  =  73 * 691 * 10477
5      8          528491312  =  2^4 * 41 * 47 * 61 * 281
4      8          528491313  =  3^2 * 7^4 * 37 * 661
Here only 1 of the 4 listed has exactly 3 factors , but let's look at a
larger sample .
For the 2000 integers nearest  e ^ e^3 , here is the distribution of
omegas :
omega             count
1                 113
2                 387
3                 661
4                 536
5                 248
6                  49
7                   6
The central tendency is clear ,
but 661 out of 2000 is hardly "almost all" .

So , let's look where  loge loge N  is about 3.5 .
omega  OMEGA                  N
2      2    240911788282549  =  284423 * 847019363
6      7    240911788282550  =  2 * 5^2 * 7 * 11 * 114473 * 546631
4      5    240911788282551  =  3^2 * 449 * 617 * 96623783
4      6    240911788282552  =  2^3 * 47 * 139 * 4609516843
None of these 4 listed have nearly 3.5 factors .
For the 8000 integers nearest  e ^ e^3.5 , here is the distribution of
omegas :
omega             count
1                 243
2                1095
3                2142
4                2281
5                1442
6                 597
7                 167
8                  30
9                   3

An extreme case is where  loge loge N  is equal to an integer plus .5 .
This is the case where the interval about the normal value determined by
eps  must be large enough to overlap the two adjacent integers , for
there to be even a theoretical possibility for the statement to be true .
How large must  N  then be ?

If  frac loge loge N  =  .5 ,
then  1  <  ( 1+eps ) loge loge N  -  ( 1-eps ) loge loge N
and  1  <  2 * eps * loge loge N
and  loge loge N  >  .5 / eps .

For  eps = .01 , loge loge N  >  50  and
N  >  e ^ 5184705528587072464087.45  ~=  10 ^ 2251689001358648043629.43

For any reasonable  eps , for the statement to be true ,  N  must be
enormous .
Let alone "for every eps > 0" .
For numbers small enough to be written down in a book or stored in a
computer , the statement is not true .

Asymptotic analysis is very useful because the asymptotic behavior of a
function can provide penetrating insight into the behavior of anything
which it describes .
There is a realm within which the statement is true and
there is another realm within which the statement is not true .
The former realm is large , but hopelessly impractical .
The latter realm is small , but practical .
When we consider Euclid's theorem /Note 3/ ,
"There are an infinite number of primes" , we do not imagine that it
applies to any , necessarily finite , list we might construct .

One of the major applications of asymptotic analysis is in evaluating
the feasibility or scaling of algorithms .

Factorization , a major focus of Riesel's text /Note 1/ , makes an
illustrative case study .
For a large N , about which nothing is known of its factors , the first
line of attack is with trial division .
To find if small factors exist , trial division is highly efficient ,
far faster than any other known method .
But asymptotically , trial division is much slower than other known
algorithms .
Thus , asymptotic analysis suggests that trial division should be
limited .
Pollard's rho and p-1 may also be useful to find any somewhat larger ,
but still relatively small , factors .
The asymptotically faster ECM is a present-day workhorse to find still
larger factors .
Still faster , asymptotically , than ECM is QS , also widely used on
larger numbers .
If N is not too large , e.g. , less than about 200 digits , the
asymptotically fastest practical method known , NFS , may produce a
complete factorization , as is the case with QS , even if both or all
three factors are near  N ^ .5 or N ^ 1/3 , and may do so faster than
ECM or QS .
For very large  N , exhausting multiple iterations of ECM , in the hope
of finding relatively small factors up to about 70 digits , may be the
only practical approach .
Various strategies have been proposed to optimize switching between
these methods and deciding when to test for primality of the remaining
co-factor .
Asymptotic analysis powerfully informs these strategies , but provides
no insight into the relative usefulness of the various algorithms .

Analysis will be informed by asymptotes , but will not be captive to
them .
In practical realms , sometimes they may safely be ignored altogether .

---

I thank Prof. Edsger W. Dijkstra for inspiring the title of this
article /Note 4/.

Herein omega is a substitute for the lower-case Greek letter .
OMEGA is a substitute for the upper-case Greek letter .
eps is a substitute for the lower-case Greek letter , epsilon .

/1/ "Prime Numbers and Computer Methods for Factorization" ,
by Hans Riesel , 2e , Birkhauser , 1994 , pp 157 , 159 .
/2/ "Statistical Independence in Probability Analysis and Number
Theory" , by Mark Kac , 1959 , pp 71-74 , credits Hardy and Ramanujan
with the original statement and its characterization .
Erd"os and Kac proved a related theorem , stated on p. 75 .
/3/ "Elements" , by Euclid , c. 300 B.C.E. , Book IX , Proposition 20 .
/4/ "Go-To Statement Considered Harmful" , by E. W. Dijkstra , Letter to
the Editor , "Communications of the ACM" , 1968 March .
/5/ "Seminumerical Algorithms" , by D. E. Knuth , 3e , Addison-Wesley ,
1998 , p. 399 .

Walter Nissen
2007-10-21
```