Skip to main content

On infinite decimal expansions, missing numbers, and generating functions

(This post is a cleaned up and expanded version of this thread.)


A cool fact I've seen shared around the internet a few times: The decimal expansion of 1/998001 starts with

1998001=0.000001002003996997999

That is, it begins with three-digit strings from 000 to 999, in order, except that it skips 998 for some reason.

The first thing to observe is that 998001=9992.

Recall the formula for the infinite geometric series:

n=0rn=11r.

If we differentiate both sides with respect to r, we get

n=1nrn1=1(1r)2,

and multiplying by r gives

n=1nrn=r(1r)2.

(This can also be obtained by some series manipulations.)

Now, take r=0.001. We have

0.001(10.001)2=1000998001=0.001+0.000002+0.000000003+

From here, the appearance of the numbers from 001 to 997, at least, is clear. What changes at 998?  To see this, we look ahead a couple of places, zooming into this part of the sum. We will write it vertically for visualization purposes.

0.001++0.000997+0.000000998+0.000000000999+0.000000000001000+0.001997999000

As can be seen above, the 1000 term spills over into the 999 term, so when they are added, the 999 becomes 1000, and a 1 is in turn carried over to the 998, yielding 999. Nothing else before is affected. This explains the missing 998.

Of course, the above is the expansion of 1000/998001, which isn't quite our original fraction, and the decimal expansion is missing the 000 string at the beginning. To get the fraction in our original claim, 1/998001, we move three decimal places, which indeed accounts for the missing 000.

More generally, the decimal expansion of 1/(10d1)2 encodes all d-digit strings from 000 to 999 except for 998.


Let's look at our old friend 1/7=0.142857 again. Notice that 28 is two times 14, and 57 is almost two times 28=56 but overshoots by one. Normally, this could be chalked up to coincidence, and for many years, I thought it was. But, it turns out there's a nice explanation! Note that

17=0.1410.02=0.14+0.0028+0.000056+0.00000112+

The first two terms of the sum explain why 14 and 28 show up. And similarly to a while ago, the reason we have 57 and not 56 is because the next term, corresponding to 112, spills over.


One of my favorite variants of the above, if a bit more niche: Notice that

189=0.0112359...19899=0.000101020305081321345590...1998999=0.000001001002003005008013021034055089144233377610988

That is, these fractions seem to encode the notorious Fibonacci sequence {Fn}n=1=F1,F2,, a sequence of integers defined by F1=F2=1,Fn+1=Fn+Fn1. That is, each term is the sum of the previous two. The first few terms are

1,1,2,3,5,8,13,21,34,55,


Before we go too deeply into that, I'll relate a math problem that I was surprised turned up in a local high school contest a couple of years ago: What is the value of the infinite sum

12+14+28+316+532+?

It's pretty reasonable to suppose that the sum converges in the first place. Let S denote the value of this sum. The relation given above seems to be a clue: you somehow want to line up the Fn and Fn1 terms. To do this, we divide everything by 2, shifting the Fns to the right:

S2=14+18+216+332+

Combining the two gives

S+S2=12+(14+14)+(28+18)+(316+216)+=12+24+38+516+

This time, it looks almost like the first sequence, except that the Fns have been shifted to the left; another way of viewing this is that the denominators have been divided by 2, or the original sum has been doubled. Well, almost. We have, to compare,

2S=1+12+24+38+516+

Notice that these expressions differ by a term of 1. That is,

S+S2=2S1.

We can now solve for S; we have S=2, and so

n=1Fn2n=12+14+28+316+532+=2.

More generally, we can consider the sum

S=n=1Fnxn.

Repeating the process above, but replacing 1/2 with x, we have that, assuming the convergence of all relevant series, xS+S=S/x1, which rearranges to S=x/(1xx2). That is,

x1xx2=n=1Fnxn.

More generally, suppose we have a sequence of numbers {an}n=1=a1,a2,a3. The function

f(x)=n=1anxn=a1x+a2x2+

is said to be the generating function of the sequence {an}n=1.

The above tells us that the function x/(1xx2) is the generating function of the Fibonacci sequence. Our previous discussion tells us that x/(1x)2 is the generating function of the positive integers. Note that nothing stops us from starting at n=0 instead of n=1, or whichever index, so in this light we can view 1/(1x) as the generating function of the constant sequence {an}n=0 given by an=1 for all nonnegative integers n.


We can now go back to the matter of the decimal expansions. What the above says is that the fractions

1102d10d1

encode all but the last Fibonacci number with at most d decimal digits. And this can be seen by taking x=10d and plugging it into our generating function; we get that

10d102d10d1=n=1Fn10nd=0.001001002

By a similar observation as in above, the smallest Fibonacci number with d+1 digits spills over into the largest Fibonacci number with at most d digits, explaining what is going on. And as before, the missing 000 string is accounted for by multiplying by a factor of 10d.

There is a minor detail. What if there is more than one overflow? Say, the term corresponding to the first (d+1)-digit Fibonacci number overflows into the term corresponding to the largest d-digit Fibonacci number, which then becomes a (d+1) digit number and overflows into the term before it. This would happen exactly when the largest d-digit Fibonacci number is 10d1. Fortunately, a paper of Bugeaud, Luca, Mignotte, and Siksek shows that this is never the case.


There are more ways in which studying the above generating function is useful. It is an established result that any rational function on the reals can be decomposed into simpler rational functions, whose denominators are powers of irreducible polynomials with degree at most 2. (And if we want to consider rational functions on C instead, we can stipulate that the denominators are all powers of linear factors.) We have that 1xx2=(x2+x1) has roots α=(1+5)/2 and β=(15)/2. So, we can write

x1xx2=Axα+Bxβ

for some constants A,B. Solving gives A=α/(αβ),B=β/(αβ). Note that αβ=5. We have

x1xx2=15(αxα+βxβ)=15(11x/α11x/β)

We now expand the right-hand side into geometric series to get

x1xx2=n=115(1αn1βn)xn.

However, we already agreed that x/(1xx2)=n=1Fnxn; this forces us to conclude that

Fn=15(1αn1βn).

This gives us an explicit formula for Fn that does not depend on knowing any previous values!

We can polish this presentation a tiny bit more. Notice that 1/α=(1+5)/2=φ1.618, the infamous golden ratio; we also have that 1/β=(15)/2=ψ, the other root of the quadratic equation x2x1. With this we write

Fn=φnψn5.

This identity is known as Binet's identity, named after Jacques Philippe Marie Binet, though supposedly Euler, de Moivre, and Daniel Bernoulli had already known about it.


This technique of encoding sequences in generating functions in order to study them is a powerful one. A taste of some more advanced stuff, to end: Denote by H the complex upper-half plane, that is, the set of complex numbers with positive imaginary part. Take Γ to be the set of 2×2 integer matrices (abcd) with adbc=1, that is, of determinant 1.

Now, for γ=(abcd) and τH, let γτ=(aτ+b)/(cτ+d). This action is consistent with matrix multiplication: we have for any τH and γ1,γ2Γ, γ1(γ2τ)=(γ1γ2)τ. It can also be shown that if τH and γΓ, then γτH as well.

Let k be a positive integer, and let f be a holomorphic (complex differentiable) function defined on H. We say that f is a modular form of weight k if it satisfies the following conditions:

  1. For all τH and γ=(abcd)Γ, f(γτ)=(cτ+d)kf(τ);
  2. as Im(τ), |f(τ)| is bounded.

Note that T=(1101)Γ; the first condition implies that f(Tτ)=f(τ+1)=f(τ), so f is periodic of period 1. Since f is holomorphic with period 1, it has a Fourier series of the form n=0cnqn where we write q:=e2πiτ

(In fact, to check 1., it suffices to check it for T and S:=(0110).)


For a prototypical example of a modular form: Let k>1 be a positive integer. Consider the (normalized) Eisenstein series, a function defined on H given by:

E2k(τ)=12ζ(2k)m,nZ2(0,0)1(m+nτ)2k

where ζ(k)=n=1nk is the Riemann zeta function.

The Eisenstein series E2k is a modular form of weight 2k. It has the Fourier series expansion

E2k(τ)=1+C2kn=1σ2k1(n)qn

for some constant C2k depending on k. Here, σk(n)=dndk denotes the sum of the kth powers of the (positive) divisors of n. This tells us that we can view E2k as a generating function for {σ2k1(n)}n=1 (up to a scalar multiple). Some values of C2k are C4=240, C6=504, C8=480.

By using methods from complex analysis, it can be shown that any modular form must be a linear combination of products of the (normalized) Eisenstein series E4 and E6. In particular, when k=4, we can only have that E8=c(E4)2 for some constant c. Comparing constant terms, we conclude that c=1. That is, E8=(E4)2, and so

1+480n=1σ7(n)qn=(1+240k=1σ3(n)qn)2. 

By matching the coefficients of qn, we find

σ7(n)=σ3(n)+120m=1n1σ3(m)σ3(nm).

Check this for n=2 and n=3 if you're skeptical! The first time I learned this material, I honestly didn't believe it myself.

This is an astonishing identity that I am not sure can be attained by more elementary approaches. There are many similar divisor-sum formulas that can be obtained by using these techniques, as well as formulas for the number of ways to write an integer as the sum of 2 squares (or 4, or 8 squares), and many, many other cool things. In general, deriving results in number theory by using techniques of analysis is a whole area of study known as analytic number theory, which is my chief area of interest.

Comments

Popular posts from this blog

A silly little derivation of ζ(2)

(This is a cleaned-up and somewhat expanded version of this Twitter thread .) What follows is a silly little proof that ζ(2)=n=11n2=π26 where ζ is the Riemann zeta function. Consider the integral I:=01log(1x+x2)x(x1)dx. We have, by using partial fractions and performing some other algebraic manipulations, I=01log(1x+x2)xdx01log(1x+x2)1xdx=201log(1x+x2)x(x1x)=2(01log(1+x)xdx01log(1+x3)xdx)=4301log(1+x)xdx(xx1/3). To evaluate this integral, we take the Maclaurin series: 01log(1+x)xdx=01n=1(1)nxn1ndx ...

100 is the only square that is the sum of 4 consecutive (positive) cubes.

The OEIS article on the number 100  opens with an interesting factoid:  "100 is the square of 10, and the smallest square that is the sum of four [positive] consecutive cubes: 13+23+33+43=100." In fact, it is the only one. To see this, let's look at the equation y2=x3+(x+1)3+(x+2)3+(x+3)3=4x3+18x2+42x+36. Let X=4x+6,Y=4y; the above equation then reduces to Y2=X3+60X. Note that a positive integer solution in (x,y) will give a positive integer solution in (X,Y), though the converse is not true. Now, generally speaking, whenever one sees an equation of the form Y2=X3+aX+b, one has an elliptic curve . Well, we have the extra condition 4a3+27b20 to get rid of problematic cases like y2=x3, which are referred to as singular curves ; we'll see the logic of this later on. I will not go too...