Project Euler Solution 72: Counting fractions

Project Euler Problem 72: Counting fractions continues with fractions. We need to count the number of fractions that there are with a given maximum denominator.

Consider the fraction, n/d, where n and d are positive integers. If n<d and HCF(n,d)=1, it is called a reduced proper fraction.

If we list the set of reduced proper fractions for d ≤ 8 in ascending order of size, we get:

1/8, 1/7, 1/6, 1/5, 1/4, 2/7, 1/3, 3/8, 2/5, 3/7, 1/2, 4/7, 3/5, 5/8, 2/3, 5/7, 3/4, 4/5, 5/6, 6/7, 7/8

It can be seen that there are 21 elements in this set.

How many elements would be contained in the set of reduced proper fractions for d ≤ 1,000,000?

A first attempt might again be to generate all fractions, cancel them and put them into a set. This will not work due to the large numbers of fractions involved.

The next idea is to use the logic from Solution 71: Ordered fractions to produce the next fraction. Going through all the fractions like this isn't feasible because it just takes too long. There must be a better way.

This sequence of fractions is no new invention, it comes up under the name of Farey sequence. The linked Wikipedia article has a section about the “next term”.

We can copy the code and throw out the generality that we don't need:

def farey_sequence(n: int) -> Iterator[tuple[int, int]]:
    # Adapted from
    (a, b, c, d) = (0, 1, 1, n)
    yield a, b
    while c <= n:
        k = (n + b) // d
        (a, b, c, d) = (c, d, k * c - a, k * d - b)
        yield a, b

Then we need a function to count the elements in a generator:

def generator_len(sequence: Iterator) -> int:
    result = 0
    for elem in sequence:
        result += 1
    return result

And that then allows us to go through the series, count the number of elements and remove the values 0 and 1 from the beginning and end.

def solution_faster() -> int:
    return generator_len(farey_sequence(1_000_000)) - 2

The problem is that for d ≤ 1,000 it takes 69 ms, for d ≤ 10,000 it takes 6.2 s. So it clearly scales quadratically with the limit for the denominator. The problem requires us to go to 1,000,000, which is a factor 100 away, so it will take around 60,000 seconds, which is around a day. This again cannot be the solution.

In the Wikipedia article about the Farey sequence we also find its relation to the totient function. The length of the Farey sequence $F_n$ is given via the totient function: $$ |F_n| = 1 + \sum_{m=1}^n \phi(m) \,. $$

If we could compute the sum of all totients from $m = 1$ to $n = 1\,000\,000$, then we would have the solution.

We can just try this and write a function for the totient based on the prime factors:

def totient(n: int) -> int:
    prime_factors = get_prime_factors(n)
    result = 1
    for prime, multiplicity in prime_factors.items():
        result *= prime ** (multiplicity - 1) * (prime - 1)
    return result

And then we compute the length of the Farey sequence using that:

def farey_length(n: int) -> int:
    result = 1
    for m in range(1, n + 1):
        result += totient(m)
    return result

The solution then is simple, we just call that function. We also seed the prime generator with the prime seed such that it becomes a bit faster.

def solution() -> int:
    ceiling = 100000
    prime_generator.__default__ = (prime_sieve(ceiling),)
    return farey_length(ceiling) - 2

The problem is that this takes 243 ms for a ceiling of 10,000 and 12 s for 100,000. This means that for the million that we want to have, it will take too long. We need yet another approach.

Totient sum function

There is a special function for the sum of the totients, the Totient summatory function: $$ \Phi(n) := \sum_{k=1}^n \phi(k) \,. $$

There is one version expressed in terms of the Möbius function $\mu$: $$ \Phi(n) = \frac 12 \sum_{k=1}^n \mu(k) \left\lfloor \frac nk \right\rfloor \left( 1 + \left\lfloor \frac nk \right\rfloor \right) \,. $$

This version takes linear time if we already had the values of $\mu$. The function takes three values. It assigns a $+1$ or $-1$ if there are only unique prime factors and assigns a $-1$ if there is an odd number and $+1$ if there is an even number. It takes the value $0$ if a prime factor occurring multiple times.

We can compute that with a sieve like algorithm: Go through all the numbers. If it hasn't been touched yet, it must be a prime number. Set it to $-1$. Then go through all multiples and set them to $-1$ if they are not set yet. Otherwise flip the sign. Then go through all the multiples of the square of the number and set those to zero.

def generate_moebius_function(ceiling: int) -> list[int]:
    moebius = [None] * ceiling
    moebius[1] = 1
    for i in range(2, ceiling):
        # If we found a prime, set it to -1.
        if moebius[i] is None:
            moebius[i] = -1
            # Tick off the multiples.
            for k in range(2 * i, ceiling, i):
                if moebius[k] is None:
                    moebius[k] = -1
                    moebius[k] *= -1
            # Eliminate all squares.
            for k in range(i**2, ceiling, i**2):
                moebius[k] = 0
    return moebius

With these values we can evaluate the totient sum:

def summary_totient(ceiling: int) -> int:
    moebius = generate_moebius_function(ceiling + 1)
    result = 0
    for k in range(1, ceiling + 1):
        floor = ceiling // k
        result += moebius[k] * (floor) * (1 + floor)
    return result // 2

The solution is that evaluated, minus one.

def solution() -> int:
    return summary_totient(1_000_000) - 1

This computes the correct answer in 406 ms, so that is extremely fast compared to the alternatives.

Interestingly in the supplemental material that one can unlock with the correct solution there is no mention of the Möbius function. However, they also use a sieve approach to more efficiently compute the totient function.