Project Euler Solution 12: Highly divisible triangular number

Martin Ueding

2023-04-29

Code & Zahlen

Here we have another entry in the Project Euler series, this time about Problem 12: Highly divisible triangular number which is about the divisors of triangle numbers.

The sequence of triangle numbers is generated by adding the natural numbers. So the 7th triangle number would be 1 + 2 + 3 + 4 + 5 + 6 + 7 = 28. The first ten terms would be:

1, 3, 6, 10, 15, 21, 28, 36, 45, 55, ...

Let us list the factors of the first seven triangle numbers:
 1: 1
 3: 1,3
 6: 1,2,3,6
10: 1,2,5,10
15: 1,3,5,15
21: 1,3,7,21
28: 1,2,4,7,14,28
We can see that 28 is the first triangle number to have over five divisors.

What is the value of the first triangle number to have over five hundred divisors?

At first, this seemed easy to me. First I create a generator with these triangle numbers. Then I would just use the get_prime_factors function from Solution 5: Smallest multiple and use that to get the prime factors and their multiplicities. The total number of divisors can be computed by multiplying each multiplicity plus 1. This works because when I have a certain prime factor $f$ with a multiplicity of $m$ times, I can compose $m + 1$ different numbers from that, namely $f^0$ to $f^m$. And this works independently with all different factors, such that I get this number of divisors: $$ \sum_{f} (m_f + 1) \,. $$

And this is the program to match that idea:

def triangle_number_generator() -> Iterator[int]:
    accumulator = 0
    for i in itertools.count(1):
        accumulator += i
        yield accumulator


def solution() -> int:
    for triangle_number in triangle_number_generator():
        prime_factors = get_prime_factors(triangle_number)
        num_divisors = functools.reduce(
            lambda a, b: a * b, map(lambda x: x + 1, prime_factors.values()), 1
        )
        if num_divisors > 500:
            print(triangle_number, dict(prime_factors.items()), num_divisors)
            return triangle_number

It does work, it produces the correct answer of 76,576,500. However, it takes 1:08 min to compute this answer. And that is way too long, especially given that my laptop is much faster than the reference that they have used when the problem was new. There must be something better to solve this problem.

Caching prime factors

One issue that this implementation has is that the prime numbers are computed for each call to get_prime_factors. This is something that is just a waste of time. So I go back to the prime_generator and change it such that it keeps a cache of the primes. It first serves from the cache, and only then sets out to extend the cache if further elements are requested from the generator. This removes redundant work.

def prime_generator(_primes = []) -> int:
    yield from _primes
    start = 2 if not _primes else _primes[-1] + 1
    for candidate in itertools.count(start):
        for prime in _primes:
            if candidate % prime == 0:
                break
        else:
            yield candidate
            _primes.append(candidate)

This function now has a mutable default argument, which is dangerous. I have let it start with an underscore to show that one should not use this.

And for the timings we need to clear the cache in order to get fair timings. We do this with this line in the solution:

    prime_generator.__defaults__ = ([],)

Now we can get the correct answer in 615 ms, which is much nicer. If one doesn't clear the cache and gets an unfair head start, it only takes around 450 ms to complete.

Coprime property

After having this, I looked into the solution that is available on the site once you put in the right number. And there is another very interesting approach. The triangular numbers have the property that one can express them via closed form: $$ T_n = \frac{n(n+1)}{2} \,. $$ What we can see here is that they are (half of) the product of two consecutive numbers. Consecutive numbers are coprime, which interestingly is only written in the Simple English Wikipedia, not the regular one. This means that the numbers $n$ and $n+1$ don't have any common denominator except for 1. There is a simple proof for this theorem, which I am going to reproduce here.

Say that the greatest common denominator of $n$ and $n+1$ is $p$. This means that $n / p$ is an integer number and $(n+1)/p$ is also an integer. We can subtract these integers from each other and have the following: $$ \frac{n+1}p + \frac np = \frac 1p \,.$$ Since we have subtracted two integers from each other, we necessarily need to get an integer as a result. But the only $p$ which can make $1/p$ an integer is $p = 1$. And hence the greatest common denominator of $n$ and $n+1$ must be $p = 1$. The consecutive numbers are coprime.

The number of factors of a product of two coprime numbers is just the product of the number of factors because there are no common factors in between. Using the closed form expression $T_n = n(n+1)/2$ for the triangle numbers, we can see how the problem falls apart into two smaller problems. With the division by 2 we need to consider two cases here.

If $n$ is even, then we have the numbers $n/2$ and $n+1$. We know that $n+1$ doesn't have 2 as a divisor at all, so they are independent still.
If $n$ is odd, then $n+1$ is even. We then take $n$ and $(n+1)/2$ as the two numbers.

And then we can put this into the code to exploit this property.

def get_num_divisors(number: int) -> int:
    prime_factors = get_prime_factors(number)
    num_divisors = functools.reduce(
        lambda a, b: a * b, map(lambda x: x + 1, prime_factors.values()), 1
    )
    return num_divisors


def solution_coprime() -> int():
    prime_generator.__defaults__ = ([],)
    for n in itertools.count(1):
        triangle_number = n * (n + 1) // 2
        if triangle_number % 2 == 0:
            num_divisors = get_num_divisors(n // 2) * get_num_divisors(n + 1)
        else:
            num_divisors = get_num_divisors((n + 1) // 2) * get_num_divisors(n)
        if num_divisors > 500:
            return triangle_number

With these changes we get the correct answer in 59 ms, which is around 10 times faster than what we had before.