Project Euler Solution 79: Passcode derivation

Project Euler Problem 79: Passcode derivation give us three digit samples from a secret passcode.

There is a secret passcode. During each login the operator asks for the digits at three random positions in the passcode. We are given a list of 50 login attempts and the corresponding three digits. We are asked to recover the shortest code which would match the login attempts.

At first I looked at the samples, and it seems that they are only digits. The first two ones are 319 and 680. I set out to look at all possible combinations of these numbers, making them interleaved in all possible ways. One can form 680319, 683019, 638019, 368019, 361890 and many others. The number of them grows pretty quickly.

When I look at 680 and 180, there are the nice combinations 1680 and 6180, but one could form many more, even 680180 and so on. Perhaps only the first two are really sensible, but who knows that in advance.

I kept thinking about this for a while, and eventually gave up. I searched for other solutions online and found this one. There the author takes a look at the digits that are used. And it turns out that only eight unique digits are used:

['0', '1', '2', '3', '6', '7', '8', '9']

The next thing that the author adds is the assumption that there are no repeated digits. This doesn't need to be the case. If there were the samples 123 and 321 in there, the shortest code to get these would be 12321 or 32123. Using this assumption, the problem gets much easier.

The author further takes a look into the relationships between the eight digits. Since one assumes that there are no repetitions, this is unique. We can use a sample like 319 and derive two rules from this: 3 -> 1 and 1 -> 9. We could also derive 3 -> 9, but that actually comes out from transitively and we don't need that.

Taking all these rules and displaying them as a graph, we get this pleasingly simple looking graph:

We can directly read off the solution because the nodes nicely fall into eight layers. It has to be 73162890.

The same result can be found via code using the same logic. We build a mapping with all previous characters and then the take the root of the graph, append it to the passcode solution and prune all edges from the graph.

def iter_samples() -> Iterator[str]:
    with open("data/p079_keylog.txt") as f:
        for line in f:
            sample = line.strip()
            if sample:
                yield sample


def solution() -> None:
    samples = sorted(set(iter_samples()))
    chars = set(char for sample in samples for char in sample)
    previous_char = {char: set() for char in chars}
    for sample in samples:
        for i in range(2):
            previous_char[sample[i + 1]].add(sample[i])
    result = ""
    while previous_char:
        for char, before in previous_char.items():
            if not before:
                result += char
                break
        else:
            assert False
        del previous_char[char]
        for before in previous_char.values():
            if char in before:
                before.remove(char)
    return int(result)

This runs in 48 µs.

I found this problem rather unsatisfying because I needed to figure out that there were additional assumptions.

To show how brittle this solution is, let us take the secret code 583906758391 and generate 50 login attempts from that. The resulting graph would look like this:

This graph has backward references, so there is no clear way to find the start of the string. This is rather disappointing.