Creating a Brainfuck interpreter

In the preceding post about Brainfuck we have taken a look into this esoteric programming language and written a few programs. Today we are going to take a look into writing our own interpreter of the language.

The Brainfuck language is basically the P'' language with input and output. And that P'' is a language for a Turing machine. So we do need an implementation of the tape. I don't want to set it up with a fixed size, so it will just grow to the right as much as it needs.

The tape needs to support basic operations:

  • Keep track of the position. Go left and right on the tape, expanding as needed and also removing empty (zero) cells on the left.
  • Increment and decrement the number at the current position.
  • Determine whether the current cell is zero.
  • Input and output of the cells, to be used for input and output in the state machine.

There are a couple of invariants:

  • The position must not go further left than 0.
  • The tape must be long enough for the current position.
  • There may not be trailing zeros on the tape, except when the cursor is there.
  • Cells must not be decremented to negative numbers.

We can encapsulate these things in a class that wraps the tape.

class RightInfiniteTape:
    def __init__(self) -> None:
        self._tape = [0]
        self._cursor = 0

    def right(self) -> None:
        self._cursor += 1
        if self._cursor == len(self._tape):
            self._tape.append(0)

    def left(self) -> None:
        assert self._cursor > 0
        if self._cursor == len(self._tape) - 1 and self.is_zero():
            self._tape.pop()
        self._cursor -= 1

    def increment(self) -> None:
        self._tape[self._cursor] += 1

    def decrement(self) -> None:
        assert not self.is_zero()
        self._tape[self._cursor] -= 1

    def is_zero(self) -> int:
        return self.get() == 0

    def get(self) -> int:
        return self._tape[self._cursor]

    def set(self, number: int) -> None:
        self._tape[self._cursor] = number

Testing the tape

In order to verify that the tape works, we can write a simple test with the increment.

def test_tape_increment() -> None:
    tape = RightInfiniteTape()

    tape.increment()
    assert not tape.is_zero()
    tape.decrement()
    assert tape.is_zero()

This passes, so we can go on and test the next thing.

We also need to test the movement and also the truncation of the tape. This is a white-box test because I look into the _tape field although that is private.

def test_tape_movement() -> None:
    tape = RightInfiniteTape()

    tape.right()
    assert tape._tape == [0, 0]
    tape.increment()
    assert tape._tape == [0, 1]
    tape.left()
    assert tape._tape == [0, 1]
    tape.right()
    assert tape._tape == [0, 1]
    tape.decrement()
    assert tape._tape == [0, 0]
    tape.left()
    assert tape._tape == [0]

As this was not developed in a test driven way the tests are not very good.

Implementing the state machine

With the tape available, we can implement the state machine. The state machine takes a program and inputs. Then we can go and step through the program.

Most instructions are rather straightforward and can be forwarded to the tape. The interesting part are the loops with the brackets. There is no need to parse the program into an elaborate structure. We can just count the brackets and move within the program until we have reached the matching bracket.

class StateMachine:
    def __init__(self, program: str, inputs: list[int]) -> None:
        self._program = program
        self._inputs = inputs

        self._tape = RightInfiniteTape()
        self._program_counter = 0
        self._outputs = []

    def step(self) -> None:
        instruction = self._program[self._program_counter]
        if instruction == "+":
            self._tape.increment()
        elif instruction == "-":
            self._tape.decrement()
        elif instruction == ">":
            self._tape.right()
        elif instruction == "<":
            self._tape.left()
        elif instruction == ",":
            self._tape.set(self._inputs.pop(0))
        elif instruction == ".":
            self._outputs.append(self._tape.get())
        elif instruction == "[":
            if self._tape.is_zero():
                self._goto_bracket(forward=True)
        elif instruction == "]":
            if not self._tape.is_zero():
                self._goto_bracket(forward=False)

        self._program_counter += 1

    def run(self) -> list[int]:
        while self._program_counter != len(self._program):
            self.step()
        return self._outputs

    def _goto_bracket(self, forward: bool) -> None:
        my_bracket = "[" if forward else "]"
        target_bracket = "]" if forward else "["
        increment = 1 if forward else -1
        bracket_counter = 0
        while True:
            instruction = self._program[self._program_counter]
            if instruction == my_bracket:
                bracket_counter += 1
            elif instruction == target_bracket:
                bracket_counter -= 1
                if bracket_counter == 0:
                    break
            self._program_counter += increment

All in all this class is not that big and it encapsulates the state machine using an external tape.

Testing the state machine

We can test it with the state machine with the multiplication program from the previous blog post. I have changed it slightly to use the input and output arrays.

def test_state_machine() -> None:
    program = """
    >, >,           Accept two inputs into T1 and T2

    <               Goto T1
    [               While T1 is nonzero
        >               Goto T2
        [->+>+<<]       Copy T2 to T3 and T4
        >>              Goto to T4
        [-<<+>>]        Move T4 to T2
        <               Goto to T3
        [-<<<+>>>]      Accumulate T3 onto T0
        <<              Goto to T1
        -               Decrement T1
    ]

    <.              Emit T0
    """
    sm = StateMachine(program=program, inputs=[2, 3])
    result = sm.run()
    assert result == [6]

Conclusion

Yeah, now we have a working Brainfuck interpreter that can be used from within Python. We don't have a command line interface, but that could easily be added as needed.

The tape and the state machine form nice objects and encapsulate a nontrivial amount of invariants.

With that interpreter at hand, we can start to tackle the most difficult part of this series, the transpiler from a higher level language into Brainfuck.