C++ Cache Decorator

Martin Ueding

2017-06-10

Code & Zahlen

Pure functions only depend on their arguments and do not have any side effects (global variables, file I/O, ...). Their return value is completely determined by their arguments. If a function is expensive to compute and the argument space is small, it makes sense to cache the results to only compute them once in a program.

In order to separate the actual function from the caching, one can use an extra class which serves as a cache decorator. A decorator is a design pattern that wraps a function. The following can be done in C++:

#include <functional>
#include <map>

template <typename R, typename... A>
class CacheDecorator {
  public:
    CacheDecorator(std::function<R(A...)> f) : f_(f) {}

    R operator()(A... a) {
        std::tuple<A...> key(a...);
        auto search = map_.find(key);
        if (search != map_.end()) {
            return search->second;
        }

        auto result = f_(a...);
        map_[key] = result;
        return result;
    }

  private:
    std::function<R(A...)> f_;
    std::map<std::tuple<A...>, R> map_;
};

It takes a referece to an arbitrary function (or functor or lambda) and stores that as a member variable f_. The function really is a mapping between the arguments and the result, so the cached values are stored in a map map_ with std::tuple<A...> as key and R as value. The operator() obtains the exact same signature as the function that has been passed in. It will create a tuple from the arguments and searches the map with that key. If that is found, it is returned. Otherwise the function is evaluated and the result is stored in the map.

This code can be used with two example functions like this:

#include <iostream>

double square(int x) {
    std::cout << "square(" << x << ")" << std::endl;
    return x * x;
}

double mult(int x, float y) {
    std::cout << "mult(" << x << ", " << y << ")" << std::endl;
    return x * y;
}

int main(int argc, char **argv) {
    CacheDecorator<double, int> cd(square);
    std::cout << cd(2) << std::endl;
    std::cout << cd(2) << std::endl;
    std::cout << cd(4) << std::endl;
    std::cout << cd(2) << std::endl;
    std::cout << cd(4) << std::endl;

    CacheDecorator<double, int, float> cd2(mult);
    std::cout << cd2(2, 1.0) << std::endl;
    std::cout << cd2(2, 1.3) << std::endl;
    std::cout << cd2(4, 1.0) << std::endl;
    std::cout << cd2(2, 1.0) << std::endl;
    std::cout << cd2(4, 1.0) << std::endl;
}

The output of this program is the following:

square(2)
4
4
square(4)
16
4
16
mult(2, 1)
2
mult(2, 1.3)
2.6
mult(4, 1)
4
2
4

One can see that the function is only evaluated once per unique argument combination.

There are a couple of caveats:

There is no check in the language that you use this decorator on a pure function. So you might create some bugs that way.
The equality of tuples is checked by value of each type. For pointers the addresses are compared, not the contents. If there are two copies of the same thing at different addresses, the caching will not be aggressive enough. However, if you change the data at some address, you will still get the cached value.
This implementation is not thread-safe. Each thread needs to get its own CacheDecorator instance. One could make this thread-safe with locks but the performance would probably be worse.
The map will grow indefinitely. Perhaps one only wants to keep the last few evaluations, or the most frequent ones. The example above should be extendable it that is desired. Otherwise all cached values are discarded when the CacheDecorator is destroyed.
I am not sure whether the right thing happens when the function takes a rvalue reference. The operator() will expose the rvalue reference and consume the argument passed in. It is not passed as a move to the tuple nor as a move to the function below. An std::forward for the function call might alleviate some unnecessary copies. We cannot move into the tuple since we might need the arguments to call the function later on.
If the function was to take a std::shared_ptr, we would make a copy of that and store that in map_. This might now what the user expects because this will now prolong the lifetime of objects. Functions really needing to retain a copy of a std::shared_ptr might be rare, though.

Passing a std::unique_ptr or other non-copyable objects will make this type of caching impossible. One might be able to extract a weak reference from the these smart pointers and store that instead. If the reference still works, one can use the caching. Otherwise the function is evaluated again.

Passing pointers, non-const references or rvalue references to a function could be an indicator that the function has side effects and that the function should not be used with the cache decorator. This concept might be useful anyway, one just has to be very careful.