Currying in C

Martin Ueding

2016-05-11

Code & Zahlen

Currying is not part of the C language. A few alternatives are shown that can be used: Global variables, GCC extensions, code duplication, void pointers, variable argument lists, and general parameter structs. Also C++03 and C++11 examples are given.

On some occasions it might be necessary to mask certain function parameters. Say you have a numerical integration routine like the following:

double integrate(double (*f)(double), const double lower,
                 const double upper, const double step);

This certainly is a reasonable thing to implement. As long as your functions take a single double parameter, this is just fine. One quickly encounters functions that take multiple arguments:

double func_2d(const double x, const double t);

Here x is the variable that should be integrated over, t is a parameter that is supposed to be fixed during the integration.

The problem that you will then have is that you cannot put func_2d into the integrate function. You will get errors about wrong signatures. Also the integration routine must know which t to choose.
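For concreteness, here is a minimal sketch of what such an integrate routine might look like. The article only gives the declaration, so the midpoint rule used here and the example integrand square are assumptions made for illustration.

```c
/* Hypothetical implementation: the article only declares integrate().
 * Midpoint rule with roughly (upper - lower) / step sub-intervals. */
double integrate(double (*f)(double), const double lower,
                 const double upper, const double step) {
    /* Round to the nearest whole number of sub-intervals, at least one. */
    long n = (long)((upper - lower) / step + 0.5);
    if (n < 1) {
        n = 1;
    }
    const double h = (upper - lower) / (double)n;

    double sum = 0.0;
    for (long i = 0; i < n; ++i) {
        /* Evaluate the integrand in the middle of each sub-interval. */
        sum += f(lower + (i + 0.5) * h);
    }
    return sum * h;
}

/* Example integrand (an assumption): takes exactly one double. */
double square(const double x) { return x * x; }
```

A call like integrate(square, 0, 1, 1e-3) compiles because square has exactly the signature double (*)(double). A function with a second parameter does not fit into that slot.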

There are a couple ways to solve this problem in plain C.

Global variable

One method is to just make t a global variable. The function func_2d would then only have the parameter double x and could be passed to integrate. A sample usage could then look like this:

double t;

int main() {
    t = 17;
    const double integral = integrate(func_2d, 0, 1, 1e-5);
}

This does work. However, it has a couple major drawbacks:

Encapsulation is broken. Therefore you cannot test the func_2d function on its own. Whatever test case you write will have to control the environment as well.
Thread safety is gone. As long as you use pure functions without any side effects it is perfectly safe to use them in parallel. By creating a dependency on a global variable, you cannot use func_2d in two concurrent threads any more. This might be remedied by using thread-local variables.

Either way you will have a hard time debugging your program.
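The thread-local remedy mentioned above could be sketched as follows, assuming a C11 compiler with _Thread_local support. The body of func_2d and the helper evaluate_with are made up for illustration.

```c
/* C11 thread-local storage: every thread gets its own copy of t. */
static _Thread_local double t;

/* Assumed example integrand; the article leaves the body of func_2d open. */
double func_2d(const double x) { return x * t; }

/* Hypothetical helper: sets the calling thread's t and evaluates func_2d.
 * The copies of t that belong to other threads are not touched. */
double evaluate_with(const double new_t, const double x) {
    t = new_t;
    return func_2d(x);
}
```

This restores thread safety, but the hidden dependency on t stays: a test still has to set t before it can call func_2d.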

GCC extension

What you really need here is a closure, that is a function with some state attached to it. In Python, you can indeed do the following:

def func_2d(x, t):
    # ...

def integrate(f, lower, upper, step):
    # ...

def main():
    t = 17
    def wrapper(x):
        return func_2d(x, t)

    integral = integrate(wrapper, 0, 1, 1e-5)

Notice how the wrapper only takes the argument x and takes t from the enclosing scope, the scope of main. The exact same thing, the definition of a function in a function, is also possible in JavaScript. It will create a closure for you, so wrapper keeps access to t even though it is not passed as an argument. This works in Python because a function is more than just a function; it is a callable object, a functor.

It would be really nice to do this in C as well. Unfortunately, C is too limited to allow for this construction. There is a GCC extension that you can use. With that, you can just write the following:

int main() {
    double t = 17;

    double wrapper(const double x) {
        return func_2d(x, t);
    }

    const double integral = integrate(wrapper, 0, 1, 1e-5);
}

This will work, but on GCC only. You must decide yourself whether using non-standard features is something that you want in your project. This will work just fine on Linux. If you use Windows, you can install GCC via MinGW or Cygwin. This also works. On a Mac, you are out of luck with the default toolchain: there is only Clang/LLVM, which does not support nested functions. There is a gcc command on Mac OS, but it is just a symlink to clang.

Multiple integration routines

One portable way to go about this is to define a couple more integration routines:

double integrate_0(double (*f)(double), const double lower,
                   const double upper, const double step);

double integrate_1(double (*f)(double, double), const double lower,
                   const double upper, const double step,
                   const double param_1);

double integrate_2(double (*f)(double, double, double), const double lower,
                   const double upper, const double step,
                   const double param_1, const double param_2);

// ...

This way, the additional parameters can be just fed through. For the func_2d case one would use it like this:

int main() {
    const double integral = integrate_1(func_2d, 0, 1, 1e-5, 17);
}

At first glance, this solution avoids all the trouble that the previous versions had. We can use it in threads and have no global variable. Tests can be written and it works on every compiler. However, the integration routine has to be duplicated for every number of parameters. Whenever one copy-pastes source code, it should raise a flag. When using abstractions properly, it should rarely be necessary to copy-paste source code.

Also this only gets worse when you have more parameters, parameters have other types than double and so on. Additionally, this has to be done for all the numerical routines that you want to write. I would not want to maintain that!
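To make the duplication concrete, here is a sketch of what integrate_1 could look like. The midpoint rule and the body of func_2d are assumptions, since the article only shows declarations.

```c
/* Assumed example integrand; the article leaves the body of func_2d open. */
double func_2d(const double x, const double t) { return x * t; }

/* Hypothetical implementation of integrate_1 using the midpoint rule.
 * The only difference to a one-parameter version is that param_1 is
 * handed through to f on every evaluation. */
double integrate_1(double (*f)(double, double), const double lower,
                   const double upper, const double step,
                   const double param_1) {
    long n = (long)((upper - lower) / step + 0.5);
    if (n < 1) {
        n = 1;
    }
    const double h = (upper - lower) / (double)n;

    double sum = 0.0;
    for (long i = 0; i < n; ++i) {
        sum += f(lower + (i + 0.5) * h, param_1);
    }
    return sum * h;
}
```

Every further integrate_N would repeat the whole loop and differ only in the call to f, which is exactly the kind of copy-paste that is hard to maintain.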

`void` pointer

One really generic way to get this is using a void * for all the additional parameters. The integrate function would then look like this:

double integrate(double (*f)(double, void *), void * params,
                 double lower, double upper, double step);

In order to use it, we need to write a wrapper for func_2d:

double wrapper(const double x, void * const params) {
    double *pt = (double *) params;
    double t = *pt;
    return func_2d(x, t);
}

When using this, we would do the following:

int main() {
    double t = 17;
    const double integral = integrate(wrapper, &t, 0, 1, 1e-5);
}

Using this approach, we can extend it to an arbitrary amount of parameters using a pointer to a struct. Declaring one type of container for each parameter pack and a wrapper for every function, we would get this:

struct t_container {
    double t;
};

double wrapper(const double x, void *const input) {
    const struct t_container *const params = (const struct t_container *)input;
    const double t = params->t;
    return func_2d(x, t);
}

This also works on every compiler and you can test it. It is extensible and the same integration routine can be used for functions with an arbitrary amount of parameters with arbitrary type.
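Put together, the void * approach with more than one extra parameter might look like the following sketch. The struct line_params, the function line, the wrapper line_wrapper and the midpoint implementation of integrate are all hypothetical and only serve to illustrate the pattern.

```c
/* Hypothetical parameter pack with two members. */
struct line_params {
    double slope;
    double offset;
};

/* Hypothetical integrand with two fixed parameters besides x. */
double line(const double x, const double slope, const double offset) {
    return slope * x + offset;
}

/* Wrapper that unpacks the void pointer; nothing checks that the pointer
 * really points to a struct line_params. */
double line_wrapper(const double x, void *const input) {
    const struct line_params *const params = input;
    return line(x, params->slope, params->offset);
}

/* Assumed midpoint-rule implementation of the declaration shown above. */
double integrate(double (*f)(double, void *), void *params,
                 double lower, double upper, double step) {
    long n = (long)((upper - lower) / step + 0.5);
    if (n < 1) {
        n = 1;
    }
    const double h = (upper - lower) / (double)n;

    double sum = 0.0;
    for (long i = 0; i < n; ++i) {
        sum += f(lower + (i + 0.5) * h, params);
    }
    return sum * h;
}

/* Usage: integrate 2 x + 3 over [0, 1]. */
double demo(void) {
    struct line_params p = { 2.0, 3.0 };
    return integrate(line_wrapper, &p, 0, 1, 1e-3);
}
```

Passing a pointer to any other type would compile just as well, which is the weakness of this approach.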

It has some other problems, though. The void * and the casting disable the type system. That can make for nasty bugs that you can only notice at run-time. If you are unlucky, the affected code path is only reached late in your program and will crash it when you are paying for computing time. The type system is in place to catch those errors, so one should not work against it when it can be avoided.

Also one introduces one level of indirection. A common saying is that every problem in computer science can be solved with another layer of indirection. Except for too many indirections, that is a different problem. Here the indirection will prohibit inlining which can speed your code up drastically if the function to evaluate is cheap. The repeated indirection should be cached eventually. I would still like to avoid that, however.

Variable argument list

Then there are variable argument lists (varargs). Those are used in functions like printf which use three dots (...) in their declaration. Those are not type safe either (this is why one has to get the %lf in scanf correct!). So instead of casting from void *, one will have to use the va_ family of macros from stdarg.h.

I have never used that myself, so I cannot give a proper example. But the integration routine would then have the following declaration:

double integrate(double (*f)(double, ...), const double lower,
                 const double upper, const double step, ...);

From what I think I know about this, I would conclude that it is not necessarily better than the version with void *.

See the answer by mfro for an example of this method.
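As a rough sketch of how this could work, one variant lets the integrand receive a va_list instead of being variadic itself, because a ... argument list cannot be forwarded directly to another ... function. This deviates from the declaration above. The names integrate_v and func_2d_v, the body of the integrand and the midpoint rule are assumptions for illustration.

```c
#include <stdarg.h>

/* Hypothetical integrand: reads its extra parameter t from a va_list.
 * If the type given to va_arg does not match what the caller passed,
 * the behavior is undefined. */
double func_2d_v(const double x, va_list args) {
    const double t = va_arg(args, double);
    return x * t;
}

/* Hypothetical variadic integration routine using the midpoint rule. */
double integrate_v(double (*f)(double, va_list), const double lower,
                   const double upper, const double step, ...) {
    long n = (long)((upper - lower) / step + 0.5);
    if (n < 1) {
        n = 1;
    }
    const double h = (upper - lower) / (double)n;

    va_list args;
    va_start(args, step);

    double sum = 0.0;
    for (long i = 0; i < n; ++i) {
        /* Each evaluation consumes the list, so hand out a fresh copy. */
        va_list copy;
        va_copy(copy, args);
        sum += f(lower + (i + 0.5) * h, copy);
        va_end(copy);
    }

    va_end(args);
    return sum * h;
}
```

A call would be integrate_v(func_2d_v, 0.0, 1.0, 1e-3, 17.0). Note that the extra argument has to be written as 17.0: a plain 17 would be passed as an int, and reading it with va_arg(args, double) is undefined behavior that the compiler cannot catch.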

General parameter struct

If you work in a specific domain and the parameters that you pass are always just a few from a given set, one can use a struct that just contains all the possible parameters that you have. Say we only want to do spatial integrations in up to four dimensions of functions that perhaps depend on parameters a and b. Then we could do the following:

struct Parameters {
    double a, b;
    double t;
    double x, y, z;
};

Then we define some density function that depends on 3D space and the parameter a. It does not depend on time t and not on b. That is the function that we want to integrate:

double density(const double x, const double y, const double z,
               const double a) {
    // ...
}

First we want to integrate over x. We define a wrapper function that just takes x and those other parameters p such that we can feed it into the modified integrate_param function:

double wrapper_x(const double x, struct Parameters p) {
    return density(x, p.y, p.z, p.a);
}

This should demonstrate this method. We can compare it to the other methods. The positive features are:

Thread-safe.
Type-safe.
Portable (does not rely on compiler extensions).
No global variables.
No code replication with slightly changed arguments.

The drawbacks here are rather mild:

The number of parameters used in the program has to be somewhat limited.

One could perhaps start to recycle the names as long as they have the same type. So if all the parameters are double and one commits to not changing the parameters in the Parameters structure, one can use the same variable in different contexts. One just has to be really careful to avoid confusion.
Calling a function with a large struct of unused parameters will result in more copy operations onto the stack before calling the function. This will only become important when the function itself is cheap to evaluate.

One way to get around this would be passing const struct Parameters *const p instead of just struct Parameters p as an argument. This will add a layer of indirection and remove the copy. If one uses const, the compiler will still have a fair chance of optimizing.
Function signatures are now rather useless. Every function in the program just takes a parameter scope. One cannot see from the function signature which parameters the function actually uses.

The same problem occurs when using global variables. Here at least it is not a side effect, the dependencies are just in the opaque struct now. Unit tests and the like are not affected by this. That is good!

Integration along x would then look like this:

int main() {
    struct Parameters p;
    p.a = 17;
    p.y = 0.3;
    p.z = -3.4;

    const double integral = integrate_param(wrapper_x, p, -1, 1, 1e-5);
}

One could even go further and incorporate the first parameter into the struct as well and only pass that.
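A complete version of this method might look like the sketch below. The article does not show integrate_param or the body of density, so both are assumptions here: a midpoint rule and a simple polynomial density.

```c
struct Parameters {
    double a, b;
    double t;
    double x, y, z;
};

/* Assumed example body; the article leaves the density open. */
double density(const double x, const double y, const double z,
               const double a) {
    return a * x * x + y + z;
}

/* Picks the parameters that density needs from the general struct. */
double wrapper_x(const double x, struct Parameters p) {
    return density(x, p.y, p.z, p.a);
}

/* Hypothetical midpoint-rule implementation of integrate_param. */
double integrate_param(double (*f)(double, struct Parameters),
                       struct Parameters p, const double lower,
                       const double upper, const double step) {
    long n = (long)((upper - lower) / step + 0.5);
    if (n < 1) {
        n = 1;
    }
    const double h = (upper - lower) / (double)n;

    double sum = 0.0;
    for (long i = 0; i < n; ++i) {
        sum += f(lower + (i + 0.5) * h, p);
    }
    return sum * h;
}

/* Usage: integrate the density along x from -1 to 1. */
double demo(void) {
    struct Parameters p = { 0 };  /* unused members are set to zero */
    p.a = 17;
    p.y = 0.3;
    p.z = -3.4;
    return integrate_param(wrapper_x, p, -1, 1, 1e-3);
}
```

In contrast to the void * version, passing anything other than a struct Parameters is rejected by the compiler.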

I would like to thank Bartosz Kostrzewa for making me aware of this approach!

C++11 lambda

You may know that I am not a big fan of C. I much prefer C++ for its more expressive syntax as well as the powerful and often costless abstractions. This is a case where C++11 can give you a closure in a single line:

int main() {
    double t = 17;
    auto wrapper = [t](const double x) { return func_2d(x, t); };
    const double integral = integrate(wrapper, 0, 1, 1e-5);
}

We will have to modify the declaration of integrate slightly for this:

#include <functional>

double integrate(std::function<double(double)> f, const double lower,
                 const double upper, const double step);

Another neat thing is that the syntax of function pointers is much nicer in C++11 and this also captures functors. Here you do not have any problems with type unsafety or threading. It just works. It also does not depend on the compiler, as long as that supports C++11. The IBM compiler on Blue Gene/Q does not support that, so you will have to use bgclang there.

C++03 functor

If you are in the unlucky position of dealing with a compiler which does not support C++11, such as GCC before version 4.8, you will have to write the functor yourself. There is no nice lambda syntax in C++03. So you would create something like this:

struct Wrapper {
    double t;

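    // Store the fixed parameter t in the functor.
    Wrapper(const double t) : t(t) {}

    // Calling the object evaluates func_2d with the stored t.
    double operator()(const double x) const { return func_2d(x, t); }
};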

Then create an instance of that wrapper and pass it to the integrate function:

int main() {
    Wrapper wrapper(17);
    const double integral = integrate(wrapper, 0, 1, 1e-5);
}

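The functional header has been around in C++03 already, but std::function was only added to it in C++11. With a C++03 compiler one therefore cannot use the same integrate declaration as in the C++11 example. Instead, one can make integrate a template over the type of the callable, or use std::tr1::function or boost::function where those are available.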