Test of various clock implementations

There are several ways to measure the real (wall) time that some part of the code has used. I know the standard clock() from time.h (which, strictly speaking, reports processor time rather than wall time) and the chrono facilities from C++11. MPI and OpenMP have their own routines as well, MPI_Wtime() and omp_get_wtime().

To compare them, I wrote a simple test program that just calls the function in question many times. It consists of a few blocks like this:

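/* Time `runs` calls of MPI_Wtime(), using omp_get_wtime() as the reference clock. */
start = omp_get_wtime();
for (long run = 0; run != runs; ++run) {
    MPI_Wtime();
}
stop = omp_get_wtime();
/* Label the result with the function that was measured. */
pretty_print(start, stop, "MPI_Wtime", runs);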

For consistency, I timed every method with the same reference clock, the OpenMP function omp_get_wtime(). Download the whole program and compile it with:

/usr/lib64/mpich/bin/mpicc clock_test.c -fopenmp -O3

and then let it run with:

/usr/lib64/mpich/bin/mpirun ./a.out

Depending on your system (mine is Fedora 24), the paths might differ. Perhaps mpicc and mpirun are in your path already.

The output on my machine is the following:

Method          Seconds per run
MPI_Wtime       3.2167e-08
omp_get_wtime   3.0686e-08
clock           1.4718e-07
none            6.2002e-15

It makes no difference whether the program is compiled with -O3 or without optimizations: the loops around the timing calls are not removed, and the functions are not called any faster. Calling these functions has a clear overhead compared to the empty loop, and the overhead of the loop itself is negligible compared to the time functions.

On my machine, the MPI and OpenMP functions are faster than the standard clock() function by a factor of roughly 4.6 to 4.8, so close to 5. One should measure again on the target architecture, but the ordering will probably stay that way.