### About

Date: 2014-07-10

### Summary

Although `pow` involves an extra function call, compilers can optimize `pow(x, 2)` to the point where it is no slower than `x * x`. This depends on the compiler and should be tested in benchmarks.

### Contents

# Efficiency of the `pow` function

Someone said that using `pow(x, 2)` is always less efficient than using `x * x`. Well, there are two things to remember:

- Do not “optimize” without measurement.
- Measure with full compiler optimization.

## The test

So this is exactly what I did. This is a simple C++11 program that uses `pow(x, 2)` and `x * x`. The lines in question are the bodies of the two loops. The calculations with `test` are in place so that the compiler does not optimize away the code, which it would do otherwise.

```
// Copyright © 2014 Martin Ueding <dev@martin-ueding.de>
// Licensed under The GNU Public License Version 2 (or later)

#include <chrono>
#include <cmath>
#include <iostream>

int main() {
    // Accumulator that is printed at the end, so the loops cannot be
    // optimized away.
    double test {0};
    unsigned iter_max {100000000};

    // Variant 1: std::pow(x, 2)
    auto start_time = std::chrono::steady_clock::now();
    for (unsigned iter {0}; iter < iter_max; iter++) {
        test += std::pow(iter, 2);
    }
    double time_in_seconds = std::chrono::duration_cast<std::chrono::milliseconds>
        (std::chrono::steady_clock::now() - start_time).count() / 1000.0;
    std::cout << "Pow: " << time_in_seconds << std::endl;

    // Variant 2: x * x (evaluated in unsigned arithmetic, then converted
    // to double)
    start_time = std::chrono::steady_clock::now();
    for (unsigned iter {0}; iter < iter_max; iter++) {
        test += iter * iter;
    }
    time_in_seconds = std::chrono::duration_cast<std::chrono::milliseconds>
        (std::chrono::steady_clock::now() - start_time).count() / 1000.0;
    std::cout << "Multiplication: " << time_in_seconds << std::endl;

    std::cout << test << std::endl;
    return 0;
}
```

## Results

If you compile that without optimization, `pow` is clearly slower:

```
$ clang++ -std=c++11 pow.cpp -o pow; and ./pow
Pow: 0.272
Multiplication: 0.042
```

Now I compiled this with `clang++` using its `-O3` optimization. When I did this on 2014-05-21, `pow` was significantly faster. When I revisited this on 2014-07-10, `pow` was just a tiny bit slower than the multiplication. Interesting.

I also tested with `g++` and found that `pow` is significantly slower than the multiplication there. Since the overall time varies between runs, I ran each variant 10 times and took the mean and standard deviation with this Python script:

```
#!/usr/bin/python3
# -*- coding: utf-8 -*-

# Copyright © 2014 Martin Ueding <dev@martin-ueding.de>
# Licensed under The GNU Public License Version 2 (or later)

import subprocess

import numpy
import unitprint


def compile_cpp(compiler):
    subprocess.check_call([compiler, '-std=c++11', '-O3', 'pow.cpp', '-o', 'pow'])


def get_results():
    words = subprocess.check_output(['./pow']).decode().strip().split()
    return float(words[1]), float(words[3])


def bootstrap(compiler, runs=10):
    compile_cpp(compiler)

    times_pow = []
    times_mul = []

    for i in range(runs):
        time_pow, time_mul = get_results()
        times_pow.append(time_pow)
        times_mul.append(time_mul)

    mean_pow = numpy.mean(times_pow)
    mean_mul = numpy.mean(times_mul)
    std_pow = numpy.std(times_pow)
    std_mul = numpy.std(times_mul)

    return unitprint.siunitx(mean_pow, std_pow), \
            unitprint.siunitx(mean_mul, std_mul)


def main():
    for compiler in ['g++', 'clang++']:
        print(compiler, *bootstrap(compiler))


if __name__ == "__main__":
    main()
```
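If `numpy` and `unitprint` are not at hand, the same two statistics can be sketched with the standard library alone. The timings in this snippet are made up purely for illustration and are not measurements. `statistics.pstdev` is used because it computes the population standard deviation, which is also what `numpy.std` does by default.

```python
import statistics

# Hypothetical timings in seconds, for illustration only (not measured).
times = [1.27, 1.28, 1.26, 1.27, 1.28]

mean = statistics.mean(times)
# Population standard deviation, matching the default of numpy.std.
std = statistics.pstdev(times)

print("{:.3f} ± {:.3f}".format(mean, std))
```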

These are the results:

Compiler | `pow` / s | `x * x` / s |
---|---|---|
g++ | 2.90 ± 0.03 | 1.28 ± 0.01 |
clang++ | 1.272 ± 0.008 | 1.268 ± 0.008 |

With `clang++`, there is a tiny difference between `pow` and multiplication. It is not really significant though, since it is just half a standard deviation. `g++` takes more than twice as long when using `pow`. I can absolutely understand that people using `g++` will try to avoid the `pow` function.
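As a quick check of those two statements, this minimal sketch redoes the arithmetic on the numbers from the table. The variable names are my own and are not part of the benchmark script. The yardstick for `clang++` is the 0.008 s standard deviation reported in both of its columns.

```python
# Values copied from the results table (seconds).
gxx_pow, gxx_mul = 2.90, 1.28
clang_pow, clang_mul = 1.272, 1.268
clang_std = 0.008

# Difference between pow and multiplication for clang++, expressed in
# units of one standard deviation.
clang_diff_in_std = (clang_pow - clang_mul) / clang_std

# Slowdown factor of pow relative to multiplication for g++.
gxx_ratio = gxx_pow / gxx_mul

print("clang++ difference: {:.2f} standard deviations".format(clang_diff_in_std))
print("g++ slowdown factor: {:.2f}".format(gxx_ratio))
```

This gives a difference of 0.50 standard deviations for `clang++` and a factor of about 2.27 for `g++`.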

However, and that is my point, the statement that `pow` is *always* slower than multiplication does not really hold. I consider the results from `clang++` to be on par. Please test your application on your compiler and see what is faster in reality, not in theory.

## Source of `pow`

And if you look at the source of `pow()`, you will see that they have thought about it:

```
/* First see whether `y' is a natural number. In this case we
   can use a more precise algorithm. */
```

Then it jumps to the label `9`, where it says:

```
9: /* OK, we have an integer value for y. Unless very small
      (we use < 8), use the algorithm for real exponent to avoid
      accumulation of errors. */
```