-
Notifications
You must be signed in to change notification settings - Fork 335
Description
In the readme for the Fibonacci benchmark, we see the following requirements for different implementations of the benchmark:
ALL IMPLEMENTAITONS MUST...
Have a function that recursively compute a fibonacci number with this naive algorithm
Base case for input 0
Base case for input 1
Must make two recursive calls for each non-base invocation
No result caching, conversion to tail recursion, or iterative solutions.
This is because the benchmark Emphasizes function call overhead, stack pushing / popping, and recursion.
This is misleading. The recursive calls in the reference implementation are tail calls, and gcc at optimization levels higher than -O1 does optimize this by eliminating one of function calls. Thus the compiler produces assembly that is roughly equivalent to the following c code:
int32_t fibonacci(int32_t n) {
int32_t result = 0;
while (n > 1) {
result += fibonacci(n - 1);
n -= 2;
}
if (n == 1) result += 1;
return result;
}The Fortran compiler gfortran goes even further. It "unrolls" the recursive calls, and produces assembly that is roughly equivalent to the following c code:
int32_t fibonacci(int32_t n) {
if (n == 0) return 0;
if (n == 1) return 1;
if (n == 2) return 1;
if (n == 3) return 2;
int32_t a = fibonacci(n - 3);
int32_t b = fibonacci(n - 4);
if (n == 4) return a + 2;
a += b;
int32_t c = fibonacci(n - 5);
b += c;
a += b;
if (n == 5) return b + 1;
return a + b + c + fibonacci(n - 6);
}(See the output on Compiler Explorer)
If tail calls are not optimized, then the reference implementation requires 331160280 function calls to calculate fibonaaci(40), while the fortran version only requires 313883 function calls. This is calculated as follows:
- If
T_nis the number of function calls that a naive recursive solution requires to calculate the nth Fibonacci number, then we have thatT_0 = T_1 = 0, andT_{n + 2} = 2 + T_{n + 1} + T_n. - If
U_nis the number of function calls that thefortranoptimization requires to calculate the nth Fibonacci number, then we have thatU_0 = U_1 = U_2 = U_3 = 0,U_4 = 2,U_5 = 3, andU_{n + 6} = 4 + U_n + U_{n + 1} + U_{n + 2} + U_{n + 3}.
Thus, contrary to its stated goal, using this benchmark to compare c and fortran does not give us an accurate comparison of the overhead that each language introduces when making recursive function calls. Instead the difference in performance is almost entirely a result of the compilers' ability to remove the function calls altogether.
At the very least, the readme should clarify that while the source code for each implementation is not allowed to eliminate one of the recursive calls, and must make exactly two recursive calls on each iteration, and that these recursive calls must calculate the (n - 1)th and (n - 2)th Fibonacci numbers, the compiler may still make whatever optimizations it is capable of.