The third layer of >>101 has been ported to GMP in >>145. The Python version with the same order of growth is >>76 with the trick from >>80 applied, and it computed the u256-overflowing R(10^39) in just over 400 milliseconds. The time in C+GMP is 85 milliseconds, a speedup by a factor of almost 5.
$ time bin/gmphofs "1$(printf "%039d" 0)"
levels: 5
500000000000000000029814239694953305682145634502486118475654808671411192582303 259
real 0m0,085s
user 0m0,084s
sys 0m0,001s
$ python3
[...]
>>> 417 / 85
4.905882352941177
>>>
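As a sanity check on the output above, here is a small Python sketch. It assumes that the trailing 259 printed by gmphofs is the bit length of the result, which the transcript does not actually say. The variable names are mine and are not taken from >>145.

```python
# Value printed by bin/gmphofs for n = 10^39, copied from the transcript.
r = int("500000000000000000029814239694953305682145634502486118475654808671411192582303")

# Same argument string the shell builds with "1$(printf "%039d" 0)".
n = int("1" + "%039d" % 0)

# Largest value an unsigned 256-bit integer can hold.
U256_MAX = 2**256 - 1

print(n == 10**39)     # the argument really is 10^39
print(r.bit_length())  # assumed to match the second number gmphofs prints
print(r > U256_MAX)    # whether the result no longer fits in a u256
```

If the assumption holds, the bit length comes out as 259, three bits past what a u256 can hold.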
Since the speedup factor decreased, it seems that the weight of the increasing number of multiplications outweighs the gain from eliminating an increasing number of allocations. Time to port the fourth layer.
>>147
Reality, it seems, chooses to mold itself to your hopes on this occasion.