Precomputing all 326 integer constants in Layer4.increment >>102 and storing them as mpz values actually slows things down >>137. Attribute lookup in CPython is slow enough that fetching a stored mpz constant costs more than letting the same constant be converted from a python int to mpz anew on every invocation.
$ python3 -m cProfile -s time wrapper.py numsci.hofs l4 "1$(printf "%078d" 0)" | tee temp/profile.txt | less
levels: 5
1000000000000000000000000000000000000000000000000000000000000000000000000000000 -> 500000000000000000000000000000000000000942809041582063365839428238027648024522282176747574937498915190189902751724515553262387695706471431607983773634306949 0x254AAD173EA5E82CB765169D1EFC9860FE05AE7B1E8FB19314D597561CED1F19F448CA74BCDEE10F0F168DAE0F529DA97AADE66B397A970EDCFEA83CA35C68A385 518
2583269 function calls (2582676 primitive calls) in 5.997 seconds
Ordered by: internal time
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   143277    4.762    0.000    4.886    0.000 hofs.py:394(increment)
   143296    0.282    0.000    0.282    0.000 hofs.py:794(increment)
        1    0.183    0.183    5.979    5.979 hofs.py:909(work_layers_loop)
   143397    0.153    0.000    0.757    0.000 hofs.py:869(counts)
   143331    0.115    0.000    0.115    0.000 hofs.py:771(increment)
   859989    0.088    0.000    0.088    0.000 {built-in method gmpy2.mpz}
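The conversions themselves are cheap in that profile: 859989 calls to gmpy2.mpz cost under a tenth of a second. To look at the attribute-lookup side of the trade-off in isolation, here is a minimal timeit sketch, not the Layer4 code itself; the constant and the names Precomputed, Inline, K and step are made up for illustration.

import timeit
from gmpy2 import mpz

class Precomputed:
    K = mpz(0x5DEECE66D)              # constant stored up front as an mpz attribute

    def step(self, x):
        return x + self.K             # attribute lookup on every call

class Inline:
    def step(self, x):
        return x + 0x5DEECE66D        # python int literal, converted to mpz each call

x = mpz(1) << 512
for cls in (Precomputed, Inline):
    obj = cls()
    print(cls.__name__, timeit.timeit(lambda: obj.step(x), number=10**6))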
This is also part of why the method caching in LayeredGroups.GroupN >>104 pays off.
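I don't have the GroupN internals in front of me (they are in >>104), but the trick it leans on is just hoisting the bound method into a local before the hot loop so the per-iteration attribute lookups disappear. A generic sketch with made-up names Counter, GroupN and process:

class Counter:
    def increment(self, x):
        return x + 1

class GroupN:
    def __init__(self, inner):
        self.inner = inner

    def process(self, items):
        inc = self.inner.increment    # resolve both attribute lookups once
        return [inc(i) for i in items]

print(GroupN(Counter()).process(range(5)))    # [1, 2, 3, 4, 5]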
>instead of computing the triangular number [...] subtract the values of R that have been seen
This idea was also expressed by >>33 and is used in the OEIS C version >>42.
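Guessing at the context here (I'm taking R to be the set of values already consumed, which is an assumption, not something stated above), the identity being used is that the sum of the unseen values in 1..n equals the triangular number n(n+1)/2 minus the sum of the seen ones, so no per-query summation is needed:

def remaining_sum_direct(n, seen):
    # sum of the values in 1..n that have not been seen yet
    return sum(i for i in range(1, n + 1) if i not in seen)

def remaining_sum_subtract(n, seen):
    # triangular number minus the seen values of R (assumed meaning of the quote above)
    return n * (n + 1) // 2 - sum(seen)

seen_R = {3, 7, 10}
assert remaining_sum_direct(10, seen_R) == remaining_sum_subtract(10, seen_R) == 35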