Moving the integer arithmetic to gmpy2 yields a modest speedup for (e, 10^6), shaving off less than 8% of the run time.
$ python3 -m timeit -s 'import numsci.sbtree as mod' 'mod.work_speclimit (mod.fsb.Specs.E (), 1000000)'
100 loops, best of 3: 4.17 msec per loop
The gains should increase for higher stream positions, as multiplications of ever larger integers take over the bulk of the run time.
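To get a rough feel for that trend, here is a minimal sketch that times a single big-integer multiplication with Python's built-in int and with gmpy2.mpz at a few operand sizes. It is not part of numsci.sbtree: the helper names time_square and compare are hypothetical, the operand sizes are arbitrary, and the script falls back to plain int when gmpy2 is not installed (in which case both columns measure the same thing).

```python
import timeit

try:
    import gmpy2
    mpz = gmpy2.mpz
except ImportError:
    # Assumption: fall back to the built-in int so the sketch still runs
    # without gmpy2; the two timings are then equivalent.
    mpz = int


def time_square(value, repeat=3, number=10):
    """Return the best-of-`repeat` time in seconds for one value * value."""
    timings = timeit.repeat(lambda: value * value, repeat=repeat, number=number)
    return min(timings) / number


def compare(digits):
    """Time squaring a `digits`-digit number as int and as mpz."""
    n = 10 ** digits - 1          # a number with exactly `digits` decimal digits
    z = mpz(n)                    # the same value as a gmpy2 integer
    assert n * n == z * z         # both types must agree on the product
    return time_square(n), time_square(z)


if __name__ == "__main__":
    for digits in (100, 10_000, 100_000):
        t_int, t_mpz = compare(digits)
        print(f"{digits:>7} digits: int {t_int:.2e} s, mpz {t_mpz:.2e} s")
```

The absolute numbers depend on the machine and on the gmpy2 build, so no expected output is given; the point is only to compare how the two columns grow with the operand size.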
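import gmpy2

class Matrix:
    # Shared mpz constants, so every matrix entry is a gmpy2 integer from the start.
    z0 = gmpy2.mpz (0)
    z1 = gmpy2.mpz (1)
    # 2x2 matrices stored as flat four-element lists.
    I = [z1, z0, z0, z1]  # identity
    S = [z0, z1, z1, z0]  # swap (anti-diagonal)
    L = [z1, z1, z0, z1]
    R = [z1, z0, z1, z1]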