[1/3]
Here is the corrected version of >>67. As explained above it uses partial S-groups when switching S-group order and eliminates the lower loops. The runtime is now properly O(n^(1/8)), as backed by >>75.
def work_seq3_cheat (start, count, nleft, coverfun):
wantlow = coverfun (start, 1)
if wantlow > nleft:
return (0, 0)
low, high = 1, count
diff = high - low
while diff > 1:
now = (low + high) // 2
want = coverfun (start, now)
if want <= nleft:
low = now
wantlow = want
else:
high = now
diff = high - low
return (low, wantlow)
def work_seq4 (n):
seq3covered = seq3_covered
seq3inc = seq3_inc
seq4next = seq2_next
seq3sum1sc = seq3_sum1sc
seq4covered = seq4_covered
seq4inc = seq4_inc
levels = []
R, S0, S1 = 3, 4, 4
start, count = 4, 3
advance = seq4covered (start, count, S1)
covered = 2 + advance
while covered <= n:
incR, incS0, incS1 = seq4inc (start, count, S0, S1, advance)
R += incR
S0 += incS0
S1 += incS1
start, count = seq4next (levels, 0)
advance = seq4covered (start, count, S1)
covered += advance
print ("levels:", len (levels))
covered -= advance
used, advance = work_seq3_cheat (
start, count, n - covered,
lambda s, c: seq4covered (s, c, S1))
if used > 0:
incR, incS0, incS1 = seq4inc (start, used, S0, S1, advance)
R += incR
S0 += incS0
S1 += incS1
covered += advance
fakelevels = [[start, count, used, S1]]
start, count = seq4next (fakelevels, 0)
used, advance = work_seq3_cheat (
start, count, n - covered, seq3covered)
if used > 0:
incR, incS0 = seq3inc (start, used, S0)
R += incR
S0 += incS0
covered += advance
fakelevels = [[start, count, used, S0]]
start, count = seq4next (fakelevels, 0)
nleft = n - covered
if nleft > 0:
R += seq3sum1sc (start, nleft)
return R
Here are the updated timings, all in vanilla python.
- The u64 overflow R(10^10) is computed in under 100 microseconds.
- The u128 overflow R(10^20) is computed in under 2 milliseconds.
- R(10^30) is computed in under 40 milliseconds.
- The u256 overflow R(10^39) is computed in just over 500 milliseconds.
$ python3 -m timeit -s 'import flake.hofs as mod' 'mod.work_seq4 (10000000000)'
10000 loops, best of 3: 96.6 usec per loop
$ python3 -m timeit -s 'import flake.hofs as mod' 'mod.work_seq4 (100000000000000000000)'
1000 loops, best of 3: 1.81 msec per loop
$ python3 -m timeit -s 'import flake.hofs as mod' 'mod.work_seq4 (1000000000000000000000000000000)'
10 loops, best of 3: 35.9 msec per loop
$ python3 -m timeit -s 'import flake.hofs as mod' 'mod.work_seq4 (1000000000000000000000000000000000000000)'
10 loops, best of 3: 527 msec per loop
R(10^39) is a good value for comparing implementations with this order of growth. Here is the k<=60 table in >>38 format.