[3/3]
This fix addresses only the four issues above and stays within the rules inferred from the existing code; it is not a complete rewrite with new rules. New rules are possible, but they are outside the scope of this fix. Here are the new timings:
1 ]=> (timeit (lambda () (string->sxml del (apply string-append (make-list 20 "a~~xx~~"))) 'ok))
.01 0. .01
;Value: ok
1 ]=> (timeit (lambda () (string->sxml del (apply string-append (make-list 30 "a~~xx~~"))) 'ok))
.01 0. .01
;Value: ok
1 ]=> (timeit (lambda () (string->sxml del (apply string-append (make-list 40 "a~~xx~~"))) 'ok))
.01 0. .011
;Value: ok
1 ]=> (timeit (lambda () (string->sxml del (apply string-append (make-list 50 "a~~xx~~"))) 'ok))
.01 0. .011
;Value: ok
1 ]=> (timeit (lambda () (string->sxml del (apply string-append (make-list 1000 "a~~xx~~"))) 'ok))
.12 0. .129
;Value: ok
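The timeit helper is not defined in this post. A minimal sketch that would print three numbers in the same layout, assuming MIT Scheme's with-timings and assuming the columns are run time, GC time and real time in seconds (the definition actually used may differ):

```scheme
;; Hypothetical reconstruction of timeit, not the definition used above.
;; Prints run time, GC time and real time in seconds on one line,
;; then returns whatever THUNK returned.
(define (timeit thunk)
  (let ((result #f))
    (with-timings
     ;; Capture the thunk's value explicitly so the sketch does not
     ;; depend on what with-timings itself returns.
     (lambda ()
       (set! result (thunk))
       result)
     (lambda (run-time gc-time real-time)
       (write (internal-time/ticks->seconds run-time))
       (write-char #\space)
       (write (internal-time/ticks->seconds gc-time))
       (write-char #\space)
       (write (internal-time/ticks->seconds real-time))
       (newline)))
    result))
```

Ending the timed thunk with 'ok, as in the transcripts, keeps the REPL from printing the full parse result after the timing line.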
The runtime for a count of 40 drops from ten seconds to one hundredth of a second, a speedup factor of roughly one thousand. The speedup factor grows with the repetition count, and 40 is well below what fits within the original post size limit. The post-code spoilers >>27 work:
1 ]=> (string->sxml del "~~one~~ two ~~three~~ ==ab ~~cd~~ ef== gh ~~four~~ ij")
;Value 13: ((del "one") " two " (del "three") " ==ab ~~cd~~ ef==" " gh " (del "four") " ij")
The single-character content >>29 works:
1 ]=> (string->sxml del "~~M~~agneto~~H~~ydro~~D~~ynamics")
;Value 14: ((del "M") "agneto" (del "H") "ydro" (del "D") "ynamics")
And the two types of false positives >>30 are gone:
1 ]=> (string->sxml del "==code== ==code==~~spoiler~~==")
;Value 15: ("==code== ==code==" (del "spoiler") "==")
1 ]=> (string->sxml del "==one==, ~~two~~, ==three==")
;Value 16: ("==one==, " (del "two") ", ==three==")
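The four transcripts above can double as regression checks for anyone applying the patch to their own instance. A rough sketch, where failing-cases and spoiler-cases are hypothetical names introduced here and the inputs and expected values are copied from the REPL output in this post:

```scheme
;; Hypothetical helper: returns the cases whose actual result differs
;; from the expected one. PARSE is any one-argument procedure.
(define (failing-cases parse cases)
  (let loop ((cases cases) (failures '()))
    (if (null? cases)
        (reverse failures)
        (let* ((input (car (car cases)))
               (expected (cdr (car cases)))
               (actual (parse input)))
          (loop (cdr cases)
                (if (equal? actual expected)
                    failures
                    (cons (list input expected actual) failures)))))))

;; Each entry is (input . expected-sxml), taken from the transcripts.
(define spoiler-cases
  '(("~~one~~ two ~~three~~ ==ab ~~cd~~ ef== gh ~~four~~ ij"
     . ((del "one") " two " (del "three") " ==ab ~~cd~~ ef=="
        " gh " (del "four") " ij"))
    ("~~M~~agneto~~H~~ydro~~D~~ynamics"
     . ((del "M") "agneto" (del "H") "ydro" (del "D") "ynamics"))
    ("==code== ==code==~~spoiler~~=="
     . ("==code== ==code==" (del "spoiler") "=="))
    ("==one==, ~~two~~, ==three=="
     . ("==one==, " (del "two") ", ==three=="))))

;; With the patched parser loaded, an empty list means every case passes:
;; (failing-cases (lambda (s) (string->sxml del s)) spoiler-cases)
```

The helper takes the parser as an argument so the same table can be run against both the old and the patched string->sxml.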
Two other upgrades are still pending integration, and others running their own instances may wish to apply them: the performance enhancement using incremental HTML generation https://textboard.org/prog/39#t39p291, and the fix and additional checks >>21 for the string-split exploit >>20.