But your stress test is on range calculation; mine is on requests per second. Generating HTML is what takes the time and eats memory, not generating a numerical range of posts. And by a huge factor.
All of this is true, just as it is also true that if delete-duplicates is allowed to abort on the memory limit, the code won't even get to the sxml phase, let alone the sxml->html conversion.
Since you have a deployed instance, do you have a profiled run of a request so we can see which parts of the scheme code take the most time in the full request serving pipeline?
And if you are considering 1.0, please replace lib/markup.scm:line-scanner because that amount of copypasting hurts my eyes. Before:
(define (line-scanner l)
  (let ((b (partial lines->sxml bold))
        (i (partial lines->sxml italic))
        (tt (partial lines->sxml code))
        (ql (partial lines->sxml quotelink))
        (a (partial lines->sxml link))
        (spoiler (partial lines->sxml del)))
    ((compose spoiler tt a ql b i) l)))
After:
(define line-scanner-order
  (list del code link quotelink bold italic))

(define (line-scanner l)
  ((apply compose
          (map (lambda (tr) (partial lines->sxml tr))
               line-scanner-order))
   l))
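For anyone who wants to check that the two versions apply the rules in the same order, here is a rough self-contained sketch. Everything in it is a stand-in, not the project's real code: partial and compose are minimal guesses at the usual definitions (compose is assumed to work right to left, so the last element of the list runs first), lines->sxml is a dummy that only records which rule ran, and the rules are plain symbols instead of the real rule objects.

```scheme
;; Stand-in for the project's partial: fix the leading arguments of f.
(define (partial f . bound)
  (lambda rest (apply f (append bound rest))))

;; Stand-in for compose, assumed right to left:
;; ((compose f g) x) is (f (g x)).
(define (compose . fs)
  (if (null? fs)
      (lambda (x) x)
      (lambda (x) ((car fs) ((apply compose (cdr fs)) x)))))

;; Dummy lines->sxml: push the rule's name onto the front of the list,
;; so the result shows the order of application (last applied comes first).
(define (lines->sxml rule l)
  (cons rule l))

;; "Before": one hand-written binding per rule.
(define (line-scanner-before l)
  (let ((b (partial lines->sxml 'bold))
        (i (partial lines->sxml 'italic))
        (tt (partial lines->sxml 'code))
        (ql (partial lines->sxml 'quotelink))
        (a (partial lines->sxml 'link))
        (spoiler (partial lines->sxml 'del)))
    ((compose spoiler tt a ql b i) l)))

;; "After": the order lives in one list.
(define line-scanner-order
  (list 'del 'code 'link 'quotelink 'bold 'italic))

(define (line-scanner-after l)
  ((apply compose
          (map (lambda (tr) (partial lines->sxml tr))
               line-scanner-order))
   l))

(line-scanner-before '()) ; => (del code link quotelink bold italic)
(equal? (line-scanner-before '())
        (line-scanner-after '())) ; => #t
```

With these stand-ins italic runs first and del last in both versions, so reordering the markup passes becomes a one-line change to line-scanner-order.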