[3/5]
However, I promised the cold, hard truth and this was only the cold part, so here is the rest. The disregard for "leftmost, longest" in sre->procedure is not limited to (or) and (**); it is present in all branches that can take multiple paths. This is meant to be salvaged by the external user of sre->procedure via the 'fail' lambda that is returned as part of the 'matches' object in the named let lp. On a successful match, 'fail' is actually the retry continuation. This is how irregex-match works. Its driver for sre->procedure is the 'else' branch of irregex-match/chunked:
(define (irregex-match/chunked irx cnk src)
  (let* ((irx (irregex irx))
         (matches (irregex-new-matches irx)))
    (irregex-match-chunker-set! matches cnk)
    (cond
     ((irregex-dfa irx)
      [...]
     (else
      (let* ((matcher (irregex-nfa irx))
             (str ((chunker-get-str cnk) src))
             (i ((chunker-get-start cnk) src))
             (end ((chunker-get-end cnk) src))
             (init (cons src i)))
        (let lp ((m (matcher cnk init src str i end matches (lambda () #f))))
          (and m
               (cond
                ((and (not ((chunker-get-next cnk)
                            (%irregex-match-end-chunk m 0)))
                      (= ((chunker-get-end cnk)
                          (%irregex-match-end-chunk m 0))
                         (%irregex-match-end-index m 0)))
                 (%irregex-match-fail-set! m #f)
                 m)
                ((%irregex-match-fail m)
                 (lp ((%irregex-match-fail m))))
                (else
                 #f)))))))))
Whenever there is a match that does not exhaust the input, and a retry continuation exists, the retry is called by the "(lp ((%irregex-match-fail m)))" branch. This means that if a full match is possible, it will be found. Here is the above (**) example with irregex-match and one more debug print:
scheme@(guile-user)> (define (imsim re str) (irregex-match-substring (irregex-match re str)))
scheme@(guile-user)> (imsim '(** 3 4 (or "a" "ab")) "abababab")
trying a at 0
trying a at 1
trying ab at 1
trying ab at 0
trying a at 2
trying a at 3
trying ab at 3
trying ab at 2
trying a at 4
trying a at 5
trying ab at 5
retry by irregex-match/chunked
trying ab at 4
trying a at 6
retry by irregex-match/chunked
trying ab at 6
$1 = "abababab"
scheme@(guile-user)>
In other words, irregex-match/chunked has to override sre->procedure's result twice (the two "retry by irregex-match/chunked" lines above) to get the full match.
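
To make the mechanism easier to follow in isolation, here is a minimal sketch of the same idea: a backtracking matcher that hands its retry continuation back to the caller, and an outer driver that keeps retrying until the whole input is consumed. This is not irregex code. The names match-or, match-repeat, first-match and match-full are made up for illustration, chunking and submatch bookkeeping are left out, and a match is represented simply as a pair (end-index . retry-thunk).

```scheme
;; Toy matcher protocol (an assumption of this sketch, not irregex's API):
;;   (matcher str i next fail)
;; On success the matcher calls (next j retry), where j is the end index and
;; retry is a thunk that resumes the search at the next untried alternative.
;; On failure it calls (fail).

;; Ordered alternation over literal strings, like (or "a" "ab").
(define (match-or alts)
  (lambda (str i next fail)
    (let try ((alts alts))
      (if (null? alts)
          (fail)
          (let* ((s (car alts))
                 (j (+ i (string-length s))))
            (if (and (<= j (string-length str))
                     (string=? s (substring str i j)))
                ;; First alternative that fits wins; the rest stay reachable
                ;; only through the retry thunk.
                (next j (lambda () (try (cdr alts))))
                (try (cdr alts))))))))

;; Greedy bounded repetition, like (** lo hi body).
(define (match-repeat lo hi body)
  (lambda (str i next fail)
    (let rep ((n 0) (i i) (fail fail))
      (if (= n hi)
          (next i fail)
          (body str i
                (lambda (j retry) (rep (+ n 1) j retry))
                ;; Body cannot match again: accept if enough repetitions
                ;; were seen, otherwise backtrack.
                (lambda () (if (>= n lo) (next i fail) (fail))))))))

;; What the matcher reports on its own: the first path that succeeds,
;; returned as (end-index . retry-thunk), or #f.
(define (first-match matcher str)
  (matcher str 0 cons (lambda () #f)))

;; The outer driver, playing the role of the 'else' branch above:
;; keep calling the retry thunk until the match reaches the end of str.
(define (match-full matcher str)
  (let lp ((m (first-match matcher str)))
    (and m
         (if (= (car m) (string-length str))
             (car m)
             (lp ((cdr m)))))))

(define re (match-repeat 3 4 (match-or '("a" "ab"))))
;; (car (first-match re "abababab")) => 5   ; "ab" "ab" "a", not a full match
;; (match-full re "abababab")        => 8   ; reached after two retries
```

With this toy matcher the driver also needs exactly two retries on "abababab", taking the same path as the trace above: the first result ends at index 5, the second at 7, and only the third reaches 8.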