What are you working on?

158 2020-07-20 02:50

As pointed out at the end of >>141, irregex-fold/chunked/fast >>135 has the bugs of irregex-fold/fast and a few more:
- Because of its placement kons doesn't see every match.
- Because of its placement irregex-reset-matches! is not called on every round.
- Because of its placement the ~consumer? test can miss rounds.
- The proper order on the match branch is: kons -> reset-matches! -> ~consumer? -> empty -> else, as used in irregex-fold/fast >>135.
- The empty match test is bogus. It compares the match end index with the search start index instead of the match start index.
- The 'next' chunk is not checked for existence, so the start index of a non-existent chunk may be requested from the chunker. Depending on the chunker implementation this may abort.
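
To make the empty match point concrete, here is a minimal sketch in plain Scheme that does not depend on irregex. The procedure names and the index values are made up for illustration and are not part of the library; the scenario assumed is a search that starts at index 0 and finds an empty match at index 2, e.g. eol on the string "ab".

```scheme
;; Hypothetical helpers for illustration, not part of irregex.

;; Correct test: a match is empty when it ends where it starts.
(define (empty-match? match-start match-end)
  (= match-start match-end))

;; Bogus test: compares the match end with the index where the search began.
(define (bogus-empty-match? search-start match-end)
  (= search-start match-end))

;; Assumed scenario: the search began at index 0 and found an
;; empty match at index 2.
(define search-start 0)
(define match-start 2)
(define match-end 2)

(display (empty-match? match-start match-end))        ; prints #t
(newline)
(display (bogus-empty-match? search-start match-end)) ; prints #f
(newline)
```

The two tests only agree when the empty match sits exactly at the search start, so the bogus one misclassifies any empty match found further along.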

The last point can be demonstrated on the rope chunker from the documentation, which is not part of the library proper but is present in the test suite.
http://synthcode.com/scheme/irregex/#SECTION_3.4
https://github.com/ashinn/irregex/blob/ac27338c5b490d19624c30d787c78bbfa45e1f11/test-irregex.scm#L122

$ cat attack.scm 
(define (rope . args)
  (map (lambda (x) (if (pair? x) x (list x 0 (string-length x)))) args))

(define rope-chunker
  (make-irregex-chunker
   (lambda (x) (and (pair? (cdr x)) (cdr x)))
   caar
   cadar
   caddar
   (lambda (src1 i src2 j)
     (if (eq? src1 src2)
         (substring (caar src1) i j)
         (let lp ((src (cdr src1))
                  (res (list (substring (caar src1) i (caddar src1)))))
           (if (eq? src src2)
               (string-join
                (reverse (cons (substring (caar src2) (cadar src2) j) res))
                "")
               (lp (cdr src)
                   (cons (substring (caar src) (cadar src) (caddar src))
                         res))))))))

$ guile --no-auto-compile -l irregex.scm -l attack.scm 
GNU Guile 2.2.3
[...]
scheme@(guile-user)> (irregex-fold/chunked '(or "a\n" bol) (lambda (start i m acc) acc) '() rope-chunker (rope "a\n"))
ERROR: In procedure cadar:
In procedure cadar: Wrong type (expecting pair): #f

Entering a new prompt.  Type `,bt' for a backtrace or `,q' to continue.
scheme@(guile-user) [1]> 

So this is another attack vector for irregex. Evidently the 'next' chunk must be checked for existence.
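
A minimal sketch of such a check, assuming the rope representation from attack.scm above. It is written against the rope directly rather than against irregex internals, and next-chunk-start and rope-get-next are hypothetical names, not library procedures.

```scheme
;; Rope representation as in attack.scm: a list of (string start end) chunks.
(define (rope . args)
  (map (lambda (x) (if (pair? x) x (list x 0 (string-length x)))) args))

;; Same get-next as in the rope chunker: #f when there is no next chunk.
(define (rope-get-next src)
  (and (pair? (cdr src)) (cdr src)))

;; Hypothetical guarded helper: only request the start index
;; when a next chunk actually exists, otherwise return #f.
(define (next-chunk-start src)
  (let ((next (rope-get-next src)))
    (and next (cadar next))))

(display (next-chunk-start (rope "a\n")))     ; prints #f
(newline)
(display (next-chunk-start (rope "a\n" "b"))) ; prints 0
(newline)
```

Without the guard, the first call would apply cadar to #f, which is the error shown in the transcript.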
