4.1.2 Scanning

Function: create-scanner (re-string string) &key case-insensitive-mode multi-line-mode single-line-mode extended-mode destructive

The function accepts most of the regex syntax of Perl 5.8 as described in ‘man perlre’, including extended features like non-greedy repetitions, positive and negative look-ahead and look-behind assertions, "standalone" subexpressions, and conditional subpatterns.

The following Perl features are currently supported:

  • ‘\t’, ‘\n’, ‘\r’, ‘\f’, ‘\a’, ‘\e’,
  • ‘\033’ (octal character codes),
  • ‘\x1B’ (hexadecimal character codes),
  • ‘\c[’ (control characters),
  • ‘\w’, ‘\W’, ‘\s’, ‘\S’, ‘\d’, ‘\D’, ‘\b’, ‘\B’, ‘\A’, ‘\Z’, and ‘\z’,
  • ‘\Q’ and ‘\E’, and
  • ‘\p’ and ‘\P’ (named properties), but only the long form with braces is supported, i.e. ‘\p{Letter}’ and ‘\p{L}’ will work while ‘\pL’ won’t.
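
One practical point when writing these escapes in Lisp source: the backslash is also the escape character inside Lisp string literals, so every regex backslash has to be doubled. The following is a minimal sketch, assuming CL-PPCRE is already loaded and reachable through its ‘ppcre’ package nickname; ‘first-number-bounds’ is a hypothetical helper used only for illustration.

```lisp
;; Assumes CL-PPCRE is already loaded, e.g. via (ql:quickload :cl-ppcre).
;; FIRST-NUMBER-BOUNDS is a hypothetical helper, not part of the library.

(defun first-number-bounds (text)
  "Return a list (START END) for the first run of digits in TEXT, or NIL."
  ;; Perl's \d+ is written "\\d+" because the Lisp reader consumes one
  ;; backslash of each pair when it reads the string literal.
  (multiple-value-bind (start end) (ppcre:scan "\\d+" text)
    (when start
      (list start end))))

(first-number-bounds "abc123def")   ; → (3 6)
(first-number-bounds "no digits")   ; → NIL
```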

The following Perl features are (currently) not supported:

  • ‘(?{ code })’ and ‘(??{ code })’, because they obviously don’t make sense in Lisp,
  • ‘\N{name}’ (named characters),
  • ‘\x{263a}’ (wide hex characters),
  • ‘\l’, ‘\u’, ‘\L’, and ‘\U’, because they’re actually not part of Perl’s regex syntax,
  • ‘\X’ (extended Unicode),
  • ‘\C’ (single character),
  • POSIX character classes like ‘[[:alpha:]]’ (use Unicode properties instead), and
  • ‘\G’ for Perl’s pos(), because we don’t have it.

return values

scanner, register-names


Accepts a string which is a regular expression in Perl syntax and returns a closure which will scan strings for this regular expression.


The second value, register-names, is only returned if ‘*ALLOW-NAMED-REGISTERS*’ is true.


It is a list of strings that maps registers to their respective names: the first element stands for the first register, the second element for the second register, and so on. You have to store this value if you want to map a register number to its name later, as the scanner doesn’t capture any information about register names. If a register isn’t named, it has ‘NIL’ as its name.


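The four ‘mode’ keyword arguments (case-insensitive-mode, multi-line-mode, single-line-mode, and extended-mode) are equivalent to the ‘i’, ‘m’, ‘s’, and ‘x’ modifiers in Perl, respectively. The ‘destructive’ keyword will be ignored.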
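
As an illustration of the keyword arguments and of the second return value, here is a minimal sketch. It assumes CL-PPCRE is already loaded and uses its ‘ppcre’ package nickname; ‘*hello-scanner*’ and ‘date-register-names’ are hypothetical names chosen for the example.

```lisp
;; Assumes CL-PPCRE is already loaded, e.g. via (ql:quickload :cl-ppcre).
;; *HELLO-SCANNER* and DATE-REGISTER-NAMES are hypothetical example names.

;; A scanner built once can be reused for many target strings.
;; :case-insensitive-mode plays the role of Perl's /i modifier.
(defparameter *hello-scanner*
  (ppcre:create-scanner "hello" :case-insensitive-mode t))

(ppcre:scan *hello-scanner* "Say HELLO")   ; first two values: 4 and 9

(defun date-register-names ()
  "Return the register-names list for a regex with one named register."
  ;; The variable has to be true while the scanner is created,
  ;; otherwise no second value is returned.
  (let ((ppcre:*allow-named-registers* t))
    (nth-value 1 (ppcre:create-scanner "(?<year>\\d{4})-(\\d{2})"))))

(date-register-names)   ; → ("year" NIL), the second register is unnamed
```

The list has to be kept by the caller, since the scanner closure itself does not record register names.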

Parse Tree

Function: create-scanner (parse-tree t) &key case-insensitive-mode multi-line-mode single-line-mode extended-mode destructive

This is similar to CREATE-SCANNER for regex strings above but accepts a parse tree as its first argument. A parse tree is an S-expression conforming to the following syntax:…

return values

scanner, register-names
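
One way to get a valid parse tree without writing it by hand is to let the library parse a regex string. The sketch below assumes CL-PPCRE is already loaded and relies on its ‘parse-string’ function, which is not described in this section; ‘scan-with-parse-tree’ is a hypothetical helper.

```lisp
;; Assumes CL-PPCRE is already loaded, e.g. via (ql:quickload :cl-ppcre).
;; SCAN-WITH-PARSE-TREE is a hypothetical helper for illustration.

(defun scan-with-parse-tree (regex-string target)
  "Convert REGEX-STRING to a parse tree, build a scanner from it, and scan TARGET."
  (let* ((tree (ppcre:parse-string regex-string)) ; an S-expression
         (scanner (ppcre:create-scanner tree)))   ; the parse-tree method
    (ppcre:scan scanner target)))

(ppcre:parse-string "(ab)*")
;; → (:GREEDY-REPETITION 0 NIL (:REGISTER "ab"))

(scan-with-parse-tree "(ab)*" "ababc")   ; first two values: 0 and 4
```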


Function: scan regex target-string &key start end

Searches the string target-string from start (which defaults to 0) to end (which defaults to the length of target-string) and tries to match regex. On success returns four values:

  • the start of the match,
  • the end of the match,
  • an array denoting the beginnings of register matches, and
  • an array denoting the endings of register matches.

On failure returns ‘NIL’.

return values

match-start, match-end, reg-starts, reg-ends
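
The four values are typically picked up with ‘multiple-value-bind’. A minimal sketch, assuming CL-PPCRE is already loaded; ‘describe-match’ is a hypothetical wrapper that only collects the values into a list so they are easy to inspect.

```lisp
;; Assumes CL-PPCRE is already loaded, e.g. via (ql:quickload :cl-ppcre).
;; DESCRIBE-MATCH is a hypothetical wrapper, not part of the library.

(defun describe-match (regex target &key (start 0) (end (length target)))
  "Return the four values of SCAN as a list, or NIL if REGEX does not match."
  (multiple-value-bind (match-start match-end reg-starts reg-ends)
      (ppcre:scan regex target :start start :end end)
    (when match-start
      (list match-start match-end reg-starts reg-ends))))

(describe-match "(a)*b" "xaaabd")            ; → (1 5 #(3) #(4))
(describe-match "(a)*b" "xaaabd" :start 2)   ; → (2 5 #(3) #(4))
(describe-match "(a)*b" "xaaad")             ; → NIL
```

In the first call the register arrays hold the position of the last "a" captured by the repeated group, which is why they read 3 and 4 rather than 1 and 4.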


Function: scan-to-strings regex target-string &key start end sharedp

return values

match, regs

Like SCAN but returns substrings of target-string instead of positions, i.e. this function returns two values on success:

  • the whole match as a string, and
  • an array of substrings (or ‘NIL’s) corresponding to the matched registers. If sharedp is true, the substrings may share structure with target-string.
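
A short sketch of both return values, assuming CL-PPCRE is already loaded; ‘match-and-registers’ is a hypothetical helper that turns the register array into a list for readability.

```lisp
;; Assumes CL-PPCRE is already loaded, e.g. via (ql:quickload :cl-ppcre).
;; MATCH-AND-REGISTERS is a hypothetical helper, not part of the library.

(defun match-and-registers (regex target)
  "Return (MATCH REGISTER-LIST) for the first match of REGEX in TARGET, or NIL."
  (multiple-value-bind (match regs) (ppcre:scan-to-strings regex target)
    (when match
      (list match (coerce regs 'list)))))

(match-and-registers "(([^b])*)b" "aaabd")   ; → ("aaab" ("aaa" "a"))
;; A register that took no part in the match shows up as NIL.
(match-and-registers "(a)|(b)" "b")          ; → ("b" (NIL "b"))
```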

