Regular expressions have a syntax in which a few characters are “special constructs” and the rest are “ordinary”.
The “special characters” are ‘$’, ‘^’, ‘.’, ‘*’, ‘+’, ‘?’, ‘[’ and ‘\’.
Any other character appearing in a regular expression is ordinary, unless a ‘\’ precedes it.
Things to note:
For historical compatibility reasons, ‘^’ can be used only at the beginning of the regular expression, or after ‘\(’, ‘\(?:’ or ‘\|’.
For historical compatibility reasons, ‘$’ can be used only at the end of the regular expression, or before ‘\)’ or ‘\|’.
‘\’ also has special meaning in the read syntax of Lisp strings and must be quoted with ‘\’.
\\ => \
\\\\ => \\
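The notes above on backslash quoting and on where ‘^’ and ‘$’ act as anchors can be sketched in Emacs Lisp as follows. This is only an illustration, not part of the original text; the name ‘my-backslash-regexp’ is hypothetical and the sample strings are made up.

```elisp
;; Hypothetical constant: a regexp matching one literal backslash.
;; The regexp itself is the two characters \\ ; in Lisp read syntax
;; each of those backslashes must be doubled again, giving four.
(defconst my-backslash-regexp "\\\\")

(length "\\")                 ; ⇒ 1  (string holding one backslash)
(length my-backslash-regexp)  ; ⇒ 2  (string holding two backslashes)

;; "a\\b" is the three-character string a\b; the backslash is at index 1.
(string-match my-backslash-regexp "a\\b")   ; ⇒ 1

;; ‘^’ right after ‘\(?:’ and ‘$’ right before ‘\)’ still act as anchors.
(string-match "\\(?:^foo\\|bar$\\)" "xbar") ; ⇒ 1

;; In the middle of a regexp, ‘^’ is treated as an ordinary character.
(string-match "a^b" "a^b")                  ; ⇒ 0
```

Each form can be evaluated with ‘C-x C-e’ or ‘M-:’ to check the result shown in the comment.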
‘[ ... ]’ is a character alternative.
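[ad]
Matches either ‘a’ or ‘d’.
[a-z]
Matches any lower-case ASCII letter.
[a-z$%.]
Matches any lower-case ASCII letter, ‘$’, ‘%’ or a period. The usual special characters are not special inside a character alternative.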
[]a-z]
To include a ‘]’ in a character alternative, you must make it the first character.
[]a-z-]
To include a ‘-’, write ‘-’ as the first or last character of the character alternative, or as the upper bound of a range.
^
To include ‘^’ in a character alternative, put it anywhere but at the beginning.
[^…]
‘[^’ begins a “complemented character alternative”. This matches any character except the ones specified. ‘^’ is not special in a character alternative unless it is the first character. A complemented character alternative can match a newline, unless newline is mentioned as one of the characters not to match. This is in contrast to the handling of regexps in programs such as ‘grep’.
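The placement rules for ‘]’, ‘-’ and ‘^’, and the newline behavior of complemented alternatives, can be checked with a few ‘string-match’ calls. This is a minimal sketch with made-up sample strings; the constant names beginning with ‘my-’ are hypothetical and exist only for illustration.

```elisp
;; Hypothetical names for the character alternatives discussed above.
(defconst my-bracket-or-lower "[]a-z]")        ; ‘]’ placed first
(defconst my-bracket-lower-or-dash "[]a-z-]")  ; ‘-’ placed last
(defconst my-a-or-caret "[a^]")                ; ‘^’ not placed first
(defconst my-not-lower "[^a-z]")               ; complemented alternative

(string-match my-bracket-or-lower "]")       ; ⇒ 0
(string-match my-bracket-lower-or-dash "-")  ; ⇒ 0
(string-match my-a-or-caret "^")             ; ⇒ 0
(string-match "[a-z$%.]" "$")                ; ⇒ 0, ‘$’ is literal here

;; A complemented alternative matches newline: index 3 is the "\n".
(string-match my-not-lower "abc\ndef")       ; ⇒ 3

;; Listing newline among the excluded characters prevents that.
(string-match "[^a-z\n]" "abc\ndef")         ; ⇒ nil
```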
The exact rules are that:
The following aspects of ranges are specific to Emacs, in that POSIX allows but does not require this behavior and programs other than Emacs may behave differently:
If ‘case-fold-search’ is non-‘nil’, ‘[a-z]’ also matches upper-case letters.
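The effect of ‘case-fold-search’ on a range can be seen by binding the variable around a match. The helper below is a hypothetical sketch written for this illustration; it is not a function provided by Emacs.

```elisp
(defun my-lower-range-matches-p (string fold)
  "Return non-nil if the regexp [a-z] matches somewhere in STRING.
FOLD is the value bound to `case-fold-search' during the match.
Hypothetical helper, for illustration only."
  (let ((case-fold-search fold))
    (string-match-p "[a-z]" string)))

(my-lower-range-matches-p "ABC" t)    ; ⇒ 0, the range also matches ‘A’
(my-lower-range-matches-p "ABC" nil)  ; ⇒ nil, no lower-case letter
```

Binding the variable with ‘let’ keeps the change local, so the surrounding code and the user's own setting are left untouched.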