Character Escapes |
\t |
tab character |
\r |
carriage-return character |
\n |
line-feed (or newline) character |
\x20 |
ASCII character expressed in hex notation (\x20 is a space) |
\ |
the escape character (\? matches question mark, \. matches dot, and \\ matches backslash) |
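A minimal sketch of the escapes above, assuming Java's java.util.regex as the engine (the table itself does not name one). EscapeDemo and its matches helper are hypothetical names used only for illustration. Each regex backslash has to be doubled inside a Java string literal.

```java
import java.util.regex.Pattern;

final class EscapeDemo {
    // True when the whole input matches the regex (hypothetical helper).
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(matches("a\\tb", "a\tb"));         // true: \t matches a tab
        System.out.println(matches("a\\x20b", "a b"));        // true: \x20 matches a space
        System.out.println(matches("what\\?", "what?"));      // true: \? matches a literal ?
        System.out.println(matches("a\\.b", "a.b"));          // true: \. matches a literal dot
        System.out.println(matches("a\\.b", "axb"));          // false: an escaped dot is no wildcard
        System.out.println(matches("C:\\\\tmp", "C:\\tmp"));  // true: \\ matches one backslash
    }
}
```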
Character Classes |
. |
Any character except the newline character (unless you use the Singleline option, called DOTALL in Java, which makes the dot match newlines too) |
[aeiou] |
Any one character listed between the square brackets; [aeiou] matches a lowercase vowel, and [0-9] matches a digit |
[^0-9] |
Any one character except those listed between the square brackets; [^0-9] matches any nondigit |
[A-Z&&[^P]] (* not .NET) |
Any character from A to Z except P |
\w |
A word character—same as [A-Za-z_0-9] |
\d |
A digit—same as [0-9] |
\s |
A white-space character (a space, tab, form-feed, newline, or carriage return) |
\W, \D, \S |
The uppercase version negates the lowercase symbol; \W matches any nonword character, \D matches any nondigit, and \S matches any non-white-space character |
\Q to \E (* not .NET) |
Quotes everything between \Q and \E, so the enclosed characters are matched as literals |
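The character classes above can be tried out with a short sketch, assuming Java's java.util.regex (which accepts the && and \Q..\E forms that the table marks as unavailable in .NET). CharClassDemo and its helpers are hypothetical names.

```java
import java.util.regex.Pattern;

final class CharClassDemo {
    // True when the whole input matches the regex (hypothetical helper).
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    // Same check, but with DOTALL so that "." also matches line terminators.
    static boolean matchesDotAll(String regex, String input) {
        return Pattern.compile(regex, Pattern.DOTALL).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("[aeiou]", "e"));          // true
        System.out.println(matches("[^0-9]", "7"));           // false
        System.out.println(matches("[A-Z&&[^P]]", "Q"));      // true
        System.out.println(matches("[A-Z&&[^P]]", "P"));      // false
        System.out.println(matches("\\w+", "snake_case_1"));  // true
        System.out.println(matches("\\S+", "a b"));           // false: the space is white space
        System.out.println(matches("\\Q1+1=2\\E", "1+1=2"));  // true: + is quoted as a literal
        System.out.println(matches("1+1=2", "1+1=2"));        // false: + acts as a quantifier
        System.out.println(matches("a.b", "a\nb"));           // false: dot skips the newline
        System.out.println(matchesDotAll("a.b", "a\nb"));     // true
    }
}
```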
Atomic zero-width assertions (the condition must hold at the current position in the source string, but no characters are consumed) |
^ |
The beginning of the string, or the beginning of the line if in multiline mode |
$ |
The end of the string, or the end of the line if in multiline mode |
\b |
The word boundary—the position between a \w and a \W character |
\B |
Not a word boundary |
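A rough illustration of the anchors and word boundaries above, again assuming Java's java.util.regex. AnchorDemo and findAll are hypothetical names, and the sample strings are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AnchorDemo {
    // Collects every match of regex in input, compiled with the given flags.
    static List<String> findAll(String regex, int flags, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "one two\nthree four";
        // Without multiline mode, ^ matches only at the start of the string.
        System.out.println(findAll("^\\w+", 0, text));                  // [one]
        // In multiline mode, ^ and $ also match at each line start and end.
        System.out.println(findAll("^\\w+", Pattern.MULTILINE, text));  // [one, three]
        System.out.println(findAll("\\w+$", Pattern.MULTILINE, text));  // [two, four]
        // \b requires a word boundary; \B requires that there is none.
        System.out.println("cat concat cats".replaceAll("\\bcat\\b", "CAT")); // CAT concat cats
        System.out.println("cat concat cats".replaceAll("\\Bcat", "CAT"));    // cat conCAT cats
    }
}
```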
Quantifiers |
* |
Zero or more matches—same as {0,} |
+ |
One or more matches—same as {1,} |
? |
Zero or one match, same as {0,1}; when appended to another quantifier (as in *?, +?, or {N,M}?), it makes that quantifier lazy, so it matches as few characters as possible |
{N} |
Exactly N matches |
{N,} |
N or more matches |
{N,M} |
Between N and M matches |
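The difference between greedy and lazy quantifiers is easiest to see side by side. The sketch below assumes Java's java.util.regex; QuantifierDemo and firstMatch are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuantifierDemo {
    // Returns the first match of regex in input, or null when there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Greedy + takes as much as it can; lazy +? stops at the first ">".
        System.out.println(firstMatch("<.+>", "<b>bold</b>"));    // <b>bold</b>
        System.out.println(firstMatch("<.+?>", "<b>bold</b>"));   // <b>
        // ? makes the preceding item optional.
        System.out.println(Pattern.matches("colou?r", "color"));  // true
        System.out.println(Pattern.matches("colou?r", "colour")); // true
        // Counted repetition.
        System.out.println(Pattern.matches("\\d{3}", "1234"));    // false: exactly three digits
        System.out.println(Pattern.matches("\\d{2,}", "1"));      // false: at least two digits
        System.out.println(firstMatch("\\d{2,4}", "1234567"));    // 1234
        System.out.println(firstMatch("\\d{2,4}?", "1234567"));   // 12
    }
}
```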
Grouping and alternating constructs |
(expr) |
Captures the matched expression and assigns it a number depending on its position (group 0 always refers to the entire match) |
(expr)+ |
Matches one or more iterations of the expression, like (\.\w+)+ for the dotted parts of a domain name; the group's value is the last iteration matched |
(?<name>expr) |
Captures the matched expression and assigns it a name |
(expr1|expr2) |
Matches any of the expressions enclosed in parentheses |
(?=expr) |
Zero-width positive look-ahead assertion: continues the matching only if the expression matches at this position, but doesn't include the expression in the match. For example, \w+(?=,) matches a word followed by a comma without also matching the comma, and (?=Patrick)Pat matches Pat, but only if it is part of the word Patrick (use ?<= for look-behind) |
(?!expr) |
Zero-width negative look-ahead assertion: continues the matching only if the expression doesn't match at this position. For example, \w+\b(?!\s) matches a word that isn't followed by a white-space character; without the \b, the engine backtracks and matches a shorter piece of the word instead (use ?<! for negative look-behind) |
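A sketch of the grouping and look-around constructs, assuming Java's java.util.regex. GroupDemo, firstMatch, namedGroup, and the sample inputs are hypothetical. The last two calls show why a negative look-ahead after \w+ usually needs a \b: without it the engine backtracks to a shorter match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupDemo {
    // Returns the first match of regex in input, or null when there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    // Returns what the named group captured in the first match, or null.
    static String namedGroup(String regex, String name, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(name) : null;
    }

    public static void main(String[] args) {
        String email = "(?<user>\\w+)@(?<host>\\w+(\\.\\w+)+)";
        Matcher m = Pattern.compile(email).matcher("alice@mail.example.com");
        if (m.find()) {
            System.out.println(m.group(0));       // alice@mail.example.com
            System.out.println(m.group("user"));  // alice
            System.out.println(m.group("host"));  // mail.example.com
            System.out.println(m.group(3));       // .com (last iteration of the repeated group)
        }
        System.out.println(firstMatch("(cat|dog)s", "hotdogs"));           // dogs
        System.out.println(firstMatch("\\w+(?=,)", "red, green"));         // red
        System.out.println("Pat Patrick".replaceFirst("(?=Patrick)Pat", "PAT")); // Pat PATrick
        System.out.println(firstMatch("(?<=\\$)\\d+", "cost: $42"));       // 42
        System.out.println(firstMatch("(?<!\\$)\\b\\d+", "$42 and 17"));   // 17
        System.out.println(firstMatch("\\w+(?!\\s)", "foo bar"));          // fo (backtracked)
        System.out.println(firstMatch("\\w+\\b(?!\\s)", "foo bar"));       // bar
    }
}
```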
Back-reference and substitution constructs |
\N |
Back-reference to group number N |
\k<name> |
Back-reference to a previous named group |
$N |
In a replacement string, substitutes the last substring matched by group number N |
${name} |
In a replacement string, substitutes the last substring matched by the group with the given name |
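Back-references appear inside the pattern, while the $ forms appear in the replacement string. The sketch below shows both, assuming Java's java.util.regex and String.replaceAll; BackrefDemo, firstMatch, and the sample strings are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BackrefDemo {
    // Returns the first match of regex in input, or null when there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // \1 must repeat exactly what group 1 captured: finds a doubled word.
        String doubled = "\\b(\\w+)\\s+\\1\\b";
        System.out.println(firstMatch(doubled, "this is is a test"));    // is is
        // The same idea with a named group and \k<name>.
        String named = "\\b(?<word>\\w+)\\s+\\k<word>\\b";
        System.out.println(firstMatch(named, "say it it again"));        // it it
        // $1 in the replacement collapses the doubled word.
        System.out.println("this is is a test".replaceAll(doubled, "$1")); // this is a test
        // $N and ${name} reorder the captured pieces.
        System.out.println("2024-01-31".replaceAll(
                "(\\d{4})-(\\d{2})-(\\d{2})", "$3/$2/$1"));               // 31/01/2024
        System.out.println("2024-01-31".replaceAll(
                "(?<year>\\d{4})-(?<month>\\d{2})-(?<day>\\d{2})",
                "${day}/${month}/${year}"));                              // 31/01/2024
    }
}
```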
Special Options |
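(?i) |
Ignore case |
(?m) |
Multiline mode: ^ and $ also match at the beginning and end of each line |
(?d) (* not .NET) |
UNIX line terminators: only \n is treated as a line terminator |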
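These options are written inline in parentheses at the start of the pattern, for example (?i). A minimal sketch, assuming Java's java.util.regex, where the d option corresponds to the UNIX_LINES flag; OptionDemo and matches are hypothetical names.

```java
import java.util.regex.Pattern;

final class OptionDemo {
    // True when the whole input matches the regex (hypothetical helper).
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // (?i) turns on case-insensitive matching.
        System.out.println(matches("hello", "HeLLo"));        // false
        System.out.println(matches("(?i)hello", "HeLLo"));    // true
        // (?m) lets ^ match at the start of every line, not just the string.
        System.out.println("a\nb".replaceAll("^", "> ").equals("> a\nb"));        // true
        System.out.println("a\nb".replaceAll("(?m)^", "> ").equals("> a\n> b"));  // true
        // (?d) makes \n the only line terminator, so "." may match \r.
        System.out.println(matches("a.b", "a\rb"));           // false
        System.out.println(matches("(?d)a.b", "a\rb"));       // true
        System.out.println(matches("(?d)a.b", "a\nb"));       // false
    }
}
```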