Literals
Any character that is not a meta-character of the regex syntax matches itself literally. This holds for every character: even a space is significant, unless you enable the free-spacing mode modifier. To match a meta-character literally, precede it with the backslash (\) meta-character.
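As a small illustration of the rules above, here is a sketch using Python's re module. Python is only one possible flavor, chosen for the example, and the sample strings are made up. It shows a significant space, the effect of free-spacing mode (spelled re.VERBOSE in Python), and a backslash-escaped meta-character.

```python
import re

# A space in the pattern is a literal: it must appear in the subject.
space_needed = re.search(r"foo bar", "foobar") is None
space_found = re.search(r"foo bar", "foo bar") is not None

# In free-spacing mode, unescaped whitespace in the pattern is ignored,
# so the same pattern now behaves like "foobar".
verbose_matches_joined = re.search(r"foo bar", "foobar", re.VERBOSE) is not None
verbose_misses_spaced = re.search(r"foo bar", "foo bar", re.VERBOSE) is None

# An escaped space stays significant even in free-spacing mode.
escaped_space = re.search(r"foo\ bar", "foo bar", re.VERBOSE) is not None

# "." is a meta-character (any character); "\." is a literal dot.
dot_is_wildcard = re.search(r"3.14", "3x14") is not None
escaped_dot_is_literal = re.search(r"3\.14", "3x14") is None
```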
For a regex stored in a string, you must also take care of the meta-characters specific to string literals in the host language. To match a single backslash, the regex itself needs two backslashes (/\\/).[1]
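The two layers of escaping can be sketched in Python, used here only as an example host language. The subject string and variable names are invented for the illustration. Python's raw string literals (r"...") switch off the host-language layer, and re.escape() applies the regex-level layer for you.

```python
import re

subject = "C:\\temp"   # the 7-character string C:\temp

# Regular string literal: four backslashes in the source become two in
# the pattern, which the regex engine reads as one literal backslash.
pattern_plain = "\\\\"

# Raw string literal: the host language leaves backslashes alone, so two
# backslashes in the source are exactly the two the regex needs.
pattern_raw = r"\\"

same_pattern = pattern_plain == pattern_raw
position = re.search(pattern_raw, subject).start()

# re.escape() backslash-escapes regex meta-characters in arbitrary text.
escaped = re.escape("a.b\\c")          # the text a.b\c
round_trip = re.fullmatch(escaped, "a.b\\c") is not None
```

Many other host languages offer a comparable raw or verbatim string syntax; where none exists, the doubled escaping is unavoidable.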
Meta-characters used to match special characters
For convenience, most implementations also define character shorthands for useful control characters:
Shorthand | Mnemonic | ASCII code | Matches |
---|---|---|---|
\a | Alert | 0x07 | Ctrl-G, the BEL control character |
\b | Backspace | 0x08 | Ctrl-H, the BS control character. The meaning of this shorthand is usually restricted to character classes. Outside character classes, \b matches a word boundary |
\e | Escape | 0x1B | Ctrl-[, the ESC control character |
\f | Form feed | 0x0C | the FF control character |
\n | Newline | 0x0A | usually the LF control character, except on classic Mac OS systems, where it matches ASCII 0x0D |
\r | Carriage return | 0x0D | usually the CR control character, except on classic Mac OS systems, where it matches ASCII 0x0A |
\t | Tabulation | 0x09 | the HT control character |
\v | Vertical tabulation | 0x0B | the VT control character |
\nnn | Octal code | | any character represented as an octal value, e.g. \012 and \n are equivalent. The number of digits following \ is usually fixed to 3. |
\xnn or \x{nnnn} | Hexadecimal code | | any character represented as a hexadecimal value, e.g. \x0A and \n are equivalent. The number of digits following \x is usually fixed to 2 or 4. |
\unnnn | Unicode character | | any Unicode character represented as a hexadecimal value, e.g. \u000A and \n are equivalent. The number of digits following \u is usually fixed to 4 (or 8). |
\cchar | Control character | | any control character represented by its corresponding control letter, e.g. \cJ and \n are equivalent. |
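A few of the equivalences in the table can be checked with a short Python sketch. This assumes Python's re module as the flavor; the variable names and sample text are illustrative only. Note that Python's re does not provide \e or \cchar, so the escape character is written with its hexadecimal code instead.

```python
import re

# Four spellings of the line feed character (0x0A), all as regex escapes.
newline_spellings = [r"\n", r"\012", r"\x0A", r"\u000A"]
all_match_lf = all(re.fullmatch(p, "\n") for p in newline_spellings)

# Inside a character class, \b is the backspace character (0x08)...
backspace_in_class = re.fullmatch(r"[\b]", "\x08") is not None
# ...outside a class it is a zero-width word boundary.
word_hits = re.findall(r"\bcat\b", "cat concat cat.")

# No \e in this flavor: use the hexadecimal code for ESC (0x1B).
esc_match = re.fullmatch(r"\x1B", "\x1b") is not None

# \v is the VT character (0x0B) and \r is the CR character (0x0D).
vt_match = re.fullmatch(r"\v", "\x0b") is not None
cr_match = re.fullmatch(r"\r", "\x0d") is not None
```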
- [1] If the regex is written as a string literal in a host language that uses the backslash as a meta-character, you need four backslashes ("\\\\").