RegularExpression
Usage
• RegularExpression["regex"] represents the generalized regular expression specified by the string "regex".
Notes
• RegularExpression supports standard regular expression syntax, of the kind used in typical string manipulation languages.
• The following basic elements can be used in regular expression strings:
| c | the literal character c |
| . | any character except newline |
| [c1c2...] | any of the characters ci |
| [c1-c2] | any character in the range c1-c2 |
| [^c1c2...] | any character except the ci |
| p* | p repeated zero or more times |
| p+ | p repeated one or more times |
| p? | zero or one occurrence of p |
| p{m,n} | p repeated between m and n times |
| p*?, p+?, p?? | the shortest consistent strings that match |
| (p1p2...) | strings matching the sequence p1, p2, ... |
| p1|p2 | strings matching p1 or p2 |
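As an illustration of the basic elements above, the following minimal sketch contrasts greedy and shortest-match repetition and shows a character range. The variable names (greedy, lazy, digits) and the sample strings are made up for this example.

```mathematica
(* p+ is greedy: it takes the longest run it can *)
greedy = StringCases["<a><b>", RegularExpression["<.+>"]]
(* {"<a><b>"} *)

(* p+? gives the shortest consistent match instead *)
lazy = StringCases["<a><b>", RegularExpression["<.+?>"]]
(* {"<a>", "<b>"} *)

(* [c1-c2] matches any character in a range; + repeats it *)
digits = StringCases["a1b22c", RegularExpression["[0-9]+"]]
(* {"1", "22"} *)
```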
• The following represent classes of characters:
| \\d | digit 0-9 |
| \\D | non-digit |
| \\s | space, newline, tab or other whitespace character |
| \\S | non-whitespace character |
| \\w | word character (letter, digit or _) |
| \\W | non-word character |
| [[:class:]] | characters in a named class |
| [^[:class:]] | characters not in a named class |
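A short sketch of the character classes in use. Note that each backslash is doubled because the regular expression is written inside a Mathematica string. The variable names and sample strings here are illustrative only.

```mathematica
(* \\d+ picks out runs of digits *)
numbers = StringCases["tel: 555-0199", RegularExpression["\\d+"]]
(* {"555", "0199"} *)

(* \\s+ splits on any run of whitespace, including tabs *)
words = StringSplit["a  b\tc", RegularExpression["\\s+"]]
(* {"a", "b", "c"} *)

(* a named class inside [...] *)
letters = StringCases["abc123", RegularExpression["[[:alpha:]]+"]]
(* {"abc"} *)
```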
• The following named classes can be used: alnum, alpha, ascii, blank, cntrl, digit, graph, lower, print, punct, space, upper, word, xdigit.
• The following represent positions in strings:
| ^ | the beginning of the string (or line) |
| $ | the end of the string (or line) |
| \\b | word boundary |
| \\B | anywhere except a word boundary |
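The position elements match locations rather than characters. The sketch below, with made-up sample strings and variable names, shows word boundaries and string anchors.

```mathematica
(* \\b restricts the match to whole words; the "cat" inside "concat" is skipped *)
wholeWords = StringCases["cat concat cat", RegularExpression["\\bcat\\b"]]
(* {"cat", "cat"} *)

(* \\B requires that there is no word boundary, so only the "cat" in "concat" matches *)
inside = StringCases["cat concat", RegularExpression["\\Bcat"]]
(* {"cat"} *)

(* ^ and $ anchor to the start and end of the string *)
first = StringCases["one two", RegularExpression["^\\w+"]]
(* {"one"} *)
last = StringCases["one two", RegularExpression["\\w+$"]]
(* {"two"} *)
```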
• The following set options for all regular expression elements that follow them:
| (?i) | treat upper and lower case as equivalent (ignore case) |
| (?m) | make ^ and $ match start and end of lines (multiline mode) |
| (?s) | allow . to match newline |
| (?-c) | unset options |
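The options can be tried out as follows. This is a minimal sketch with illustrative strings and variable names; in the last case c in (?-c) is taken to be the option letter i.

```mathematica
(* (?i) ignores case for everything that follows it *)
anyCase = StringCases["Cat cat CAT", RegularExpression["(?i)cat"]]
(* {"Cat", "cat", "CAT"} *)

(* (?m) lets ^ match at the start of every line, not just the start of the string *)
lineStarts = StringCases["one\ntwo", RegularExpression["(?m)^\\w+"]]
(* {"one", "two"} *)

(* (?s) lets . match a newline *)
acrossLines = StringCases["a\nb", RegularExpression["(?s)a.b"]]
(* {"a\nb"} *)

(* (?-i) switches case-insensitivity off again partway through the pattern *)
mixed = StringCases["AB Ab aB", RegularExpression["(?i)a(?-i)b"]]
(* {"Ab"} *)
```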
• \\., \\[, etc. represent literal characters ., [, etc.
• Analogs of named Mathematica patterns such as x:expr can be set up in regular expression strings using (regex).
• Within a regular expression string, \\n represents the substring matched by the nth parenthesized regular expression object (regex).
• For the purpose of functions such as StringReplace and StringCases, any $n appearing in the right-hand side of a rule RegularExpression["regex"] -> rhs is taken to correspond to the substring matched by the nth parenthesized regular expression object in regex.
• New in Version 5.1.
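The following sketch illustrates escaped literals, backreferences within a pattern, and $n on the right-hand side of a rule. The strings and variable names are made up for this example; note that $n is written inside a string on the right-hand side.

```mathematica
(* \\. matches a literal period rather than any character *)
decimals = StringCases["3.14 and 314", RegularExpression["\\d\\.\\d+"]]
(* {"3.14"} *)

(* \\1 refers back to whatever the first parenthesized group matched *)
doubled = StringCases["hello wood", RegularExpression["(\\w)\\1"]]
(* {"ll", "oo"} *)

(* $1, $2, $3 on the right-hand side stand for the three groups *)
reordered = StringReplace["2024-05-17",
  RegularExpression["(\\d+)-(\\d+)-(\\d+)"] -> "$3/$2/$1"]
(* "17/05/2024" *)
```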