THIS IS DOCUMENTATION FOR AN OBSOLETE PRODUCT.
SEE THE DOCUMENTATION CENTER FOR THE LATEST INFORMATION.

2. General String Patterns

A general string pattern is built from pattern objects analogous to those used in ordinary Mathematica patterns. To join several string pattern objects, use the StringExpression operator ~~.

In[10]:= 

Out[10]//FullForm=
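
As a minimal sketch (not the original cell), the following shows how ~~ builds a StringExpression. The variable name pattern and the sample string are chosen here purely for illustration.

```mathematica
(* Join a literal string, a single-character wildcard, and a digit class *)
pattern = "ab" ~~ _ ~~ DigitCharacter;

FullForm[pattern]
(* StringExpression["ab", Blank[], DigitCharacter] *)

StringMatchQ["abx7", pattern]
(* True *)
```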

StringExpression is closely related to StringJoin, except that nonstring arguments are allowed and lists are not flattened. For pure strings, the two are equivalent.

In[11]:= 

Out[11]=
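
The difference can be sketched as follows, with illustrative inputs that are not taken from the original cell. StringJoin flattens a list argument into one string, whereas in a StringExpression the list is kept and acts as a set of alternatives.

```mathematica
(* For pure strings the two operations give the same result *)
joined = "ab" ~~ "cd"
(* "abcd" *)

joined === StringJoin["ab", "cd"]
(* True *)

(* StringJoin flattens the list into a single string *)
StringJoin[{"a", "b"}, "c"]
(* "abc" *)

(* In a string pattern the list means "a" or "b", followed by "c" *)
StringMatchQ["ac", {"a", "b"} ~~ "c"]
(* True *)
StringMatchQ["abc", {"a", "b"} ~~ "c"]
(* False *)
```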

The list of objects that can appear in a string pattern closely matches the list for ordinary Mathematica patterns. For the purposes of string patterns, a string is considered a sequence of characters; that is, "abc" can be thought of as something like String[a, b, c], to which the ordinary pattern constructs apply.

The following objects can appear in a symbolic string pattern:

The following represent classes of characters:

The following represent positions in strings:

The following determine which match will be used if there are several possibilities:

Some nontrivial issues regarding these objects follow.

The _, __, and ___ wildcards match any characters including newlines. To match any character except newline (analogous to the "." in regular expressions), use Except["\n"], Except["\n"].., and Except["\n"]... .

In[12]:= 

Out[12]=

In[13]:= 

Out[13]=

In[14]:= 

Out[14]=
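
A short sketch of the newline behavior, using a two-line sample string invented for this illustration:

```mathematica
text = "line one\nline two";

(* __ happily runs across the newline, so there is a single match *)
acrossLines = StringCases[text, "line" ~~ __]
(* {"line one\nline two"} *)

(* Except["\n"].. stops at the newline, giving one match per line *)
perLine = StringCases[text, "line" ~~ Except["\n"] ..]
(* {"line one", "line two"} *)
```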

A list of patterns, such as {"a","b","c"}, is equivalent to a list of alternatives, such as "a"|"b"|"c". This is convenient because functions like Characters and CharacterRange can be used to specify classes of characters.

In[15]:= 

Out[15]=
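
For instance, character classes built with CharacterRange and Characters might be used as in this sketch. The sample strings are illustrative only.

```mathematica
(* A list behaves like "a" | "b" | "c" *)
StringCases["a1b2c3d4", {"a", "b", "c"}]
(* {"a", "b", "c"} *)

(* CharacterRange["a", "c"] evaluates to {"a", "b", "c"} *)
pairs = StringCases["a1b2c3d4", CharacterRange["a", "c"] ~~ DigitCharacter]
(* {"a1", "b2", "c3"} *)

(* Characters["aeiou"] evaluates to the list of vowels *)
masked = StringReplace["hello world", Characters["aeiou"] -> "*"]
(* "h*ll* w*rld" *)
```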

When Condition (/;) is used, any named patterns in the condition are bound to strings as far as the rest of Mathematica is concerned, so in some cases you need ToExpression, for example to convert a matched run of digits to a number before comparing it.

In[16]:= 

Out[16]=
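
A minimal sketch of this, with an invented input string: x is bound to a string such as "12", so it is converted with ToExpression before the numeric comparison.

```mathematica
(* Keep only the runs of digits whose numeric value exceeds 10 *)
numbers = StringCases["7 12 345 8",
  (x : DigitCharacter ..) /; ToExpression[x] > 10]
(* {"12", "345"} *)
```

Without ToExpression the test would compare a string with an integer and would not evaluate to True.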

As with ordinary Mathematica patterns, the function in PatternTest (?) is applied to each individual character.

In[17]:= 

Out[17]=
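
As an illustration (the test function UpperCaseQ and the inputs are chosen here, not taken from the original cell), the test receives one-character strings:

```mathematica
(* Each single character is passed to UpperCaseQ *)
capitals = StringCases["Hello World", _?UpperCaseQ]
(* {"H", "W"} *)

(* Repeating the tested wildcard picks out runs of uppercase letters *)
runs = StringCases["abCDefGH", (_?UpperCaseQ) ..]
(* {"CD", "GH"} *)
```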

The Whitespace construct is equivalent to WhitespaceCharacter.. .

In[18]:= 

Out[18]=
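
The equivalence can be checked with a sketch like the following, on a sample string with mixed spaces and a tab:

```mathematica
messy = "a  b \t c";

(* Both forms collapse each run of whitespace to a single space *)
viaWhitespace = StringReplace[messy, Whitespace -> " "]
(* "a b c" *)
viaRepeated = StringReplace[messy, WhitespaceCharacter .. -> " "]
(* "a b c" *)

StringSplit[messy, Whitespace]
(* {"a", "b", "c"} *)
```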

You can insert a RegularExpression object into a general string pattern.

In[19]:= 

Out[19]=
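
A small sketch of mixing the two notations, with an invented input. Note that the backslash must be doubled inside a Mathematica string.

```mathematica
(* Literal "id" from the symbolic pattern, digits from the regular expression *)
ids = StringCases["id42 id7 name9", "id" ~~ RegularExpression["\\d+"]]
(* {"id42", "id7"} *)
```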

This inserts a lookbehind constraint (see Regular Expressions) to ensure that you only pick words preceded by "the ".

In[20]:= 

Out[20]=
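
One way such a lookbehind might be written is sketched below. The sentence is made up for illustration and the original cell may have used a different pattern.

```mathematica
(* (?<=the ) asserts that "the " precedes the match without consuming it *)
nouns = StringCases["the cat chased the dog near a tree",
  RegularExpression["(?<=the )"] ~~ LetterCharacter ..]
(* {"cat", "dog"} *)
```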

StringExpression objects can be nested.

In[21]:= 

Out[21]=
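
For example, a StringExpression can be grouped and used as the argument of another construct, as in this sketch with illustrative inputs:

```mathematica
(* The inner StringExpression is repeated as a unit *)
groups = StringCases["a1a2b a3", ("a" ~~ DigitCharacter) ..]
(* {"a1a2", "a3"} *)

(* Two nested groups joined by an outer ~~ *)
StringCases["ab12cd34",
  (LetterCharacter ~~ LetterCharacter) ~~ (DigitCharacter ~~ DigitCharacter)]
(* {"ab12", "cd34"} *)
```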

The Except construct for string patterns takes a single argument that should represent a single character or a class of single characters.

This deletes all nonvowel characters from the string.

In[22]:= 

Out[22]=
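
A sketch of such a deletion, assuming the vowel class is given as Characters["aeiou"] (the original cell may have been written differently):

```mathematica
(* Every character that is not a lowercase vowel is replaced by "" *)
vowelsOnly = StringReplace["string patterns", Except[Characters["aeiou"]] -> ""]
(* "iae" *)
```

The space is also removed, since it too is a nonvowel character.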

When trying to match patterns of variable length (such as __ and patt..), the longest possible match is tried first by default. To force the matcher to try the shortest match first, you can wrap the relevant part of the pattern in ShortestMatch[ ].

In[23]:= 

Out[23]=

In[24]:= 

Out[24]=
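
The contrast between the default and the shortest match can be sketched as below. This sketch assumes a later Mathematica release in which, to the best of my understanding, the wrapper is named Shortest; in the version documented here, write ShortestMatch instead. The input string is invented.

```mathematica
s = "(a)(bb)(c)";

(* Default: __ grabs as much as possible, so one long match results *)
greedy = StringCases[s, "(" ~~ __ ~~ ")"]
(* {"(a)(bb)(c)"} *)

(* Shortest (ShortestMatch in the documented version) stops at the first ")" *)
lazy = StringCases[s, "(" ~~ Shortest[__] ~~ ")"]
(* {"(a)", "(bb)", "(c)"} *)
```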

If for some reason you need a longest match within the shortest match, you can use LongestMatch.

In[25]:= 

Out[25]=

In[26]:= 

Out[26]=

Alternatively, you could rewrite this pattern without using LongestMatch.

In[27]:= 

Out[27]=





© 2009 Wolfram Research, Inc.