2.12.9 Searching Files
| FindList["file", "text"] | get a list of all the lines in the file that contain the specified text |
| FindList["file", "text", n] | get a list of the first n lines that contain the specified text |
| FindList["file", {"text1", "text2", ... }] | get lines that contain any of the specified texts |
Finding lines that contain specified text.

Here is a file containing some text.

This returns a list of all the lines in the file containing the text "is".
In[2]:=
FindList["textfile", "is"]
Out[2]=

The text "fourth" appears nowhere in the file.
In[3]:=
FindList["textfile", "fourth"]
Out[3]=
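
The forms of FindList that take a count or a list of search strings can be tried in the same way. The following is a minimal sketch; the file name findlist-demo.txt and its three lines are assumptions made for illustration, not the textfile used in this section.

```mathematica
(* Hypothetical sample file; its name and contents are illustrative only. *)
file = FileNameJoin[{$TemporaryDirectory, "findlist-demo.txt"}];
out = OpenWrite[file];
WriteString[out,
  "Here is the first line of text.\n",
  "And the second.\n",
  "And the third. Here is the end.\n"];
Close[out];

(* All lines that contain "And". *)
allAnd = FindList[file, "And"]

(* Only the first line that contains "And". *)
firstAnd = FindList[file, "And", 1]

(* Lines that contain either "first" or "second". *)
either = FindList[file, {"first", "second"}]
```

With this sample file, firstAnd should hold just the first element of allAnd, and either should hold the first two lines of the file.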
By default, FindList scans successive lines of a file and returns those lines that contain the text you specify. More generally, you can have FindList scan successive records and return the complete records that contain the specified text. As in ReadList, the option RecordSeparators tells Mathematica which strings to treat as record separators. By giving a pair of lists as the setting for RecordSeparators, you can specify different left and right separators. FindList then searches only text that lies between such pairs of separators.

This finds all "sentences" ending with a period that contain "And".
In[4]:=
FindList["textfile", "And", RecordSeparators -> {"."}]
Out[4]=
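
The pair-of-lists setting for RecordSeparators is described above but not demonstrated. The following sketch assumes a hypothetical file in which records are enclosed in << and >>; both the file name and the delimiters are chosen only for illustration.

```mathematica
(* Hypothetical file whose records are delimited by << and >>. *)
file = FileNameJoin[{$TemporaryDirectory, "findlist-delims.txt"}];
out = OpenWrite[file];
WriteString[out, "<<alpha one>> stray beta <<beta two>> trailing text"];
Close[out];

(* The first list gives left separators, the second gives right separators.
   Only text between a << and the following >> is treated as a record. *)
inside = FindList[file, "beta", RecordSeparators -> {{"<<"}, {">>"}}]

(* "trailing" occurs only outside the delimiters. *)
outside = FindList[file, "trailing", RecordSeparators -> {{"<<"}, {">>"}}]
```

Under these assumptions, inside should contain only the record beta two, since the words stray beta lie outside any pair of separators, and outside should be an empty list.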
| option name | default value | |
| RecordSeparators | {"\n"} | separators for records |
| AnchoredSearch | False | whether to require the text searched for to be at the beginning of a record |
| WordSeparators | {" ", "\t"} | separators for words |
| WordSearch | False | whether to require that the text searched for appear as a word |
| IgnoreCase | False | whether to treat lower- and upper-case letters as equivalent |

Options for FindList.

This finds only the occurrence of "Here" that is at the beginning of a line in the file.
In[5]:=
FindList["textfile", "Here", AnchoredSearch -> True]
Out[5]=
In general, FindList finds text that appears anywhere inside a record. By setting the option WordSearch -> True, however, you can tell FindList to require that the text it is looking for appear as a separate word in the record. The option WordSeparators specifies the list of separators for words.

The text "th" does appear in the file, but not as a word. As a result, FindList finds no matching lines.
In[6]:=
FindList["textfile", "th", WordSearch -> True]
Out[6]=
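
The options IgnoreCase and WordSeparators are listed in the table but not used in the examples. The following sketch tries both on a hypothetical sample file (name and contents are assumptions). The last two calls are expected to differ because, with the default word separators, a trailing period is part of the word.

```mathematica
(* Hypothetical sample file; its name and contents are illustrative only. *)
file = FileNameJoin[{$TemporaryDirectory, "findlist-options.txt"}];
out = OpenWrite[file];
WriteString[out,
  "Here is the first line of text.\n",
  "And the second.\n",
  "And the third. Here is the end.\n"];
Close[out];

(* Searches are case sensitive by default; lower-case "here" is not in the file. *)
caseSensitive = FindList[file, "here"]

(* With IgnoreCase -> True, "here" also matches "Here". *)
caseInsensitive = FindList[file, "here", IgnoreCase -> True]

(* "end" is followed directly by a period in the file. *)
defaultWords = FindList[file, "end", WordSearch -> True]

(* Treating "." as a word separator lets "end" stand as a word of its own. *)
withPeriod = FindList[file, "end", WordSearch -> True,
  WordSeparators -> {" ", "\t", "."}]
```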
| FindList[{"file1", "file2", ... }, "text"] | search for occurrences of the text in any of the files |

Searching in multiple files.

This searches for "third" in two copies of textfile.
In[7]:=
FindList[{"textfile", "textfile"}, "third"]
Out[7]=
It is often useful to call FindList on lists of files generated by functions such as FileNames.
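
As a sketch of this combination, the following creates a scratch directory with two small files and searches all of them at once. The directory, file names and contents are assumptions made for illustration.

```mathematica
(* Hypothetical scratch directory holding two small text files. *)
dir = CreateDirectory[];
Do[
  out = OpenWrite[FileNameJoin[{dir, name}]];
  WriteString[out, "Here is the first line.\nAnd the third.\n"];
  Close[out],
  {name, {"a.txt", "b.txt"}}];

(* FileNames returns the paths matching the pattern in the directory. *)
files = FileNames["*.txt", dir];

(* FindList searches every file in the list and collects the matching lines. *)
hits = FindList[files, "third"]
```

Each file contributes its own matching line, so hits should contain the same line twice.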
| FindList["!command", ... ] | run an external command, and find text in its output |

Finding text in the output from an external program.

This runs the external Unix command date.

Out[8]=

This finds the time-of-day field in the date.
In[9]:=
FindList["!date", ":", RecordSeparators -> {" "}]
Out[9]=
| OpenRead["file"] | open a file for reading |
| OpenRead["!command"] | open a pipe for reading |
| Find[stream, text] | find the next occurrence of text |
| Close[stream] | close an input stream |

Finding successive occurrences of text.

FindList works by making one pass through a particular file, looking for occurrences of the text you specify. Sometimes, however, you may want to search incrementally for successive occurrences of a piece of text. You can do this using Find. To use Find, you must first open an input stream explicitly with OpenRead. Then, every time you call Find on this stream, it searches for the text you specify and moves the current point in the file to just after the record it finds. As a result, you can call Find several times to find successive pieces of text.

This opens an input stream for textfile.
In[10]:=
stext = OpenRead["textfile"]
Out[10]=
This finds the first line containing "And".
In[11]:=
Find[stext, "And"]
Out[11]=

Calling Find again gives you the next line containing "And".
In[12]:=
Find[stext, "And"]
Out[12]=
This closes the input stream.
In[13]:=
Close[stext]
Out[13]=
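
Repeated calls to Find can be wrapped in a loop. The following sketch relies on Find returning the symbol EndOfFile when no further record contains the text. The sample file, its name and its contents are assumptions made for illustration.

```mathematica
(* Hypothetical sample file; its name and contents are illustrative only. *)
file = FileNameJoin[{$TemporaryDirectory, "find-loop-demo.txt"}];
out = OpenWrite[file];
WriteString[out,
  "Here is the first line of text.\n",
  "And the second.\n",
  "And the third. Here is the end.\n"];
Close[out];

stream = OpenRead[file];
hits = {};

(* Each call to Find resumes just after the previously found record.
   The loop stops when Find reports EndOfFile. *)
While[(line = Find[stream, "And"]) =!= EndOfFile,
  AppendTo[hits, line]];

Close[stream];
hits
```

Unlike FindList, this form lets you stop early or interleave other reads between successive matches.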
Once you have an input stream, you can mix calls to Find, Skip and Read. If you call FindList or ReadList on the stream, however, Mathematica immediately reads to the end of the input stream.

This opens the input stream.
In[14]:=
stext = OpenRead["textfile"]
Out[14]=
This finds the first line that contains "second", and leaves the current point in the file at the beginning of the next line.
In[15]:=
Find[stext, "second"]
Out[15]=

Read can then read the word that appears at the beginning of the line.
In[16]:=
Read[stext, Word]
Out[16]=

This skips over the next three words.
In[17]:=
Skip[stext, Word, 3]

Mathematica finds "is" in the remaining text, and prints the entire record as output.
In[18]:=
Find[stext, "is"]
Out[18]=
This closes the input stream.
In[19]:=
Close[stext]
Out[19]=
| StreamPosition[stream] | find the position of the current point in an open stream |
| SetStreamPosition[stream, n] | set the position of the current point |
| SetStreamPosition[stream, 0] | set the current point to the beginning of a stream |
| SetStreamPosition[stream, Infinity] | set the current point to the end of a stream |

Finding and setting the current point in a stream.

Functions like Read, Skip and Find usually operate on streams in an entirely sequential fashion. Each time one of the functions is called, the current point in the stream moves on. Sometimes, you may need to know where the current point in a stream is, and be able to reset it. On most computer systems, StreamPosition returns the position of the current point as an integer giving the number of bytes from the beginning of the stream.

This opens the stream.
In[20]:=
stext = OpenRead["textfile"]
Out[20]=
When you first open the file, the current point is at the beginning, and StreamPosition returns 0.
In[21]:=
StreamPosition[stext]
Out[21]=

This reads the first line in the file.
In[22]:=
Read[stext, Record]
Out[22]=

Now the current point has advanced.
In[23]:=
StreamPosition[stext]
Out[23]=

This sets the stream position back.
In[24]:=
SetStreamPosition[stext, 5]
Out[24]=

Now Read returns the remainder of the first line.
In[25]:=
Read[stext, Record]
Out[25]=
This closes the stream.
In[26]:=
Close[stext]
Out[26]=
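
The settings 0 and Infinity for SetStreamPosition appear in the table but not in the examples. The following sketch uses them to rewind a stream after a search and then to jump to its end. The sample file, its name and its contents are assumptions made for illustration.

```mathematica
(* Hypothetical sample file; its name and contents are illustrative only. *)
file = FileNameJoin[{$TemporaryDirectory, "stream-position-demo.txt"}];
out = OpenWrite[file];
WriteString[out,
  "Here is the first line of text.\n",
  "And the second.\n",
  "And the third. Here is the end.\n"];
Close[out];

stream = OpenRead[file];

(* Find moves the current point past the record it finds. *)
Find[stream, "second"];
afterFind = StreamPosition[stream];

(* Rewind to the beginning and read the first record again. *)
SetStreamPosition[stream, 0];
firstLine = Read[stream, Record];

(* Jump to the end; nothing is left to read. *)
SetStreamPosition[stream, Infinity];
atEnd = Read[stream, Record];

Close[stream];
{afterFind, firstLine, atEnd}
```

After the rewind, firstLine should be the first line of the file again, and the final Read should return EndOfFile because the current point is at the end of the stream.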