Searching within text files

The grep command searches a file and returns the lines that match your search term.  The search term can be an exact string, or it can include wildcards and other modifiers so that it matches a variety of strings.

If we want to find the lines in the lyrics of American Pie that contain the word Chevy, we can use the command:

grep Chevy americanpie.txt

This will return the 7 lines of our file that contain the word Chevy.  In its basic form, this is an exact, case-sensitive search, so a search for chevy will return no results.

grep chevy americanpie.txt

We can make the search case-insensitive using the -i flag, which again returns our 7 lines of lyrics.

grep -i chevy americanpie.txt
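
As a small sketch of the difference between the two searches, the example below counts matching lines with the standard -c flag. It runs against a hypothetical three-line sample file (not the real americanpie.txt), and the variable names are only illustrative. Note that -c counts matching lines, not individual occurrences of the word.

```shell
# Hypothetical sample file standing in for americanpie.txt
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"
printf '%s\n' \
  'My Chevy is parked outside' \
  'the chevy needs new tires' \
  'No cars are mentioned here' > "$sample"

# -c prints the number of matching lines instead of the lines themselves
exact_count=$(grep -c Chevy "$sample")

# Adding -i ignores case, so "Chevy" and "chevy" both match
nocase_count=$(grep -ci chevy "$sample")

echo "case-sensitive: $exact_count, case-insensitive: $nocase_count"

rm -rf "$tmpdir"
```

With this sample file, the case-sensitive search matches one line and the case-insensitive search matches two.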

All of this talk of lyrics may just get you in the mood for some ‘singin’.  We can check how much ‘singin’ is occurring using the command:

grep -i singin americanpie.txt

But this returns all sorts of ‘singin’.  Let's say we are only interested in the singing about our impending demise.  To find it, we need to modify our search to only include ‘singin’ at the start of a line.  This is achieved by placing the circumflex character (^, also called a caret) in front of the search term.

grep -i ^singin americanpie.txt
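
To illustrate the effect of the ^ anchor, here is a minimal sketch using a hypothetical sample file with invented lines. The pattern is wrapped in single quotes, which is a common precaution so that the shell passes it to grep unchanged.

```shell
# Hypothetical sample file; the lines are invented for illustration
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"
printf '%s\n' \
  'Singin in the morning' \
  'They were singin all night' \
  'singin again at dusk' > "$sample"

# Without an anchor, every line containing "singin" matches
any_count=$(grep -ci 'singin' "$sample")

# With ^, only lines that begin with "singin" match
start_count=$(grep -ci '^singin' "$sample")

echo "anywhere: $any_count, at start of line: $start_count"

rm -rf "$tmpdir"
```

Here the unanchored search matches all three lines, while the anchored search matches only the first and third.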

If we undertake a search for the word ‘down’, we get two results:

grep -i down americanpie.txt

If we just want to return the line concerning the gaze of the king, we can just append a $ character which indicates the end of a line:

grep -i down$ americanpie.txt
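
The $ anchor can be tried out in the same way. The sketch below uses a hypothetical sample file and quotes the pattern so that the shell does not treat $ as the start of a variable name. It also shows, as an aside, that ^ and $ can be combined to match a line consisting of the search term and nothing else.

```shell
# Hypothetical sample file; the lines are invented for illustration
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"
printf '%s\n' \
  'The sun went down' \
  'down by the river' \
  'down' > "$sample"

# $ anchors the match to the end of the line
end_count=$(grep -c 'down$' "$sample")

# ^ and $ together match only lines that are exactly "down"
whole_count=$(grep -c '^down$' "$sample")

echo "ends with down: $end_count, exactly down: $whole_count"

rm -rf "$tmpdir"
```

In this sample, two lines end with ‘down’ and one line is exactly ‘down’.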

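All these lyrics and ‘singin’ can only lead to one thing: a tapping foot and a sudden interest in dance and dancing.  We can search for lines about dance and/or dancing using the * character.  Note that in a grep search term, unlike a shell filename wildcard, * matches zero or more repetitions of the preceding character, so danc* matches ‘dan’ followed by any number of ‘c’ characters.  That still finds both dance and dancing.  The search term is placed in single quotes so that the shell does not try to expand the * as a filename wildcard before grep sees it:

grep -i 'danc*' americanpie.txt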
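
Because * in a grep search term repeats the preceding character rather than standing for any text, a pattern such as danc* also matches lines that contain only ‘dan’. The following sketch, run against a hypothetical sample file, compares it with the plain search term danc.

```shell
# Hypothetical sample file; the lines are invented for illustration
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"
printf '%s\n' \
  'We like to dance' \
  'They were dancing' \
  'A dandelion grew' \
  'Nothing relevant' > "$sample"

# "danc*" means "dan" followed by zero or more "c" characters,
# so the dandelion line matches as well
star_count=$(grep -ci 'danc*' "$sample")

# The plain term "danc" matches only dance and dancing
plain_count=$(grep -ci 'danc' "$sample")

echo "danc*: $star_count, danc: $plain_count"

rm -rf "$tmpdir"
```

With this sample, danc* matches three lines and danc matches two, so the plain term is the tighter search if only dance and dancing are wanted.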

Finally, we might have an interest in lines which are either dry or contain whiskey.  To undertake an either/or search, we need to put the grep command into a different mode using the -P flag (which enables Perl-based searching), enclose our search term in single quote characters, and use the | character to separate the alternatives.

grep -P 'dry|whiskey' americanpie.txt
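
The -P flag is a GNU grep extension and may not be available on every system. As a hedged alternative, the sketch below shows two standard ways of running the same either/or search: the -E flag (extended regular expressions) and repeated -e options. It uses a hypothetical sample file rather than the real lyrics.

```shell
# Hypothetical sample file; the lines are invented for illustration
tmpdir=$(mktemp -d)
sample="$tmpdir/sample.txt"
printf '%s\n' \
  'The well was dry' \
  'A glass of whiskey' \
  'Nothing to see here' > "$sample"

# -E enables extended regular expressions, where | separates alternatives
ere_count=$(grep -cE 'dry|whiskey' "$sample")

# Each -e supplies one search term; a line matches if any term matches
multi_count=$(grep -c -e dry -e whiskey "$sample")

echo "-E: $ere_count, -e -e: $multi_count"

rm -rf "$tmpdir"
```

Both forms match the same two lines of the sample file.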