

Module 40
egrep

DESCRIPTION

The external egrep command searches ASCII text files for a given text pattern. All lines containing the pattern are written to the standard output. The pattern may be a word or a string of characters, and full regular expressions may be used inside the pattern to specify certain sequences of characters. If egrep searches more than one file, each displayed line is preceded by the name of the file it came from. If no filenames are given, egrep reads from the standard input, so you can use egrep in a pipeline.
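
As a minimal sketch of the behavior described above, the following commands search one file, then two files, then standard input. The filenames fruits.txt and veggies.txt and their contents are made up for illustration, and a scratch directory from mktemp is assumed so that no real files are touched.

```shell
# Work in a scratch directory (assumption: mktemp -d is available).
dir=$(mktemp -d)
cd "$dir" || exit 1

# Hypothetical sample files.
printf 'apple pie\nbanana bread\n' > fruits.txt
printf 'carrot cake\napple sauce\n' > veggies.txt

# One file: matching lines are printed as they appear in the file.
single=$(egrep 'apple' fruits.txt)

# Two files: each matching line is preceded by its filename.
multi=$(egrep 'apple' fruits.txt veggies.txt)

# No filename: egrep reads the standard input, so it works in a pipe.
piped=$(cat veggies.txt | egrep 'cake')

printf '%s\n' "$single" "$multi" "$piped"
```

The second search is the only one whose output carries the filename prefix, because it is the only one given more than one file.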

The name egrep is built on a combination of editor command characters. The grep part comes from the editor command :g/RE/p, which translates to global Regular Expression print. Inside the editor this command searches the entire file for all lines matching the RE pattern. The -v option of egrep comes from the editor command :v/RE/p, which searches the entire file for all lines except the ones containing the RE pattern. The e added to the front stands for extended: egrep is a grep that can search for full regular expressions.

The egrep command adds the feature options of the fgrep command and extends the pattern matching capabilities of the grep command. You can use full regular expressions to search for matches. These expressions can be stored in a file or given on the command line. Thus egrep is the most powerful of the grep commands, but it also requires the most system resources and in most cases runs the slowest. You trade speed for flexibility and pattern matching power.


NOTE:  
The egrep command performs line oriented searches. Therefore, a phrase that spans more than one line is not matched. When deciding what string to use for the search, you should try to keep it as short as possible yet as unique as possible.
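
The line-oriented behavior in the note can be seen with a small sketch. The file note.txt and its contents are hypothetical; the -c option is used only to turn each search into a count that is easy to compare.

```shell
dir=$(mktemp -d)

# The phrase "regular expression" is split across two lines here.
printf 'a full regular\nexpression search\n' > "$dir/note.txt"

# The two-word phrase never appears on a single line, so nothing matches.
# "|| true" only hides the non-zero exit status of the failed search.
phrase_hits=$(egrep -c 'regular expression' "$dir/note.txt" || true)

# A shorter pattern that fits on one line is found.
word_hits=$(egrep -c 'regular' "$dir/note.txt")

echo "phrase: $phrase_hits, word: $word_hits"
```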



COMMAND FORMAT

Following is the general format of the egrep command.

     egrep [ -bchilnv ] [ -e pattern ] pattern file_list
     egrep [ -bchilnv ] [ -e pattern ] -f file file_list
BSD (Berkeley)
     egrep [ -bclnsv ] [ -e pattern ] pattern file_list
     egrep [ -bclnsv ] [ -e pattern ] -f file file_list

The first form of the command uses the pattern provided on the command line to search the file_list for a match. In the second form, multiple patterns are stored in file, one per line, and egrep searches the file_list for each pattern listed.
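
The two forms can be compared side by side. This is only a sketch: app.log and patterns are hypothetical filenames, and the log contents are invented for the example.

```shell
dir=$(mktemp -d)
printf 'error: disk full\nwarning: low memory\ninfo: started\n' > "$dir/app.log"

# First form: the pattern is given on the command line.
form1=$(egrep 'error|warning' "$dir/app.log")

# Second form: the patterns are stored one per line in a file
# and passed with -f. The file contains no blank lines.
printf 'error\nwarning\n' > "$dir/patterns"
form2=$(egrep -f "$dir/patterns" "$dir/app.log")

printf '%s\n' "$form1"
```

Both searches select the same two lines; the pattern file is mainly a convenience when the list of patterns is long or reused.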

Options

The following list describes the options and their arguments that may be used to control how egrep functions.

-b Displays the block number in which the pattern was found before the line that contains the matching pattern.
-c Displays only a total count of matching lines for each file processed.
-e pattern Specifies the pattern explicitly. This allows you to search for a pattern that begins with a dash. Normally, any argument beginning with a dash is interpreted as an option, not a pattern or argument.
-f file Read in the strings to search for from file. This allows you to create a file containing all of the strings you want egrep to search for in the file_list or standard input.
-h Suppress the displaying of filenames which precede lines that match the specified patterns when multiple files are searched.
-i Ignore the difference between uppercase and lowercase characters during comparisons.
-l Displays only the names of the files containing the specified pattern. The lines containing the patterns are not displayed.
-n Displays the line number before each line containing the pattern.
-v Displays only the lines that do not match the pattern. The v command in the ex editor performs the same type of function. It is an exception search: every line is selected except the ones containing the given pattern.

BSD (Berkeley)
-h Not supported.
-i Not supported.
-s Suppresses the displaying of diagnostic error messages for nonexistent and nonreadable files.
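
A few of the options above can be tried on small sample files. This is a sketch under assumptions: a.txt and b.txt are hypothetical files, and -i is used even though the list above notes that the BSD version does not support it.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'Alpha\nbeta\nalpha beta\n' > a.txt
printf 'gamma\n' > b.txt

count=$(egrep -c 'alpha' a.txt)        # -c: count of matching lines
numbered=$(egrep -n 'beta' a.txt)      # -n: line number before each match
inverted=$(egrep -v 'alpha' a.txt)     # -v: lines that do NOT match
nocase=$(egrep -i -c 'alpha' a.txt)    # -i: ignore case while matching
names=$(egrep -l 'alpha' a.txt b.txt)  # -l: names of matching files only

printf '%s\n' "$count" "$numbered" "$inverted" "$nocase" "$names"
```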

Arguments

The following list describes the arguments that may be passed to the egrep command.

pattern Any combination of characters, numbers, and regular expression patterns. egrep searches for strings of text that match the pattern expression.
file_list One or more files to search for the given pattern.

SEARCH PATTERNS (REGULAR EXPRESSIONS)

The following table contains each regular expression and the task that it performs when used inside a pattern. A regular expression consists of regular alphanumeric characters matching themselves and special characters matching certain patterns of text. Regular Expressions are often referred to as REs in UNIX terminology. Thus we use the RE notation for uniformity and brevity.


Alphanumeric
RE   Description

c Matches the character c.
string Matches the set of characters string.

Metacharacters

Metacharacters are the special characters used in regular expression patterns that have special meanings. Metacharacters are often referred to as special or magic characters or wild cards.


Special
RE   Description

\ Escapes the meaning of a metacharacter.
^ Matches the beginning of the line.
$ Matches the end of the line.
. Matches any single character.
[class] A character class. Matches any one character in the class.
[c1-c2] Matches any one of the ASCII characters in the range defined within the brackets.
[^class] Matches any one ASCII character that is NOT listed within the brackets. Ranges may be specified.
| Alternation of regular expressions. Matches either one or the other of the regular expressions provided.
( ) Grouping operation. Parentheses group regular expressions to control precedence in how the pattern is interpreted. They are normally omitted when the default precedence is sufficient.
* Matches zero or more occurrences of the preceding regular expression.
+ Matches one or more occurrences of the preceding regular expression.
RE? Matches zero or one occurrences of the preceding regular expression.
RE\{m\} Matches exactly m occurrences of the preceding one-character RE.
RE\{m,\} Matches m or more occurrences of the preceding one-character RE.
RE\{m,n\} Matches m through n occurrences of the preceding one-character RE.

To have a metacharacter interpreted as a normal character, precede it with a backslash (\).
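
The following sketch exercises several of the metacharacters from the table. The file words.txt and its word list are invented for the example, and every pattern is single-quoted so that the shell passes it to egrep unchanged.

```shell
dir=$(mktemp -d)
f="$dir/words.txt"
printf 'cat\ncats\ncot\ndog\ncaat\n3.14\n' > "$f"

anchored=$(egrep '^ca+t$' "$f")      # + : one or more a's, whole line
optional=$(egrep '^cats?$' "$f")     # ? : the trailing s is optional
either=$(egrep '^(dog|cot)$' "$f")   # | : alternation, grouped by ( )
class=$(egrep '^c[ao]t$' "$f")       # [class] : a or o in the middle
negated=$(egrep '^c[^a]t$' "$f")     # [^class] : anything but a
literal=$(egrep '3\.14' "$f")        # \ : makes . a literal dot

printf '%s\n' "$anchored" "$optional" "$either" "$class" "$negated" "$literal"
```

Without the ^ and $ anchors, most of these patterns would also match longer lines that merely contain the text, such as cats for the first pattern.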

DIAGNOSTICS AND BUGS

The input lines read by egrep are limited to the system's BUFSIZ definition. If a line is longer than BUFSIZ characters, the line is truncated before the comparison is made. The BUFSIZ used by your system can be found in the /usr/include/stdio.h file.

The file containing regular expressions must not contain a blank line. If it does contain a blank line, egrep complains with a syntax error.

RELATED COMMANDS

Refer to the nawk, ed, ex, fgrep, grep, sed, and ksh commands described in modules 6, 39, 43, 52, 60, 117, and 71.

RELATED FILES

The egrep command can read from the standard input or a specified list of files. It writes to the standard output.

RETURN CODES

The return code is set to 0 if any matching lines were found. If no matching lines were found, the return code is 1. If syntax errors are present in the pattern or if files are not accessible, a return code of 2 is returned, even if matches were found.
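
The three return codes can be observed with a short sketch. The file greeting.txt is hypothetical, and no_such_file is deliberately never created so that the last search hits an inaccessible file.

```shell
dir=$(mktemp -d)
printf 'hello world\n' > "$dir/greeting.txt"

# Each status starts at 0 and is overwritten only if egrep fails.
found=0
egrep 'hello' "$dir/greeting.txt" > /dev/null || found=$?

missing=0
egrep 'goodbye' "$dir/greeting.txt" > /dev/null || missing=$?

# The file does not exist, so egrep cannot read it.
badfile=0
egrep 'hello' "$dir/no_such_file" > /dev/null 2>&1 || badfile=$?

echo "found=$found missing=$missing badfile=$badfile"
```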

APPLICATIONS

You can use egrep to search text for a specific word or string of characters. The text may be contained in a file or come from the standard input. The various uses of egrep include finding lines or files that contain a specific pattern, counting the number of lines that contain the pattern in a file, showing the line numbers of lines that contain the pattern, and showing the block that contains the pattern. Each of these uses may also be inverted to operate on the lines that do not contain the specified pattern.

The uses of egrep are far reaching, from just searching a file for lines that contain a pattern to using egrep inside shell scripts to verify a pattern exists. It is one of those commands that doesn't seem special until you use it a few times, then you can't seem to live without it.
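
One way to verify that a pattern exists from inside a shell script is to test the return code of egrep and discard its output. The sketch below assumes a hypothetical helper function named has_pattern and a made-up settings file; neither comes from the egrep command itself.

```shell
# has_pattern is a hypothetical helper: $1 is the pattern, $2 is the file.
# The output is thrown away; only the return code of egrep matters.
has_pattern() {
    egrep "$1" "$2" > /dev/null 2>&1
}

dir=$(mktemp -d)
printf 'PATH=/usr/bin\nEDITOR=vi\n' > "$dir/settings"

if has_pattern '^EDITOR=' "$dir/settings"; then
    result="EDITOR is set"
else
    result="EDITOR is missing"
fi
echo "$result"
```

Because the helper returns the status of egrep directly, it can be used anywhere the shell expects a command, such as in if, while, or with the && and || operators.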

Because egrep is a combination of the fgrep and grep commands, some users choose to use only the egrep command. There is no real problem with this practice except for the speed of egrep compared to grep and fgrep. The decision is yours and depends on the requirements that you have each time you use one of the commands.

TYPICAL OPERATION

In this activity you use the egrep command to locate your login in the passwd file. Various pattern examples are also used to show you how to use the regular expressions. Begin at the shell prompt.

1.  Type egrep "^$LOGNAME" /etc/passwd and press Return.
C Shell
If you are using the csh, replace $LOGNAME with $USER.
2.  Type egrep "$HOME|HOME" .profile and press Return to search for the string "/u1/ts/mylogin" ($HOME) and the string HOME in your .profile file. The $HOME variable is expanded on the command line to be your home directory path. The HOME string is taken as a literal string.
C Shell
If you are using the csh, use the .cshrc and .login filenames in place of .profile.
3.  Type who | egrep "bill|nancy" and press Return. You may have to change "bill" and "nancy" to login names used on your system. This command reports whether certain users are logged in on the system.
4.  Turn to Module 52 to continue the learning sequence.

