

Module 140
tr

DESCRIPTION

The external tr command translates or deletes characters. It reads from the standard input and writes to the standard output. It cannot open files itself; therefore, you must use the shell's redirection symbols or a pipe to supply its input and capture its output.

COMMAND FORMAT

Following is the general format of the tr command.

     tr [ -cds ] [ string1 [ string2 ] ]

Options

The following options may be used to control how tr functions.

-c Complements the set of characters specified in string1. It causes tr to use the characters not specified in string1 as string1. For example,
                    tr -c "[a-z][A-Z][0-9]" "[^*]" < infile
replaces every nonalphanumeric character, including spaces and new-lines, with a caret (^).
-d Deletes the characters listed in string1 from the input.
-s Squeezes each run of a repeated output character that is listed in string2 down to one character. For example,
                    tr -s "as" "as" < infile
changes an input of "aaaaafile is a sssset" to "afile is a set."
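
The three options can be tried without an input file by piping a short string into tr. The following is a minimal sketch, not a definitive recipe: the sample strings and variable names are made up for illustration, and it assumes a tr that accepts both the bracketed forms used in this module and plain character lists.

```shell
# -c: replace every character NOT listed in string1 with a caret.
# printf is given no trailing new-line, so no new-line is turned into a caret.
complemented=$(printf 'ab-cd_12' | tr -c "[a-z][A-Z][0-9]" "[^*]")
echo "$complemented"

# -d: delete every digit from the input.
deleted=$(printf 'a1b2c3\n' | tr -d "0123456789")
echo "$deleted"

# -s: squeeze runs of a and s on the output down to a single character.
squeezed=$(printf 'aaaaafile is a sssset\n' | tr -s "as" "as")
echo "$squeezed"
```

Giving the same list as both string1 and string2 in the -s case leaves every character unchanged, so only the squeezing takes effect.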

Arguments

The following arguments may be passed to the tr command.

string1 String of characters to be searched for on input and replaced by string2 on the output.
string2 String of characters used to replace the characters listed in string1.

The following conventions may be used in string1 and string2 to match ranges of characters or repeated characters.

[c1-c2] Represents any characters whose ASCII codes are from character c1 to character c2. For example, to represent characters from a through z on the input and capital A through Z on the output, you type
                    tr "[a-z]" "[A-Z]" < infile
[c1*n] Represents n repetitions of character c1. Character c1 may be any ASCII character code from 001 to 377 octal. If the first digit of n is 0, n is considered to be octal. If n is 0 or omitted, tr considers it "huge" and assumes that n is the length of string1. Thus c1 is repeated until all characters in string1 have been replaced by c1. For example, pairing a string1 of "[a-e]" with a string2 of "[X*]" changes any occurrence of the letters a, b, c, d, or e to the letter X.
\nnn The backslash character may be used to precede numbers that represent characters in the ASCII collating sequence. The number nnn may be an octal number from 001 to 377. This notation is useful when you need to represent a control character or a character not on your keyboard.
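
The conventions above can be checked on short sample strings. This is a sketch under assumptions: the input strings and variable names are hypothetical, and the \nnn case uses single quotes so that the shell passes the backslash through to tr untouched.

```shell
# [c1-c2]: translate the range a through e to A through E.
ranged=$(printf 'abcdefg' | tr "[a-e]" "[A-E]")
echo "$ranged"

# [c1*n]: five copies of X stand in for a, b, c, d, and e.
repeated=$(printf 'abcdefg' | tr "abcde" "[X*5]")
echo "$repeated"

# \nnn: octal 012 is the new-line character, so each colon ends a line.
split=$(printf 'a:b:c' | tr ":" '\012')
echo "$split"
```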

FURTHER DISCUSSION

Since the tr command cannot open files itself, the typical command resembles the following one.

                    tr "[a-z]" "[A-Z]" < infile > outfile

The standard input has been redirected by the shell to be the file infile. The standard output has also been redirected; it is sent to file outfile. This command converts all lowercase letters to uppercase.
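
The redirection described above can be tried safely in a scratch directory. The sketch below assumes the mktemp command is available with its -d option; the file contents and variable names are made up for illustration.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
printf 'hello, world\n' > "$workdir/infile"

# The shell opens both files; tr sees only standard input and standard output.
tr "[a-z]" "[A-Z]" < "$workdir/infile" > "$workdir/outfile"
result=$(cat "$workdir/outfile")
echo "$result"

# A pipe serves equally well as the source of standard input.
piped=$(cat "$workdir/infile" | tr "[a-z]" "[A-Z]")
echo "$piped"

rm -r "$workdir"
```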

The string1 argument provides a list of characters to be replaced. The string2 argument provides the replacement characters. The first character of string2 is used to replace the first character of string1 and so on for all the characters in string2. The following example,

                    tr abc xyz < infile

replaces each occurrence of a with x, each b with y, and each c with z.

If string2 contains fewer characters than string1, then the remaining characters in string1 are not affected. For example,

                    tr abcde xyz < infile

changes a, b, and c to x, y, and z respectively but does not change d or e.


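NOTE:
Some versions of tr, including older ones, instead use the last character of string2 to replace all remaining characters of string1. POSIX leaves the result unspecified, so make string2 as long as string1 when the command must behave the same everywhere.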
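
Because the result of a short string2 depends on the version of tr, one way to sidestep the difference is to state the intent explicitly. The following sketch, which uses made-up sample input and variable names, shows both choices written so that string1 and string2 always cover the same number of characters.

```shell
# List only a, b, and c: d and e are not in string1, so they pass through.
kept=$(printf 'abcde' | tr "abc" "xyz")
echo "$kept"

# Pad string2 with the [c*] repetition form so that d and e also become z,
# instead of relying on how a given tr treats a short string2.
padded=$(printf 'abcde' | tr "abcde" "xy[z*]")
echo "$padded"
```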



Using brackets to specify ranges of characters is a handy convention. You can specify a large set of characters using only a few characters. For example,

                    tr "[A-Z]" "[a-z]" < infile

translates all uppercase letters ([A-Z]) to lowercase ([a-z]).

DIAGNOSTICS AND BUGS

The ASCII NUL character (\000) is deleted from the input. It cannot be placed in string1 or string2.

RELATED COMMANDS

Refer to the sed command described in Module 117.

RELATED FILES

The tr command reads from the standard input and writes to the standard output. It cannot read from or write to a file on its own; you must use the shell's redirection symbols to do so.

APPLICATIONS

The most common use of tr is to convert files from all uppercase to all lowercase or vice versa. It can also be used to translate or delete individual characters in an input stream.

TYPICAL OPERATION

In this activity you use the tr command to remove all punctuation from a file. Begin at the shell prompt.

1.  Type tr -d "[!-/][:-@][\[-_][{-~]" < /etc/passwd and press Return. The passwd file should be displayed with only letters, digits, and spaces in it. All of the punctuation is removed.
The next activity converts the input text to one-word-per-line output.
2.  Type tr -cs "[A-Z][a-z]" "[\012*]" < /etc/passwd and press Return. The -c causes all characters other than a-z and A-Z to be changed to new-lines. The -s causes multiple occurrences of the new-line to be reduced to one occurrence. The \012 is the ASCII new-line character. Notice your display is a single column of words.
3.  Turn to Module 26 (SV) or Module 4 (BSD) to continue the learning sequence.
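
Both activities can be rehearsed on a single line of text before running them against /etc/passwd. In the sketch below the passwd-style entry is invented for illustration and the variable names are hypothetical; the tr arguments are the ones used in steps 1 and 2.

```shell
# A made-up passwd-style entry stands in for /etc/passwd.
entry='root:x:0:0:Super User:/root:/bin/sh'

# Step 1: delete the punctuation ranges; letters, digits, and spaces remain.
stripped=$(printf '%s' "$entry" | tr -d "[!-/][:-@][\[-_][{-~]")
echo "$stripped"

# Step 2: turn every run of non-letters into a single new-line.
words=$(printf '%s' "$entry" | tr -cs "[A-Z][a-z]" "[\012*]")
echo "$words"
```

The digits disappear in step 2 because they are not letters; the whole run ":0:0:" collapses into one new-line.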

