

Module 124
split

DESCRIPTION

The external split command splits a file into smaller files based on a specified number of lines. Each of these smaller files contains the same number of lines, except the last one created, which holds the remainder of the original. Your original file is not changed by split.

The split command reads input from a file or the standard input and creates multiple output files. Each file contains n lines of the original. If you do not provide n, split uses the value of 1000.
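As a minimal sketch of this behavior, the following commands split a 2500-line file first with the default count and then with an explicit count. The scratch directory, the file name sample.txt, and the base name part are hypothetical. The sketch also assumes a split that accepts -l n, the POSIX spelling of the -n option described in this module.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a 2500-line sample input; sample.txt is a hypothetical name.
awk 'BEGIN { for (i = 1; i <= 2500; i++) print "line " i }' > sample.txt

# No line count given, so split writes 1000 lines to each output file.
split sample.txt
wc -l xaa xab xac    # 1000, 1000, and the 500-line remainder

# An explicit count of 800 (assumption: -l 800 is accepted in place of -800).
split -l 800 sample.txt part
wc -l part??         # partaa, partab, partac hold 800 lines; partad holds 100

# The original file is untouched.
wc -l sample.txt     # still 2500 lines
```

The last file in each set is shorter than the others because it holds only what is left of the input.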

If you provide the newname argument, the destination files are named newnameXX, where XX is aa for the first file, ab for the second, and so on through zz. That is a total of 676 (26 x 26) files you can generate if you divide your input into small enough pieces. Because split appends two characters, newname must be at least two characters shorter than the maximum allowed for filenames. If the maximum filename length on your system is 100, for example, newname can be at most 98 characters long. If you do not provide a newname argument, split uses x as the newname and the destination files are named xXX.
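The naming scheme can be seen with a small input. This is only an illustrative sketch: the file name words.txt and the base name part are hypothetical, and the line count is given as -l 1 on the assumption that your split accepts the POSIX -l form of -n.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A three-line input split one line per file gives three output files.
printf 'one\ntwo\nthree\n' > words.txt

# With a newname of "part", the files are partaa, partab, and partac.
split -l 1 words.txt part
ls part*

# Without a newname, split uses x: xaa, xab, and xac.
split -l 1 words.txt
ls x*

# Two letters with 26 choices each give the 676-file ceiling.
echo $((26 * 26))
```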

COMMAND FORMAT

The general format of the split command follows.

     split [ -n ] [ - ] [ file [ newname ] ]

Options

The following option may be used to control how split functions.

-n Causes the output files to contain n lines.
If n is not specified, split uses 1000. The input is split into files containing 1000 lines each.

Arguments

The following list describes the arguments that may be passed to the split command.

- Causes split to read from the standard input.
file The name of the file split reads and divides into n or 1000 line files.
If no file is given on the command line, split will read from the standard input.
newname The base part of the name used for all output files. An extension is added to newname for each file created. The extension is made up of two alpha characters. The first file extension is "aa," then "ab," and so on until the original input is completely divided.
If newname is not specified, the output is written to a file with a base part of "x" and the normal extensions. Thus the default output filenames are xaa, xab, and so on.

RELATED COMMANDS

Refer to the csplit command described in Module 28.

RELATED FILES

split places its output in files with an extension of two characters. The characters begin with "aa," the next file is "ab," and so on until the entire input has been split and stored in multiple files.

APPLICATIONS

You may find the split command helpful in dividing large data files into smaller, more manageable files. Since the extensions added by split are a sortable sequence, you can process the new files using shell looping structures.
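A minimal sketch of such a loop follows. The file name data.txt and the base name chunk are hypothetical, and -l 20 assumes a split that accepts the POSIX -l form of the line count. Because the shell expands chunk?? in sorted order, the pieces are visited in the order split created them, and concatenating them in that order reproduces the original.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical 45-line data file.
awk 'BEGIN { for (i = 1; i <= 45; i++) print "record " i }' > data.txt

# Divide it into 20-line pieces: chunkaa, chunkab, chunkac.
split -l 20 data.txt chunk

# Process each piece in turn; here the "processing" is just a line count.
for piece in chunk??; do
    lines=$(wc -l < "$piece")
    echo "$piece: $((lines)) lines"
done

# Joining the pieces in sorted order restores the original contents.
cat chunk?? | cmp -s - data.txt && echo "pieces match the original"
```

In place of the line count, the loop body could run any command that needs its input in smaller blocks.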

There are some commands that cannot handle extremely large files; therefore, you may have to split the input for these commands into more manageable blocks.

TYPICAL OPERATION

In this activity you use the split command to divide the standard input into separate output files. Begin at the shell prompt.

1.  Type ls /bin | split -20 - bin and press Return. Notice that split displays no output.
2.  Type ls -x bin* and press Return to display the new files created by the split command. Each of these binaX files contains a portion of the output from the ls /bin command.
3.  Type cat binaa and press Return to display the contents of the first output file created by split.
4.  Type rm bina* and press Return to remove all of the files beginning with bina.
5.  Turn to Module 28 (SV), Module 69 (BSD) to continue the learning sequence.
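The activity above can be sketched as a short script. It runs in a hypothetical scratch directory so that the bin files do not land in your current directory, and it writes the line count as -l 20 on the assumption that your split accepts the POSIX -l form of -20. The two wc counts are an added check and are not part of the activity.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Step 1: split the listing of /bin into 20-line files binaa, binab, ...
ls /bin | split -l 20 - bin

# Step 2: list the files split created.
ls -x bin*

# Step 3: show the first piece.
cat binaa

# Added check: the pieces together hold as many lines as the listing.
pieces=$(cat bin* | wc -l)
total=$(ls /bin | wc -l)
echo "lines in pieces: $((pieces)), lines in listing: $((total))"

# Step 4: remove the pieces, then the scratch directory itself.
rm bina*
cd / && rm -r "$workdir"
```

If /bin holds more than 520 entries, split continues past binaz to binba, which rm bina* does not match. Removing the scratch directory clears those as well.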

