

Module 54
FILENAMES AND FILENAME GENERATION

DESCRIPTION

This module describes filenames and special characters used for filename generation. It is important to understand file naming conventions and how to use them before learning filename generation techniques.

A filename is a string of characters representing an ordinary file, a directory file, or a special file. No matter what type of file is referenced, a filename will be used to access the file.

When naming files, use meaningful filenames. It is important to know what is in each file. By using meaningful filenames you will be able to select the file you need by viewing a list of filenames. Unfortunately, the more files you have the more difficult it becomes to keep these filenames meaningful.

At some point you will want to make subdirectories that have meaningful names. You can store related files in the proper subdirectories. It is advisable to do this when you first begin using UNIX. If you plan on having a large number of files that are related, you may want to write shell scripts that control the files for you.

For example, you might give every file a three-digit filename and place a unique string, such as @(#), inside each file, followed by a description of the file. You can then use the grep command to list the filenames with their descriptions. If you use the @(#) string, the what command will produce this listing for you.
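The tagging scheme described above can be sketched as follows. This is only an illustration under stated assumptions: the scratch directory, the three files 001, 002, and 003, and their descriptions are hypothetical and were invented for the example.

```shell
# Work in a scratch directory so no real files are touched.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1

# Hypothetical three-digit files, each tagged with an @(#) description line.
echo '@(#) phone list for the sales group' > 001
echo '@(#) draft of the quarterly memo'    > 002
echo '@(#) notes from the design review'   > 003

# Given several files, grep prefixes each matching line with its filename.
listing=$(grep '@(#)' [0-9][0-9][0-9])
echo "$listing"

# Remove the scratch directory.
cd / && rm -rf "$dir"
```

Each line of the listing pairs a filename with its description, so you can pick the file you need without opening any of them.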

The UNIX system has de facto standards for special files and has many commands you should be aware of before giving one of your shell scripts the same name. To check for existing programs, use the whence or find command as follows:

  cj> whence command                    # see if command exists in your PATH
  cj> find / -name file -print > TMP &  # search entire system for file

Searching from the root directory with the find command is not advisable on systems with large amounts of disk space, because the search can take a long time.
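A quick check like the one below can show whether a name is already taken before you use it for a script. The sketch uses command -v, which POSIX shells provide and which plays a role similar to whence in ksh. The helper name_in_use and the script name myreport_zz9 are hypothetical and exist only for this illustration.

```shell
# Report whether a candidate script name collides with an existing command.
# name_in_use is a hypothetical helper written for this example.
name_in_use() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: already exists"
    else
        echo "$1: available"
    fi
}

name_in_use ls             # ls is a standard command
name_in_use myreport_zz9   # hypothetical name, assumed to be unused
```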

Filename Extensions

The UNIX system also has de facto standards for filename extensions. Most extensions are one character long. Although the period (.) does not have any special meaning in a filename, except in the first position, it is used for convention. These are not hard and fast rules of UNIX, although some programs require the proper extensions. The following list provides a brief description of each extension.


Filename Extensions

.a Archive (library) file; assembler source code files conventionally end in .s
.c C programming language source code file
.f Fortran programming language source code file
.o A compiled object file
.h C programming language header file
.sh Shell programming language source code file
.z A packed file

File Naming Rules

You may name your directories and files any name you wish as long as you remember the following rules.

1.  The name must be from one to 256 characters. Extra characters are ignored by the system. Keep filenames meaningful but short.
2.  All characters are legal except for the slash (/). It is used to separate directory levels and files.
3.  It is advisable to avoid the following list of characters since they are interpreted by the shell and other programs:
@ # $ ^ & * ? ( ) [ ] { } / \ | ; ' " < >
Space Tab Esc Ctrl-Characters
Spaces and tabs must be quoted if used as part of filenames.
4.  A period (.) at the beginning of a filename hides it from normal ls type referencing.
5.  Uppercase and lowercase characters are interpreted as different characters. A file named TMP is different from a file named tmp.
6.  It is advisable not to use any of the following characters at the beginning of a filename:
+ - = _

NOTE:  
Releases of System V prior to Release 4 only support filenames of 14 characters.
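Several of the naming rules above can be tried safely in a scratch directory. The following is a minimal sketch, assuming a case-sensitive file system as on traditional UNIX; the filenames used (TMP, tmp, my notes, .secret, and -draft) are hypothetical.

```shell
# Scratch directory so the demonstration leaves no files behind.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1

echo upper  > TMP            # rule 5: TMP and tmp are two different files
echo lower  > tmp
echo notes  > 'my notes'     # rule 3: a space in a name must be quoted
echo hidden > .secret        # rule 4: a leading period hides the file
echo dash   > ./-draft       # rule 6: a leading - looks like an option

plain=$(ls)                  # .secret is omitted from a plain listing
all=$(ls -a)                 # -a also shows names beginning with a period
spaced=$(cat 'my notes')     # the quotes keep the name as one word
rm ./-draft                  # ./ keeps rm from reading -draft as options

printf '%s\n' "$plain"

# Remove the scratch directory.
cd / && rm -rf "$dir"
```

The ./ prefix used with -draft is one way to refer to such a file; avoiding the leading character in the first place, as rule 6 advises, is simpler.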



FILENAME GENERATION

Filename generation provides a shorthand for referencing files and directories. The shell scans each word of the command line for filename generation characters. When a word contains one of these characters, the shell attempts to expand the word into a list of matching file and directory names, based on the criteria in the following wildcard section. If no matches are found, the word is left unexpanded.

Wildcards

The wildcards used in filename generation function much like wildcards in a poker game. In poker, if a card is wild then it may be used as any other card in the deck. If "deuces" are wild then a player can make three of a kind by having a deuce and two Aces. The one difference in the shell's wildcards is the * (asterisk); it is used to expand into zero or more characters instead of a one-to-one match. The following is a list of patterns (wildcards) that may be used for matching file and directory names.


Wildcards

* Matches any string with zero or more characters.
? Matches any single character.
[chars] Matches any one of the characters enclosed within the brackets. The c1-c2 notation will match characters c1 through c2.
[!chars] Matches any one of the characters not enclosed within the brackets.
?(pattern[|pattern...]) Optionally matches any pattern. Multiple patterns may be listed, separated with vertical bars (|). The square brackets in this notation mark the additional patterns as optional; they are not typed.
*(pattern[|pattern...]) Matches zero or more occurrences of any pattern.
+(pattern[|pattern...]) Matches one or more occurrences of any pattern.
!(pattern[|pattern...]) Matches everything except any pattern.
@(pattern[|pattern...]) Matches exactly one occurrence of any pattern.
~[user] Expands to the value of $HOME. If user is specified, expands to that user's home directory.

C Shell
The csh does not support the ?(pattern), *(pattern), +(pattern), !(pattern), or @(pattern) formats. However, the C shell adds the notation {literal1,literal2,...}, which you can use to match several different literal strings within filenames. The strings are tried left to right, and the output is sorted within each group of files that match.

The following examples provide insight into how filename expansion works.

     cj> ls *.db          # list all files ending in .db.
     cj> ls [0-9]*        # list all files beginning with a digit from 0
                          # through 9.
     cj> echo [!0-9]*     # display all files not beginning with a digit.
     cj> ls *.[chza]      # list all files ending in .c, .h, .z, or .a.
     cj> ls !(m*)         # list all files not beginning with m.
     cj> ls +(s*)         # list all files matching one or more occurrences
                          # of s*, that is, all files beginning with s.
     cj> ls file*([123])  # list file followed by zero or more of the
                          # digits 1, 2, and 3.
     cj> ls file@([123])  # list file1, file2, and file3.

C Shell
     cj> ls api{327,525}0.c  # list all files beginning with api and
                             # ending with 0.c that have 327 or 525 in
                             # between.
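The basic wildcards can also be tried in a scratch directory, where the results are easy to predict. The following is a minimal sketch for Bourne-family shells with default settings; all filenames are hypothetical. It uses only *, ?, [chars], and [!chars], which do not depend on the Korn shell's extended patterns.

```shell
# Scratch directory with a known set of hypothetical files.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
touch a.db b.db 1file 2file main.c main.h notes file1 file2 file10

star=$(echo *.db)          # zero or more characters followed by .db
question=$(echo file?)     # exactly one character after "file"
class=$(echo [0-9]*)       # first character is a digit
negated=$(echo [!0-9f]*)   # first character is neither a digit nor f
suffix=$(echo *.[ch])      # names ending in .c or .h
nomatch=$(echo *.zzz)      # nothing matches, so the word stays as typed

echo "$star"
echo "$question"
echo "$class"
echo "$negated"
echo "$suffix"
echo "$nomatch"

# Remove the scratch directory.
cd / && rm -rf "$dir"
```

Note that file? does not match file10, because ? stands for exactly one character, and that the unmatched word *.zzz is passed to echo unchanged.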


NOTE:  
If too many matches are found, the shell will expand the filenames and the command will not be able to handle all of the arguments. The command will abort with a message saying "too many arguments" or "stack overflow" or "out of space."
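One possible way around this limit is to let the shell loop over the matches itself, since a for loop handles the expanded list inside the shell instead of passing it all to one command. The following is a sketch under stated assumptions: three hypothetical .log files stand in for a very large set, and rm is only an example command.

```shell
# Scratch directory; three files stand in for a very large set.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
touch a.log b.log c.log

count=0
for f in *.log; do
    [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
    rm "$f"                   # each rm receives a single filename
    count=$((count + 1))
done
echo "removed $count files"

# Remove the scratch directory.
cd / && rm -rf "$dir"
```

The trade-off is speed: running the command once per file is slower than running it once with every filename.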



Literals

The literals used in filename generation must be matched exactly. Files beginning with . (period) must be matched by using a period to begin the reference to the filename. Matching the / (slash) in directories must also be done by using the slash. The following table explains the use of literals in filename generation.


Literals

.file Match a hidden file; file may be composed of filename generation characters.
d/d/f Match directories (d) and files (f) in a path. Each directory and file may contain filename generation characters.
/ Must be used to match all / characters in a pathname.

The following examples help explain the use of literals in filenames.

     cj> ls .*.db        # list all hidden files ending in .db.
     cj> ls */*          # list all files contained in subdirectories.
                         # does not match any files in current directory.

NOTE:  
To perform filename expansion for directories, you must specify a match for each level of directories. The pattern */*/* matches only names two directory levels down; it will not match the files one level down that */* matches.
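The behavior of literals can be checked with a small directory tree. The following is a minimal sketch for Bourne-family shells; the directory proj and all of the filenames are hypothetical.

```shell
# Scratch directory with one hidden file and a two-level subtree.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
mkdir -p proj/src
touch sales.db .backup.db proj/readme proj/src/main.c

visible=$(echo *.db)       # a leading * never matches a leading period
hidden=$(echo .*.db)       # the period is typed literally, so hidden files match
one_level=$(echo */*)      # names exactly one directory down
two_levels=$(echo */*/*)   # names exactly two directories down

echo "$visible"
echo "$hidden"
echo "$one_level"
echo "$two_levels"

# Remove the scratch directory.
cd / && rm -rf "$dir"
```

The */* pattern lists proj/readme and the directory proj/src but not proj/src/main.c, which only */*/* reaches.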



FILENAME COMPLETION

The ksh shell provides a filename completion feature. It is performed by the in-line editor. If you are using the vi in-line editor, simply type your command and part of your filename. Press Escape and type an *. The shell expands the partial filename into all matching filenames. The cursor is moved to the end of the line and you are placed back in insert mode on the command line. For example,

     cj> ls m<Escape>*

generates

     cj> ls memos misc myph_

C Shell
The C shell provides a filename completion feature. This feature is enabled by setting the filec variable. Refer to the set command for information on setting variables. When filename completion is enabled the csh completes filenames and user names during interactive sessions. The partial filename is input from the keyboard and Escape is pressed. The csh expands the partial filename into a completely unique filename. If multiple filenames match the partial string, then the filename is expanded to the end of the common part of the filename. For example, if the current directory resembles

     cj> ls
     bin
     bench1
     bench20
     do_list
     mailbox
     tester_prog1

and you type

     cj> te<Esc>

the csh expands te to tester_prog1. But if you type

     cj> be<Esc>

then the csh expands be to bench and beeps your terminal to signal you that completion is incomplete because two filenames matched be.

APPLICATIONS

Filenames are used to reference data stored on disk. Therefore, you should use meaningful filenames to explain the function or contents of your files.

The wildcards and literals provide for easy referencing of filenames. Wildcards and literals may be used to restrict your referencing to groups of files having common characteristics. They are often used to display, copy, delete, or rename files or groups of files.

Literals are used to reference hidden files, directories, and subdirectories. They extend your references to files that are not normally listed in the current directory.

TYPICAL OPERATION

In this activity you use various UNIX commands to perform filename generation. Begin at the shell prompt in your HOME directory.

1.  Type ls -C / and press Return. Notice that the contents of the / (root) directory are displayed.
2.  Type ls -C /* and press Return. Notice the output is different. The contents of each directory under the / directory have been listed. This probably takes a while to list out.
3.  Type cp f* db and press Return to copy all files beginning with f to the db directory.
4.  Display the db directory by typing ls db and pressing Return.
     cj> ls db
     file1
     file2
     phone
5.  Turn to Module 113 to continue the learning sequence.

