Introducing UNIX and Linux


More on shells

Patterns

Using a notation known as pattern matching, we can consider concepts such as 'all files with suffix .c', or 'all arguments to the command that are three characters long and commence with a digit'. Pattern matching is used in several situations by the shell, and we shall introduce those particular instances as we meet them.

If the shell encounters a word containing any of the following symbols (unless they are 'escaped' by being preceded by a backslash or contained within (single) quotes)

    ?  *  [ 

then it will attempt to match that word with filenames, either in the current directory, or absolute pathnames (if they commence with /). A ? matches any single character, * matches any sequence of characters (including none at all), and [ introduces a list of characters, any one of which it matches. If the word commences with a * or a ?, it will match only filenames in the current directory that do not commence with a dot. When the shell has worked out which filenames the word matches, it replaces the word with all those names. Try:

echo *

Since * matches anything, it will match every file in the current directory (apart from 'dot' files), and the resulting output will be similar to that from ls, although it won't format filenames into neat columns, and the output might be longer than your terminal is wide. Suppose you have a file mycommand, then try

echo m*

Since m* matches all filenames in the current directory commencing with m, all those filenames will be displayed, including mycommand.
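The behaviour described above can be tried safely in a scratch directory. The following is a minimal sketch, not part of the original text: the directory created by mktemp -d and the filenames other than mycommand are assumptions made purely for illustration (mktemp is widely available but is not guaranteed on every UNIX system).

```shell
# Work in a scratch directory so the demonstration touches nothing real.
demo=$(mktemp -d)            # hypothetical scratch directory
cd "$demo" || exit 1
touch mycommand makefile notes .hidden   # illustrative filenames

all=$(echo *)                # every name not commencing with a dot
m_names=$(echo m*)           # only names commencing with m
dots=$(echo .h*)             # a leading dot must be matched explicitly

echo "$all"                  # makefile mycommand notes
echo "$m_names"              # makefile mycommand
echo "$dots"                 # .hidden
```

Note that .hidden appears only when the pattern itself commences with a dot.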

Worked example

Use ls -ld to list all 'dot' files in your home directory.
Solution: Use ls -ld, but instead of giving it argument ~ or $HOME to list files in your home directory, you must isolate only those whose names commence with a dot. The 'dot' files in your home directory will each be matched with either ~/.* or $HOME/.* and one solution is therefore:

ls -ld $HOME/.*

A * will match any number of characters, whereas a ? will match exactly one character but is otherwise used in exactly the same way as *, so

echo ????

will display all filenames in the current directory that have exactly four characters in their names (but do not commence with a dot). Pattern matching does not extend to subdirectories of the current directory, and ??? would not match a/b.
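As a small illustration of these two points, the sketch below builds a scratch directory containing a subdirectory a with a file b inside it. The directory and all the filenames are hypothetical and chosen only for this demonstration.

```shell
demo=$(mktemp -d)            # hypothetical scratch directory
cd "$demo" || exit 1
mkdir a
touch a/b abc abcd wxyz .abc # illustrative filenames

four=$(echo ????)            # exactly four characters, no leading dot
three=$(echo ???)            # matches abc, but not a/b
nested=$(echo ?/?)           # the / must appear literally in the pattern

echo "$four"                 # abcd wxyz
echo "$three"                # abc
echo "$nested"               # a/b
```

The file .abc has four characters in its name but is not matched by ????, because the pattern does not commence with a dot.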

Worked example

How many directories or files located in the root directory have names three characters long?
Solution: Use pattern matching and ls to select the files and wc -w to count them. The -d option stops ls from listing the contents of any directories that match.

ls -d /??? | wc -w

Many files on a UNIX system come equipped with a specified suffix - that is, a sequence of characters at the end of the filename. Some specific suffixes are .c for C programs and .o for files containing object code. Some also give meaning to other parts of their filenames - look at the directory /lib, for instance, which contains files with names of the form libsomething.a; these are library files used by the C compiler. Pattern matching is useful for isolating files whose names you know to be of a specific 'shape'.

Worked example

Display detailed information on all files in the current directory with the .c suffix.
Solution: Using ls -l, we need to give it as arguments those files with suffix .c, and the pattern *.c will match precisely those files:

ls -l *.c

Between the symbols [ and ] comes either a list of characters or one or more ranges of characters, possibly preceded by the ! (exclamation mark) character. A range, which is denoted by two characters separated by a hyphen, means all those characters that lie lexically between (and including) those two characters. Thus [m-q] matches any lower-case letter between m and q inclusive. Note that the character to the left of the hyphen in a range must lexically precede the character to the right, or the range matches nothing. A leading ! indicates that the pattern will match any single character not specified between the brackets.
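A short sketch of lists, ranges and negation follows. The scratch directory and filenames are hypothetical, and the C locale is set as an assumption so that ranges follow plain ASCII order; in other locales a range such as [m-q] may be ordered differently.

```shell
LC_ALL=C; export LC_ALL      # assumption: ASCII ordering for ranges
demo=$(mktemp -d)            # hypothetical scratch directory
cd "$demo" || exit 1
touch map net pot rat m1 m2 mx   # illustrative filenames

range=$(echo [m-q]??)        # three characters, the first between m and q
digits=$(echo m[0-9])        # m followed by a single digit
negated=$(echo m[!0-9])      # m followed by any single non-digit

echo "$range"                # map net pot
echo "$digits"               # m1 m2
echo "$negated"              # mx
```

The file rat is not matched by [m-q]?? because r lies outside the range m to q.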

Worked example

List all commands stored in /bin whose names consist of two characters, the second one being a vowel.
Solution: Use ls with an argument which will match this pattern. ? matches a single character, and [aeiouAEIOU] matches any vowel, thus:

ls /bin/?[aeiouAEIOU]

WARNING: rm * deletes all files in your current directory - be careful using patterns with rm.

We shall use pattern matching later on in this chapter in the context of case statements, and you should remember that it is a much more powerful tool than a mere means of checking filenames. In the meantime, using ls followed by a pattern is an excellent method of getting used to pattern matching. Remember that *, ?, [ and ] all involve patterns, and that if you use them in a script and don't want them to relate to patterns, they must be escaped using \ or single quotes. In later chapters we shall introduce a similar concept to pattern matching, known as regular expressions.
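The effect of escaping can be seen by comparing an unquoted pattern with its quoted and backslash-escaped forms. This is a minimal sketch in a hypothetical scratch directory; the filenames one.c and two.c are assumptions made for illustration.

```shell
demo=$(mktemp -d)            # hypothetical scratch directory
cd "$demo" || exit 1
touch one.c two.c            # illustrative filenames

expanded=$(echo *.c)         # pattern is replaced by the matching names
quoted=$(echo '*.c')         # single quotes: the * is taken literally
escaped=$(echo \*.c)         # backslash: the * is taken literally

echo "$expanded"             # one.c two.c
echo "$quoted"               # *.c
echo "$escaped"              # *.c
```

In the last two cases echo receives the three characters *.c as its argument, and no filename matching takes place.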

Worked example

Create a script to remove all files with suffix .o in the current directory, prompting you for each one as to whether you do in fact wish to delete it, and confirming whether or not it has been removed.
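Solution: These files are matched by *.o, and we can pass the files one-by-one to rm -i using a for loop. The exit status of rm -i is not a reliable guide to whether a file was removed, since many versions of rm return exit status 0 when you decline the prompt, so after each call we test whether the file still exists. If no file matches, the shell leaves the word *.o unchanged, so the loop skips any name that does not exist.

for i in *.o                        # Loop through files
do
  [ -e "$i" ] || continue           # Skip if the pattern matched nothing
  rm -i "$i"                        # Ask before deleting
  if   [ -e "$i" ]                  # If the file is still there ...
  then echo "File $i not deleted"   # it was not removed ...
  else echo "File $i deleted"       # otherwise confirm deletion
  fi
done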

We could not simply have used

rm -i *.o

since we would then have been unable to generate the 'confirmation' message.


Copyright © 2002 Mike Joy, Stephen Jarvis and Michael Luck