Introducing UNIX and Linux


Awk

Patterns

In the previous examples we performed a task on every line of the standard input by using a null pattern. Two simple patterns are particularly useful: BEGIN and END. The action associated with BEGIN is executed once, when the awk script starts and before any lines are read from the standard input. The action associated with END is performed after all other actions, immediately before awk terminates. The following Awk script copies its standard input to the standard output, but also writes Start of file at the beginning of the output and End of file at the end:

BEGIN { print "Start of file" } # done at the start
{ print $0 }                    # done for each line of input
END { print "End of file" }     # done at the end

Just as in shell scripts, comments can, and should, be inserted into Awk scripts.
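As a minimal sketch of how this script behaves, it can be run directly from the shell. The three input lines below are hypothetical sample data (name, cost per kilo, kilos purchased) standing in for the file vegetables, and the variable names data, out and empty_out are illustrative only.

```shell
# Hypothetical sample input: name, cost per kilo, kilos purchased.
data='potatoes 0.50 5
carrots 0.80 2.5
peas 2.20 1'

# Run the BEGIN/END script on the sample input and keep the output.
out=$(printf '%s\n' "$data" | awk '
BEGIN { print "Start of file" }   # done once, before any input is read
      { print $0 }                # done for each line of input
END   { print "End of file" }     # done once, after the last line
')
printf '%s\n' "$out"

# With no input at all, the BEGIN and END actions are still performed.
empty_out=$(awk 'BEGIN { print "Start of file" } END { print "End of file" }' < /dev/null)
printf '%s\n' "$empty_out"
```

The first command should display the three input lines between the two extra lines, and the second should display the two extra lines on their own.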

Try this with the input coming from the file vegetables. More generally, many sorts of pattern are available. An ERE enclosed between slashes (/) is a pattern that matches any line of input matched by that ERE. So to print the cost per kilo of every vegetable whose name begins with a vowel, we could use

/^[aeiou]/ { printf "%s costs %.2f per kilo\n", $1, $2 }

The pattern specified by the ERE normally applies to the whole record. It can be restricted to a single field by preceding the ERE with a reference to that field (such as $1) and a tilde (~). Since in the above example we are interested in the first field beginning with a vowel, we could restrict the pattern match to the first field thus:

$1 ~ /^[aeiou]/ { printf "%s costs %.2f per kilo\n", $1, $2 }
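The difference between matching the whole record and matching a single field is easier to see with a pattern that is not anchored to the start of the line. The following is a sketch only, using hypothetical sample data in the same three-column layout; the variable names are illustrative.

```shell
# Hypothetical sample input: name, cost per kilo, kilos purchased.
data='potatoes 0.50 5
carrots 0.80 2.5
peas 2.20 1
artichokes 8.50 0.5'

# Whole-record match: a 5 anywhere on the line.
anywhere=$(printf '%s\n' "$data" | awk '/5/ { print $1 }')

# Field match: a 5 in the second field (the cost per kilo) only.
in_price=$(printf '%s\n' "$data" | awk '$2 ~ /5/ { print $1 }')

printf 'anywhere on the line:\n%s\n' "$anywhere"
printf 'in the second field:\n%s\n' "$in_price"
```

With this data, /5/ selects potatoes, carrots and artichokes, whereas $2 ~ /5/ selects only potatoes and artichokes, because the 5 in the carrots line occurs in the third field.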

The behaviour of Grep can be mimicked by Awk; the following two shell commands, each reading from the standard input, have the same effect:

grep -E 'ERE'
awk '/ERE/ { print $0 }'
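One way to check this equivalence for a particular ERE is to run both commands on the same input and compare the results. This is a sketch under the assumption that the ERE is ^[aeiou]; the sample data and variable names are hypothetical.

```shell
# Hypothetical sample input: name, cost per kilo, kilos purchased.
data='potatoes 0.50 5
carrots 0.80 2.5
peas 2.20 1
artichokes 8.50 0.5'

# Select lines beginning with a vowel, first with grep, then with awk.
with_grep=$(printf '%s\n' "$data" | grep -E '^[aeiou]')
with_awk=$(printf '%s\n' "$data" | awk '/^[aeiou]/ { print $0 }')

# Report whether the two commands produced identical output.
if [ "$with_grep" = "$with_awk" ]; then
    echo "same output"
else
    echo "different output"
fi
```

For this input, both commands select the single line beginning artichokes.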

The pattern can also be an expression that evaluates to true or to false. The following displays the cost per kilo of all expensive (more than 1 pound per kilo) vegetables:

$2 > 1.00 { printf "%s costs %.2f per kilo\n", $1, $2 }
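As an illustration, the pattern above can be tried on a few lines of hypothetical sample data. Because the second field of each line looks like a number, Awk compares it with 1.00 numerically. The variable names data and expensive are illustrative only.

```shell
# Hypothetical sample input: name, cost per kilo, kilos purchased.
data='potatoes 0.50 5
carrots 0.80 2.5
peas 2.20 1
artichokes 8.50 0.5'

# The action is performed only for lines where the expression is true.
expensive=$(printf '%s\n' "$data" |
    awk '$2 > 1.00 { printf "%s costs %.2f per kilo\n", $1, $2 }')
printf '%s\n' "$expensive"
```

Only the peas and artichokes lines satisfy the expression, so only those two vegetables are reported.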

Worked example

Display the total costs for vegetables only if that cost is at least 2.50.
Solution: For each line, evaluate the total cost ($2*$3), and perform printf if that value is greater than or equal to 2.50:

$2*$3 >= 2.50 { printf "%s cost %.2f\n", $1, $2*$3 }

More complicated patterns can be constructed using && ('and') and || ('or'). These are boolean operators:

expression1 && expression2

is true if both expression1 and expression2 are true, whereas

expression1 || expression2

is true if either expression1 or expression2 is true, or if both are true.
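A short sketch of && in use, again on hypothetical sample data: it displays the names of vegetables that cost more than 1 pound per kilo and of which at least 1 kilo was purchased. The variable names are illustrative only.

```shell
# Hypothetical sample input: name, cost per kilo, kilos purchased.
data='potatoes 0.50 5
carrots 0.80 2.5
peas 2.20 1
beans 2.10 2
artichokes 8.50 0.5'

# Both conditions must hold for the action to be performed.
both=$(printf '%s\n' "$data" | awk '$2 > 1 && $3 >= 1 { print $1 }')
printf '%s\n' "$both"
```

Here peas and beans are displayed. Artichokes cost more than 1 pound per kilo, but only half a kilo was purchased, so the second condition fails and the line is skipped.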

Worked example

Display the name of each vegetable purchased which either costs at most 1 pound per kilo, or for which less than 1 kilo was purchased.
Solution:

$2 <= 1 || $3 < 1 { printf "%s\n", $1 }

Copyright © 2002 Mike Joy, Stephen Jarvis and Michael Luck