Introducing UNIX and Linux


Files

Text files

We now concentrate on a particular class of files, text files, which are files divided into lines, each terminated by a Newline character. Such files, which would normally contain only printable characters, include program source files, shell scripts, and in fact any files you would wish to use a text editor on. However, containing only printable characters is not a requirement, and what we discuss in this section will also hold true for files containing other characters. Most of the files we will use as examples in the rest of this book will be text files.

Suppose we have created a file called story, which contains English text. Having established that it is a text file (by means of file or otherwise), we may wish briefly to examine its contents. We could, of course, invoke an editor such as vi and use the commands within the editor to move through the file and look at various parts of it, or we could use a pager. However, there are easier methods.

Often you will simply want to look at the first few lines of a file (for instance, to verify that it is indeed the file you expected). In this case, head will print out the first 10 lines of the file. In a similar vein, tail will print out the last 10. If you want to see the first (say) 15 lines of story, then the command would be

head -n 15 story

where n = 'number'.

Strictly speaking, tail copies its input to standard output beginning at a designated place, which is usually a number of lines from the end of the file. There are many options available to tail to allow you to specify what is meant by the designated place, and how many lines are output - refer to the manual page for further details.

As an example of simple use, suppose file myfile contains 100 lines, as follows:

line 1
line 2
...
line 100

Then we might have

$ head -n 2 myfile
line 1
line 2
$ tail -n 3 myfile
line 98
line 99
line 100
$
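
The 'designated place' that tail starts from can also be counted from the beginning of the file. The following is a minimal sketch, assuming a POSIX-style tail in which -n +N means 'start at line N', and assuming mktemp is available to create a scratch directory. The directory and the variable names demo, from_98 and middle are made up for this illustration.

```shell
# Build a 100-line sample file like myfile above, in a scratch directory.
demo=$(mktemp -d)
i=1
while [ "$i" -le 100 ]; do
    echo "line $i"
    i=$((i + 1))
done > "$demo/myfile"

# tail -n +98 starts output at line 98 instead of counting back from the end.
from_98=$(tail -n +98 "$demo/myfile")
echo "$from_98"

# Piping head into tail picks out a range: here lines 4 and 5.
middle=$(head -n 5 "$demo/myfile" | tail -n 2)
echo "$middle"

# Tidy up the scratch directory.
rm -r "$demo"
```

Combining the two commands in a pipeline is a common way to extract a block of lines from the middle of a file without opening an editor.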

Worked example

Find the most recently modified file (excluding 'dot' files) in the current directory.
Solution: This clearly has to involve ls. With option -l we could examine every file by hand and see which one was last changed. That is not the UNIX way of doing things: by examining the manual page for ls we find an option -t ('time') which sorts the files it prints out so that the most recently modified is shown first. Option -1 (1 is digit 'one') forces the output to be one filename per line. Thus ls -t1 will produce a list of filenames with the desired one at the top. Isolate it by piping the output of ls to head, thus:

ls -t1 | head -n 1
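
To try the pipeline without touching your own files, it can be run in a scratch directory. This is only a sketch: the file names (oldest, middle, newest, .hidden) and the variables work and latest are hypothetical, mktemp is assumed to be available, and touch -t is used to set modification times by hand.

```shell
# Scratch directory holding a few empty files.
work=$(mktemp -d)

# touch -t sets the modification time explicitly (format CCYYMMDDhhmm).
touch -t 202001010900 "$work/oldest"
touch -t 202001011000 "$work/newest"
touch -t 202001010930 "$work/middle"
touch -t 202001011100 "$work/.hidden"   # a 'dot' file, not listed by plain ls

# ls -t1 lists the most recently modified name first, one per line;
# head -n 1 keeps only that first line.
latest=$(cd "$work" && ls -t1 | head -n 1)
echo "$latest"

# Tidy up the scratch directory.
rm -r "$work"
```

Although .hidden has the latest timestamp, ls without the -a option normally leaves it out, so the pipeline should report newest.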

There is only a limited amount of space on a machine, and it may be that each user has been restricted (by the system administrator) as to how much filespace they are allowed to use. A quick way to find out how big files are is to use wc ('word count'), which indicates (i) the number of lines, (ii) the number of words and (iii) the number of characters (bytes) in a file. The first two are only meaningful if the file is a text file, though. For example,

$ wc myfile
   27 124 664 myfile

indicates that file myfile has 27 lines, 124 words and 664 bytes (characters). With options -c, -w or -l respectively, only the byte, word or line count will be printed. Note that wc does not work on directories.
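
The individual options can be tried on a small file of known contents. The sketch below assumes mktemp is available. The sample text and the variable names sample, lines, words and bytes are invented for this illustration.

```shell
# Sample text file: three lines, nine words.
sample=$(mktemp)
printf 'the quick brown fox\njumps over\nthe lazy dog\n' > "$sample"

# Redirecting the file into wc keeps the file name out of the output,
# and arithmetic expansion drops any padding some versions of wc add.
lines=$(( $(wc -l < "$sample") ))
words=$(( $(wc -w < "$sample") ))
bytes=$(( $(wc -c < "$sample") ))
echo "$lines lines, $words words, $bytes bytes"

# Tidy up the sample file.
rm "$sample"
```

Reading from standard input in this way is convenient when the count is to be stored in a variable, since only the number is printed.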


Copyright © 2002 Mike Joy, Stephen Jarvis and Michael Luck