Introducing UNIX and Linux

Files

Other relevant commands

Many files have names containing a suffix, that is, a sequence of characters at the end of the name, commencing with a dot. For example, if you have a program written in the language C, the file in which that program is stored should have the suffix .c. Let us suppose you have written a C program that is stored in file myfile.c in your home directory /home/ugrad/chris. From the point of view of the UNIX kernel, it is irrelevant what name this file has. Only when you attempt to compile the program does the suffix become important, as the UNIX command for compiling a C program demands that the .c suffix be present, and may itself create files with the same base myfile and a different suffix. In this example, a file myfile.o might be created (the 'o' stands for 'object code', i.e. binary code for the processor). Most standard POSIX commands make no demands on a file's suffix, although some utilities do; the manual page for a command will tell you.

The command dirname takes as argument the name of a file and strips off the actual filename, leaving only the directories. Command basename also takes a filename as argument, but strips off the directory information leaving only the filename relative to its parent directory. If basename is given two arguments, and the second argument is a suffix of the filename, that suffix is also removed. For instance:

$ dirname /home/ugrad/chris/test.c
/home/ugrad/chris
$ basename /home/ugrad/chris/test.c
test.c
$ basename /home/ugrad/chris/test.c .c
test

The benefit of these two commands will not be apparent at this stage, but later on, when writing shell scripts to manipulate files, they are exceptionally useful.
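As a small taste of the kind of script meant here, the following sketch uses both commands together. The function name describe_sources is hypothetical, chosen only for this illustration; dirname and basename are used exactly as described above.

```shell
# Hypothetical helper: for each C source file named as an argument,
# report the directory it lives in and its name without the .c suffix.
describe_sources() {
    for path in "$@"; do
        dir=$(dirname "$path")          # directory part only
        name=$(basename "$path" .c)     # filename with the .c suffix removed
        printf '%s: directory %s, base name %s\n' "$path" "$dir" "$name"
    done
}

describe_sources /home/ugrad/chris/test.c
```

Called with the path from the earlier example, this should report the directory /home/ugrad/chris and the base name test.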

When the command mv is called, the directory in which its first argument is located is updated so that the file's absolute name is changed. If the new filename is on the same filesystem, the inode of the file is not changed; only directory entries are altered. In that case the command name is somewhat misleading, since the file's data doesn't really move at all. Only when the new name is on a different filesystem must the data itself be copied.
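You can check this behaviour for yourself with ls -i, which prints a file's inode number. The sketch below is one way to do so; it assumes the utility mktemp is available to create a scratch directory (it is common, but not required by the standard), and the filenames old_name and new_name are invented for the illustration.

```shell
# Work in a scratch directory so that no real files are touched.
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1

echo 'some text' > old_name

# ls -i prints the inode number followed by the filename.
before=$(ls -i old_name | awk '{print $1}')
mv old_name new_name
after=$(ls -i new_name | awk '{print $1}')

echo "inode before: $before, inode after: $after"

# Tidy up the scratch directory.
cd / && rm -rf "$scratch"
```

Since both names are in the same directory, and hence on the same filesystem, the two inode numbers printed should be equal.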

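We sometimes need to know the total amount of space taken up by our files. Here the command du ('disk usage') comes to our rescue. With argument of the name of a directory (or the current directory, if no argument is given), du prints the amount of storage used for the files in that directory and in each directory beneath it. The units depend on the system: many systems report kilobytes, whereas POSIX specifies units of 512 bytes unless the option -k is given to request kilobytes. On a system that reports kilobytes, for example,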

$ du
12 ./dir1
7 ./dir2
27 .

indicates that directory dir1 takes up 12k (kilobytes) and dir2 takes 7k, whereas the total amount of storage used for the current directory is 27k (including dir1 and dir2).
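To experiment with du without touching your own files, something like the following sketch can be used. It assumes mktemp is available, and the directory and file names are invented for the illustration. The options -k (report in kilobytes) and -s (print only the overall total) are both part of the standard. The actual numbers printed depend on the filesystem, so none are shown here.

```shell
# Build a small directory tree in a scratch directory.
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1
mkdir dir1 dir2
echo 'hello' > dir1/a
echo 'world' > dir2/b

# One line per directory, in kilobytes, ending with the total for "."
du -k .

# -s suppresses the per-directory lines and prints only the total.
total=$(du -sk . | awk '{print $1}')
echo "total: ${total}k"

# Tidy up the scratch directory.
cd / && rm -rf "$scratch"
```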

There are some other standard commands that are not required for simple use of UNIX. Nevertheless, they are part of the standard, and are mentioned here for completeness.

Suppose you have distributed some text files to a colleague, and you then make minor alterations to them. You want your colleague to have updated copies of the files. One possibility is to send all the files again, but this has the disadvantage that a potentially large volume of data must be transmitted, which may well incur costs. An arguably preferable method would be to send your colleague a list of the changes to the files. These changes can be displayed using diff, but it would be unreasonable for your colleague to edit all the files by hand to make the changes. Fortunately, the command patch is provided to perform the task automatically. It takes a file containing the changes, as generated by diff, and the name of a file to which those changes are to be applied, and carries out the changes.
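The whole exchange can be rehearsed on one machine. The sketch below is illustrative only: the filenames are invented, mktemp is assumed to be available, and the file colleague.txt stands in for the copy your colleague already holds. The option -i names the file of changes that patch should read.

```shell
scratch=$(mktemp -d) || exit 1
cd "$scratch" || exit 1

printf 'line one\nline two\nline three\n' > original.txt
cp original.txt colleague.txt      # the copy your colleague already has
printf 'line one\nline 2\nline three\n' > updated.txt

# Record the changes. diff returns a non-zero exit status when the
# files differ, which is expected here, so that status is ignored.
diff original.txt updated.txt > changes.diff || true

# Your colleague applies the changes to the old copy.
patch -i changes.diff colleague.txt

# cmp -s is silent and succeeds only if the two files are identical.
if cmp -s colleague.txt updated.txt; then
    result=identical
else
    result=different
fi
echo "colleague.txt and updated.txt are $result"

# Tidy up the scratch directory.
cd / && rm -rf "$scratch"
```

Only changes.diff needs to be sent to your colleague, and for a small alteration it is much shorter than the files themselves.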

The commands we have introduced in this chapter perform only simple manipulations of UNIX files, especially when examining the contents of files. Three programs (Grep, Sed and Awk), which we discuss later in the book, provide comprehensive facilities for processing file contents, and obviate the need for more 'simple' UNIX commands over and above those mentioned in this chapter.