Introducing UNIX and Linux


Files

The UNIX directory hierarchy

A typical UNIX system has many users and usernames. The machine stores large numbers of programs and datasets that are either 'system' files (required for the running of UNIX) or files for the benefit of the system's users (such as the UNIX commands we discuss in this book). In addition, each user has their own collection of files. On a large UNIX system it would not be unreasonable to expect to find millions of files occupying thousands of gigabytes of space (where a gigabyte is a unit of storage equal to 1024 megabytes).

If I choose to create a file called myfile (say), it is unlikely that I will be the only user on the machine to have chosen that particular name for a file. It would be unreasonable to expect me to choose instead a filename that differed from the names of all the files created by all the other users. Therefore, UNIX must impose a structure on the filespace that will make it easy to manage a large number of files. The solution adopted is simple yet very powerful.

We can think of the available file storage for our machine as partitioned into separate directories. At any given time you can access files in one particular directory, which we can think of as the current directory. You can also 'move' between different directories and so change which is current. A directory need not be a contiguous section of disk, and might be fragmented. That is, the various files contained within this storage area that we call a directory may in fact be physically located on different parts of a disk, or even on completely different disks or storage devices. This does not matter to the user - the logical structure of the machine's file storage is important, not how it is physically implemented.

In order for the machine to know how to find the data in these directories, each has a file, called dot and referred to by the 'dot' symbol (.), that stores information about that directory (such as the names of the files stored within it and where each of them can be found). The word directory is also used to describe a file such as dot, which contains the vital statistics for a directory storage area. Since the physical layout of a directory is not important to us, this dual meaning for the word presents us with no ambiguity.

Within a directory are files, some of which may themselves be directories. Directories are organised in a tree-like structure. At the base of the tree is a directory whose UNIX name is '/' ('root'). So, we might have the following situation:

[Figure: Directory hierarchy]

In each directory, in addition to the file dot, is a file called dotdot, referred to by the symbol '..', which is a synonym for the parent of that directory in the tree. Since a file dot and a file dotdot exist in each and every directory, we do not usually mention them when describing a UNIX directory hierarchy.

There are two means by which we may refer to the name of a file. Either we can name it relative to our current directory, in which case we need only use its simple name, such as myfile. Alternatively, we can use its absolute filename relative to the root. In this latter case, its name commences with a / ('slash' or 'solidus'), followed by the intervening directories between the root and the file separated by /s, and finally with the filename. Thus in the above tree, file myfile has absolute name /home/ugrad/chris/myfile. If a filename commences with the character / then it is an absolute name, otherwise it is relative. Each file thus has a unique absolute filename. Moreover, since these filenames can be as long as required (within reason - each system has a limit) and the depth of the tree can be as great as needed, we can cope with a UNIX system containing as many files as desired.

Since the current directory has several names, there will be several names for an individual file; if the current directory is /home/ugrad/chris then the following names all refer to the same file, since you can insert /. after any intermediate directory name without affecting the meaning:

/home/ugrad/chris/myfile
myfile
./myfile
../chris/myfile
././././././myfile
../../../home/ugrad/./chris/myfile
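
You can check this for yourself. The following is a minimal sketch, not part of the original example: it assumes a scratch directory made with mktemp (so that nothing in a real /home is touched) and rebuilds the path home/ugrad/chris beneath it. The variable names top and name are purely illustrative. It relies on ls -i, which prints a file's inode number (the number identifying the file's storage, described at the end of this section) in front of its name.

```shell
# Build a scratch copy of the example tree (the location is an assumption).
top=$(mktemp -d)
mkdir -p "$top/home/ugrad/chris"
echo hello > "$top/home/ugrad/chris/myfile"

# Make the stand-in for chris's home directory the current directory.
cd "$top/home/ugrad/chris"

# List the same file under several names; ls -i prefixes each name
# with the file's inode number.
for name in "$top/home/ugrad/chris/myfile" \
            myfile \
            ./myfile \
            ../chris/myfile \
            ././././././myfile \
            ../../../home/ugrad/./chris/myfile
do
    ls -i "$name"
done
```

If every line of output begins with the same number, all six names refer to one and the same file. Note that here ../../.. leads to the scratch directory rather than to the real root, which is why the last name still works.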

When logged in to the machine, you are always in a current directory somewhere. When you initially log in, you start in your home directory, in which you can create your own files. This directory has a synonym, ~ (tilde), which you can use whenever you need to refer to your home directory. To find your current location within the filesystem, use the command pwd ('print working directory'). For example,

$ pwd
/home/ugrad/chris
$

It is not always convenient to have your home directory as the current directory, since this might involve much typing of absolute filenames if you wish to access a file elsewhere. By means of the command cd ('change directory') you can move around the filesystem. By typing cd followed by the name of a directory, you can make the directory become the current directory (if it exists - if not, an error message will be output and your current directory will not change). For instance, to move to user sam's home directory, and then to a non-existent directory called /squiggle:

$ cd /home/ugrad/sam
$ cd /squiggle
/squiggle: No such file or directory
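
The claim that a failed cd leaves you where you were can be tested directly. This is a small sketch under two assumptions: that /squiggle does not exist on your machine (as in the text), and that the root directory / is as good a starting point as any. The variables before and after are only there for illustration, and the error message is discarded because its exact wording differs between shells.

```shell
cd /                    # start from a directory that certainly exists
before=$(pwd)           # remember the current directory

# Try to move to a directory that (we assume) does not exist.
# cd reports failure through its exit status, which 'if' tests.
if cd /squiggle 2>/dev/null
then
    echo "moved to $(pwd)"
else
    echo "cd failed; still in $(pwd)"
fi

after=$(pwd)            # unchanged if the cd failed
```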

You may also want to know what files exist on the machine. The command ls ('list'), which we have already met, will accomplish this. By default, ls lists the files in the current directory; if, however, you give ls one argument that is the name of a directory (either relative or absolute), the files in that directory will be listed. For instance:

$ ls /
bin etc tmp usr lib home
$ cd /bin
$ pwd
/bin
$ ls
date ls man
$

Try this on your own machine. The output will not look exactly the same, and there will be many more files that are listed. If you give ls an argument that is an ordinary file, not a directory, just that filename will be displayed. Do not be afraid of 'getting lost' by changing to different directories - you can always return to your home directory by typing cd with no arguments (alternatively cd ~). Since ~ always refers to your home directory, you can always refer to files relative to that directory, so if ~ is /home/ugrad/chris, then /home/ugrad/chris/myfile could equally well be referred to as ~/myfile. If you follow ~ by the name of a user, it refers to that user's home directory - so if you are chris then ~ is equivalent to ~chris, and sam has home directory ~sam.
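
A short sketch of how the shell treats ~ may help. It assumes nothing beyond your having a home directory; the file myfile need not exist, since echo merely displays the name that the shell has constructed. The output will of course show your own home directory rather than /home/ugrad/chris.

```shell
cd /                    # move somewhere other than the home directory
cd                      # cd with no arguments returns home
pwd                     # displays the home directory

echo ~                  # the shell replaces ~ by the home directory
echo ~/myfile           # the name of myfile in the home directory
```

Because the replacement is carried out by the shell before the command runs, ~ can be used in this way with any command that expects a filename.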

Worked example

What files does sam have in their home directory?
Solution: Use ls followed by the name of sam's home directory:

$ ls ~sam

When a file is created, space to store it is found on the machine. That space is given a unique number, called an inode (pronounced 'eye-node'), which remains with that file until it is eventually deleted. At creation, the file is also given a name. The file is created in a directory, and at creation the directory is updated so that it contains the name of the file and the inode where that file is stored.
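
The separation between a file's name and its inode can be observed with ls -i, which displays the inode number before each name. The sketch below is illustrative only: it works in a scratch directory created by mktemp (an assumption, so that none of your own files are affected), and it uses mv to change the name of a file and awk to pick out the first field of the output. The variables work, first and second are hypothetical names chosen for the example.

```shell
work=$(mktemp -d)       # scratch directory, an assumption of this sketch
cd "$work"

echo "some text" > myfile
ls -i myfile            # inode number, then the name

# Keep just the inode number (the first field of the output).
first=$(ls -i myfile | awk '{print $1}')

mv myfile newname       # change the name recorded in the directory
ls -i newname

second=$(ls -i newname | awk '{print $1}')

# The name has changed, but the inode has stayed with the file.
[ "$first" = "$second" ] && echo "same inode under a new name"
```

Renaming the file within the directory alters only the directory's record of the name; the inode, and the data stored there, are untouched.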


Copyright © 2002 Mike Joy, Stephen Jarvis and Michael Luck