Administering filesystems

How UNIX systems maintain files and filesystems

The data in a file is not necessarily stored in a single location on the hard disk. On the contrary, it is often scattered across the disk. The data is spread around because the operating system allocates disk space in units of data rather than in whole files. For example, when you create a file, it might be stored on one part of the disk. If you edit that file and delete a few sentences here and there, the file uses less disk space than it did before. The freed space amounts to a series of gaps in the area where your file was stored. Because disk space is a precious commodity, the system allocates those small amounts of disk space to other files.
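
The scattering described above can be illustrated with a toy model. This is only a sketch: the eight-slot disk, the first-free allocation policy, and the names allocate and release are assumptions made for illustration, not the allocator that any real filesystem uses.

```python
# Toy model: the disk is a list of block slots. Each slot holds the name
# of the file that owns it, or None when the block is free.

def allocate(disk, name, count):
    """Give `count` free blocks to `name`, lowest-numbered free blocks first."""
    free = [index for index, owner in enumerate(disk) if owner is None]
    if len(free) < count:
        raise OSError("no space left on toy disk")
    chosen = free[:count]
    for index in chosen:
        disk[index] = name
    return chosen

def release(disk, indexes):
    """Mark the given blocks as free again."""
    for index in indexes:
        disk[index] = None

disk = [None] * 8
a_blocks = allocate(disk, "a", 4)   # file "a" is created
b_blocks = allocate(disk, "b", 2)   # file "b" is created after it
release(disk, [1, 2])               # editing "a" frees two blocks in the middle
c_blocks = allocate(disk, "c", 3)   # file "c" fills the gaps, then spills over

print(c_blocks)
print(disk)
```

In this model, file "c" ends up in two separate regions of the disk because it reuses the gaps left in file "a".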

Each filesystem contains special structures that allow the operating system to access and maintain the files and data stored on the filesystem:

Data blocks
A ``block'' is a 1024-byte unit of data stored on the disk. (Some filesystems use variable block sizes to maximize use of space.) A data block can contain either directory entries or file data. A directory entry consists of an inode number and a filename.
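
As a rough illustration, the sketch below computes how many 1024-byte blocks a file of a given size occupies and lists directory entries as pairs of inode number and filename. The helper names blocks_needed and directory_entries and the sample filenames are assumptions for this example. It uses the portable os.scandir interface rather than reading directory blocks directly.

```python
import os
import tempfile

BLOCK_SIZE = 1024  # block size quoted in the text; other filesystems may differ

def blocks_needed(size_in_bytes, block_size=BLOCK_SIZE):
    """Whole blocks required to hold size_in_bytes (ceiling division)."""
    return (size_in_bytes + block_size - 1) // block_size

def directory_entries(path):
    """Return (inode number, filename) pairs for the entries in a directory."""
    with os.scandir(path) as entries:
        return sorted((entry.inode(), entry.name) for entry in entries)

with tempfile.TemporaryDirectory() as workdir:
    for name in ("notes.txt", "report.txt"):
        with open(os.path.join(workdir, name), "w") as handle:
            handle.write("example\n")
    listing = directory_entries(workdir)

for inode_number, filename in listing:
    print(inode_number, filename)

print(blocks_needed(2500))  # 2500 bytes need 3 blocks of 1024 bytes
```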

Inodes
An ``inode'' (information node) contains all the information about a file except its name and its data, including its size, file type, permissions, owner, and the number of directory entries linked to the file. The inode also records the locations of all the data blocks that make up the file, so the operating system can collect them when needed. The filename is not stored in the inode; it is stored in the directory entries that refer to the inode.
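
Most of the inode information described above can be inspected from a program through the POSIX stat interface. The sketch below is one way to do this in Python. The helper name describe_inode and the filenames are assumptions for illustration. It creates a second directory entry (a hard link) for the same file to show that both names refer to one inode.

```python
import os
import stat
import tempfile

def describe_inode(path):
    """Collect a few inode fields for `path` using os.stat."""
    info = os.stat(path)
    return {
        "inode": info.st_ino,
        "size": info.st_size,
        "is_regular_file": stat.S_ISREG(info.st_mode),
        "permissions": stat.filemode(info.st_mode),
        "owner_uid": info.st_uid,
        "links": info.st_nlink,  # number of directory entries for this inode
    }

with tempfile.TemporaryDirectory() as workdir:
    original = os.path.join(workdir, "original.txt")
    alias = os.path.join(workdir, "alias.txt")
    with open(original, "wb") as handle:
        handle.write(b"hello\n")
    os.link(original, alias)  # second directory entry, same inode
    first = describe_inode(original)
    second = describe_inode(alias)

print(first)
print(second)
```

Neither dictionary contains a filename: the two names live only in the directory, while the inode number and link count are shared.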

Superblock
One special data block, the ``superblock'', contains overall information about the filesystem, just as the inode contains information about a specific file. The superblock contains the information necessary to mount a filesystem and access its data, including the size of the filesystem, the number of free inodes, and information about free space available. When the filesystem is mounted, the system reads information from the disk version of the superblock into memory.
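
Programs do not normally read the superblock directly, but the POSIX statvfs interface reports similar filesystem-wide figures. The following sketch assumes a POSIX system where os.statvfs is available. The helper name filesystem_summary and the choice of the root directory are illustrative only.

```python
import os

def filesystem_summary(path="/"):
    """Report filesystem-wide totals for the filesystem that contains `path`."""
    vfs = os.statvfs(path)
    return {
        "block_size": vfs.f_frsize,     # fundamental block size in bytes
        "total_blocks": vfs.f_blocks,   # filesystem size in blocks
        "free_blocks": vfs.f_bfree,     # free blocks
        "total_inodes": vfs.f_files,    # total inodes
        "free_inodes": vfs.f_ffree,     # free inodes
        "total_bytes": vfs.f_frsize * vfs.f_blocks,
    }

summary = filesystem_summary("/")
for field, value in summary.items():
    print(field, value)
```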

Buffers
To minimize disk seeks, recently used data blocks are held in a cache of special memory structures called ``buffers''. Because reading a block from memory is much faster than reading it from the disk, the buffer cache makes the operating system more efficient. Depending on the filesystem type and the setting of kernel parameters, the buffer cache is ``flushed'' (written to the disk) at set intervals.
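
The idea of a buffer cache can be sketched with a small write-back cache. This is a toy model, not the kernel's implementation: the class name BufferCache, the dictionary standing in for the disk, and the least-recently-used eviction policy are all assumptions made for illustration.

```python
from collections import OrderedDict

class BufferCache:
    """Toy write-back cache holding the most recently used blocks in memory."""

    def __init__(self, disk, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.disk = disk              # dict: block number -> bytes
        self.capacity = capacity
        self.buffers = OrderedDict()  # block number -> [data, dirty flag]
        self.disk_reads = 0

    def _evict(self):
        # Drop the least recently used buffers, writing dirty ones to disk.
        while len(self.buffers) > self.capacity:
            number, (data, dirty) = self.buffers.popitem(last=False)
            if dirty:
                self.disk[number] = data

    def read(self, number):
        if number not in self.buffers:
            self.disk_reads += 1      # cache miss: go to the disk
            self.buffers[number] = [self.disk.get(number, b""), False]
        self.buffers.move_to_end(number)
        self._evict()
        return self.buffers[number][0]

    def write(self, number, data):
        self.buffers[number] = [data, True]  # modified in memory only
        self.buffers.move_to_end(number)
        self._evict()

    def flush(self):
        # Write every dirty buffer back to the disk.
        for number, entry in self.buffers.items():
            if entry[1]:
                self.disk[number] = entry[0]
                entry[1] = False

disk = {0: b"old"}
cache = BufferCache(disk, capacity=2)
cache.read(0)
cache.read(0)                          # second read is served from memory
reads_after_two_lookups = cache.disk_reads
cache.write(1, b"new")
on_disk_before_flush = 1 in disk       # the write has not reached the disk yet
cache.flush()
print(reads_after_two_lookups, on_disk_before_flush, disk[1])
```

Until flush is called, the new block exists only in memory, which is why an unflushed cache can lose data in a crash.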

Certain configurable filesystem mechanisms affect how transactions are managed and committed. Some trade performance against data integrity; others trade performance against system recovery time.

Intent logging records filesystem transactions in a log and later commits them to disk. This speeds system recovery after a crash, at the cost of a very small performance penalty.
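
The principle behind intent logging can be sketched with a toy log that is replayed after a simulated crash. This is a simplified model under stated assumptions, not the on-disk format or algorithm of any UnixWare filesystem: the function names log_transaction and commit_log and the dictionaries standing in for the log and the disk are hypothetical.

```python
def log_transaction(log, changes):
    """Append one transaction (a dict of name -> new value) to the intent log."""
    log.append(dict(changes))

def commit_log(log, disk):
    """Apply each logged transaction to the disk image in order, then drop it."""
    applied = 0
    while log:
        disk.update(log[0])  # replaying an update twice gives the same result
        log.pop(0)
        applied += 1
    return applied

disk = {"report.txt": "v1"}
log = []

log_transaction(log, {"report.txt": "v2"})
log_transaction(log, {"notes.txt": "draft"})

# Simulated crash: the disk still holds the old state, but the log survived.
state_at_crash = dict(disk)

# Recovery only has to replay the short log, not examine the whole filesystem.
replayed = commit_log(log, disk)
print(replayed, disk)
```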

Sync-on-close ensures that all files modified by a process are written back to the disk when they are closed. This minimizes data loss in the event of a power failure, but the additional disk writes can significantly reduce system performance.
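
An application can approximate sync-on-close behavior for its own files by forcing a write to disk before each close. The sketch below shows one way to do this in Python with os.fsync. The helper name open_synced is an assumption for this example and is not a filesystem option or system interface.

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def open_synced(path, mode="wb"):
    """Open a file that is forced to disk before it is closed."""
    handle = open(path, mode)
    try:
        yield handle
    finally:
        try:
            handle.flush()             # move Python's buffer to the kernel
            os.fsync(handle.fileno())  # ask the kernel to write it to disk
        finally:
            handle.close()

with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "journal.txt")
    with open_synced(target) as handle:
        handle.write(b"saved\n")
    with open(target, "rb") as handle:
        contents = handle.read()

print(contents)
```

Each close now waits for the disk, which is the performance cost described above.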


© 2004 The SCO Group, Inc. All rights reserved.
UnixWare 7 Release 7.1.4 - 22 April 2004