Managing filesystem types

Data integrity and caching disk controllers

NOTE: In the following discussion, the term "caching disk controller" refers to items such as caching ESDI controllers, caching IDE disks, and caching SCSI host adaptors.

Sometimes a filesystem must perform an operation in several steps that it would ideally like to perform in one step. For instance, creating a file requires the following steps:

Each of these steps requires writing to a different part of the disk, so if the sequence of writes is interrupted, filesystem damage occurs. For most multistep operations, the order in which the writes to disk occur is important to minimize the loss of filesystem consistency if the sequence is interrupted.

Because caching disk controllers can delay and/or change the order in which data is written to the disk, independent of the filesystem's intentions, a filesystem's data or structure can become corrupted if the writes are interrupted. Filesystem corruption by this means can occur with any filesystem on any operating system, whether it be DOS, OS/2, UnixWare, or some other operating system.
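The effect of reordering can be illustrated with a toy model. The sketch below is not how any real filesystem stores its metadata. It assumes a hypothetical two-step file creation (initialize an inode, then add a directory entry that names it), and every name in it (`apply_writes`, `dangling_entries`, the inode number, the file name) is invented for illustration.

```python
def apply_writes(writes, interrupt_after):
    """Apply only the first `interrupt_after` writes to an empty toy disk."""
    disk = {}
    for key, value in writes[:interrupt_after]:
        disk[key] = value
    return disk


def dangling_entries(disk):
    """Return directory entries that name an inode that was never initialized."""
    return [name for (kind, name), ino in disk.items()
            if kind == "dirent" and ("inode", ino) not in disk]


# Hypothetical intended order: initialize the inode first, then link it.
intended = [(("inode", 7), "initialized"), (("dirent", "report.txt"), 7)]
# The same writes in the opposite order, as a write-back cache might issue them.
reordered = list(reversed(intended))

# Interrupt each sequence after exactly one write has reached the disk.
safe = apply_writes(intended, 1)
unsafe = apply_writes(reordered, 1)

print(dangling_entries(safe))    # no directory entry exists yet
print(dangling_entries(unsafe))  # the entry points at an uninitialized inode
```

In the intended order, an interruption leaves at worst an unreferenced inode. In the reordered case, the directory entry refers to an inode that was never written, which is the more damaging kind of inconsistency.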

Avoiding filesystem damage

A filesystem (and the disk as well) can be severely damaged if one of the following events happens while data is being written to the disk:

To avoid filesystem damage, here are some suggestions (these suggestions apply whether or not you have a caching disk controller):

If you have a caching disk controller, carefully read the documentation that came with it. If the cache is only used for reading, or it is a write-through cache, you need not worry about consistency problems related to the caching disk controller. If the cache is a write-back cache, consider the issues raised in the rest of this section.
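The difference between the two cache types can be sketched as follows. `ToyController` is a hypothetical class written for this illustration, not a model of any particular controller. It assumes that a write-through cache reports a write as finished only after the data is on disk, while a write-back cache reports it as finished as soon as the data is in the cache.

```python
class ToyController:
    """Toy disk controller with an optional write-back cache (illustration only)."""

    def __init__(self, write_back):
        self.write_back = write_back
        self.cache = {}  # blocks acknowledged but not yet on the disk
        self.disk = {}   # blocks that have reached stable storage

    def write(self, block, data):
        if self.write_back:
            # Acknowledged immediately; written to disk at some later time.
            self.cache[block] = data
        else:
            # Write-through: the data is on disk before the call returns.
            self.disk[block] = data

    def flush(self):
        """Move every cached block to the disk."""
        self.disk.update(self.cache)
        self.cache.clear()

    def power_loss(self):
        """Discard the cache and return the block numbers that were lost."""
        lost = sorted(self.cache)
        self.cache.clear()
        return lost


write_through = ToyController(write_back=False)
write_through.write(1, "a")

write_back = ToyController(write_back=True)
write_back.write(1, "a")
write_back.write(2, "b")
```

Calling `power_loss()` on the write-through controller loses nothing, while the write-back controller loses every block that had not yet been flushed.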

With caching controllers, there is a trade-off between increased performance and the risk of corruption. The risk depends on how often the controller is interrupted while flushing data from the cache to the disk. If you cannot accept this risk, you can take one of the following actions (all decrease the performance of the disk controller):

Administering filesystems on caching disk devices

Normally, filesystem damage is repaired automatically when you bring up the computer. If a caching disk controller changes the order of writes and is then interrupted before the entire cache is written to disk, the filesystem might appear to be undamaged when damage is in fact present.

The only way this can happen for s5, ufs, sfs, or bfs is if the superblock is stamped clean before other writes have completed. A superblock is stamped clean only when the filesystem has been unmounted or when fsck(1M) has been run on it. In this case, fsck is not run when the computer comes back up, because the filesystem appears clean. In all other cases, the filesystem does not appear to be clean, and fsck checks the entire filesystem.
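The failure described above can be sketched with a toy model. It assumes a simplified unmount that writes one pending data block and then stamps the superblock clean. The functions `boot_decision` and `interrupted`, and the block names, are hypothetical and do not reflect the on-disk layout of s5, ufs, sfs, or bfs.

```python
def boot_decision(disk):
    """Skip the check only if the superblock is stamped clean."""
    if disk.get("superblock") == "clean":
        return "skip fsck"
    return "run full fsck"


def interrupted(writes, completed):
    """Return the toy disk after only the first `completed` writes landed."""
    disk = {"superblock": "dirty"}  # state of a mounted filesystem
    disk.update(writes[:completed])
    return disk


# Intended unmount order: flush pending data, then stamp the superblock clean.
unmount_order = [("data_block", "final contents"), ("superblock", "clean")]
# The same writes in the order a write-back cache might choose.
reordered = unmount_order[::-1]

in_order = interrupted(unmount_order, 1)
out_of_order = interrupted(reordered, 1)

print(boot_decision(in_order))      # superblock still dirty, so fsck runs
print(boot_decision(out_of_order))  # stamped clean although data is missing
```

In the reordered case the data block never reached the disk, yet the boot-time decision is to skip the check, so the damage goes unnoticed.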

With vxfs, however, things are more problematic. Remember that certain operations that appear to the user as a single step or write to disk actually involve several steps or writes to disk. vxfs records these steps in the intent log on the disk before performing them. If the operation is interrupted, fsck for vxfs does not have to check the entire filesystem. Instead, it only has to check the intent log to see what steps need to be done to complete an operation, or what steps need to be undone to undo an operation. Thus, because the disk is not usually checked at all (or is checked only superficially), the fsck for vxfs is very fast.

This mode of fsck, however, depends on the order of writes to disk, which a caching controller might change.
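The idea of recovering from an intent log can be sketched as follows. This is a generic toy log, not the vxfs on-disk format or its actual recovery algorithm. The record layout (a `complete` flag and a list of `(block, new_value, old_value)` steps) and the function name `recover` are assumptions made for this illustration.

```python
def recover(disk, intent_log):
    """Replay a toy intent log: redo complete records, undo incomplete ones."""
    for record in intent_log:
        if record["complete"]:
            # Every step was logged: carry the operation through.
            for block, new, _old in record["steps"]:
                disk[block] = new
        else:
            # The record is incomplete: restore old values in reverse order.
            for block, _new, old in reversed(record["steps"]):
                if old is None:
                    disk.pop(block, None)  # the block did not exist before
                else:
                    disk[block] = old
    return disk


intent_log = [
    {"complete": True,
     "steps": [("inode_7", "in use", "free"), ("dirent_a", 7, None)]},
    {"complete": False,
     "steps": [("inode_8", "in use", "free")]},
]

# State after a crash: the first operation was never applied,
# and the second was applied only in part.
disk = {"inode_7": "free", "inode_8": "in use"}
recovered = recover(disk, intent_log)
```

Only the blocks named in the log are examined, which is why this style of recovery is fast. It is also why it relies on the log records reaching the disk before the changes they describe: if a controller reorders those writes, the log no longer describes what is actually on the disk.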

Do you suspect filesystem damage?

If you suspect filesystem damage that the computer is overlooking, do the following:

© 2004 The SCO Group, Inc. All rights reserved.
UnixWare 7 Release 7.1.4 - 22 April 2004