volrec(4)
volrec - structure defining a volume record
Synopsis
#include <sys/types.h>
#include <sys/vol.h>
#define Name_LEN 14
#define COMMENT_LEN 40
#define UTIL_NUM 3
#define UTIL_LEN 14
#define Name_SZ (Name_LEN + 1)
#define COMMENT_SZ (COMMENT_LEN + 1)
#define UTIL_SZ (UTIL_LEN + 1)
struct volseqno { ulong_t seqno_lo, seqno_hi; };
typedef struct volseqno volseqno_t;
typedef struct volseqno volrid_t;
struct volrec {
struct v_tmp v_tmp; /* non-persistent fields */
struct v_perm v_perm; /* persistent fields */
};
Fields for the v_perm structure:
char v_name[Name_SZ]; /* record name */
char v_use_type[Name_SZ]; /* volume usage type name */
char v_fstype[FSTYPE_SZ]; /* guess of volume's fstype */
char v_comment[COMMENT_SZ]; /* comment field */
char v_putil[UTIL_NUM][UTIL_SZ]; /* persistent util fields */
char v_state[STATE_SZ]; /* utility state of volume */
char v_pref_name[Name_SZ]; /* plex name if V_PREFER */
char v_start_opts[V_STOPTS_SZ]; /* volume start options */
enum vol_r_pol v_read_pol; /* method of plex selection */
minor_t v_minor; /* minor number in disk group */
uid_t v_uid; /* owner of /dev/vol/name */
gid_t v_gid; /* group of /dev/vol/name */
mode_t v_mode; /* mode of /dev/vol/name */
ulong_t v_pflag; /* persistent volume flags */
long v_pl_num; /* associated plex count */
volseqno_t v_update_tid; /* trans id of last update */
voff_t v_len; /* byte length of volume */
voff_t v_log_len; /* length of log area */
volrid_t v_rid; /* unique identifier */
volrid_t v_pref_plex_rid; /* preferred plex record ID */
volseqno_t v_detach_tid; /* trans id of kernel detach */
Fields for the v_tmp structure:
char v_tutil[UTIL_NUM][UTIL_SZ]; /* non-persistent util fields */
long v_rec_lock; /* 1 if record is locked */
long v_data_lock; /* 1 if volume is data locked */
enum vol_kstate v_kstate; /* relation to file space */
enum vol_except v_r_some; /* if some plex reads fail */
enum vol_except v_w_all; /* if all plex writes fail*/
enum vol_except v_w_some; /* if some plex writes fail */
long v_lasterr; /* last volume error or 0 */
ulong_t v_tflag; /* non-persistent volume flags */
long v_log_serial_lo; /* log serial number/low part*/
long v_log_serial_hi; /* log serial number/hi part */
dev_t v_bdev; /* block dev for volume */
size_t v_iosize; /* minimum size for raw I/Os */
voff_t v_rwback_offset; /* read/write-back offset */
Description
A volrec structure is the structure used to communicate volume record information between the volume configuration daemon, vxconfigd, and programs using the Volume Manager library to query for configurations and to make configuration changes.
The two structures contained in the volrec structure separate the persistent elements of the volume record from the non-persistent ones. The division of fields between the v_tmp and v_perm structures is somewhat historical; however, the v_perm structure contains information that is stored persistently (i.e., fields that are recovered unchanged after a system reboot) or that is directly derivable from persistent volume record information. The v_tmp structure, on the other hand, contains fields that can be modified without the changes being stored persistently.
The uses of the various volume fields are defined as follows:
- v_name
- The volume name. This field cannot be changed directly, although it can be changed by calling
vxvm_rename.
- v_rid
- This is a 64-bit record ID assigned to the volume record. It is unique within the disk group for
as long as the disk group exists. The record ID does not change as a result of a
vxvm_rename, even though the record name changes.
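Because v_rid survives a rename while v_name does not, a program that tracks a volume across transactions might compare record IDs rather than names. The following is a minimal sketch under stated assumptions: unsigned long stands in for ulong_t, and the helper volrid_equal is hypothetical and not part of the Volume Manager library.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long ulong_t;   /* stand-in for the <sys/types.h> typedef */

/* Same layout as shown in the Synopsis. */
struct volseqno { ulong_t seqno_lo, seqno_hi; };
typedef struct volseqno volrid_t;

/* Hypothetical helper: return 1 if both record IDs name the same record. */
static int
volrid_equal(const volrid_t *a, const volrid_t *b)
{
    assert(a != NULL && b != NULL);
    return a->seqno_lo == b->seqno_lo && a->seqno_hi == b->seqno_hi;
}
```

A caller would pass the addresses of the v_rid members of two volume records.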
- v_use_type
- The usage type associated with the volume. This is used to select a utility set that maintains state
and plex consistency in a manner appropriate to the usage of the volume.
- v_fstype
- The file system type of any file system residing on the volume. A usage type may choose to use or
ignore this field.
- v_comment
- A null-terminated comment string associated with the record. The contents are arbitrary except that
they cannot contain a newline.
- v_putil
- An array of three null-terminated strings that can be used as scratch pads by utilities. These fields
are preserved across reboots. By convention, the first field is reserved for usage types; the
second field for higher-level applications, such as the VERITAS Visual Administrator; and
the third field for local site administrators.
- v_state
- A null-terminated state field that is reserved specifically for use by usage types.
- v_pref_name
- The name of the preferred plex for use when the v_read_pol field is set to V_PREFER. This field
is derived from the v_pref_plex_rid field.
- v_start_opts
- This is an arbitrary string that is reserved for usage-type utilities. The intention is that this field be
used to store options that apply to the volume, such as for the volume start operation. This
is normally a comma-separated list of flag names and option=value pairs. See the gen and
fsgen versions of vxvol(1M) for information on how this field is used by the gen and
fsgen utilities.
- v_read_pol
- The policy for selecting plexes to satisfy volume read operations. This can have one of the following
values:
- V_ROUND
- Candidate plexes are selected in sequence for each sequential volume read operation. This
is known as a round-robin approach.
- V_PREFER
- The plex named by the v_pref_name field is used if it can satisfy the read request. If the
preferred plex cannot satisfy the read request, then this policy becomes equivalent
to the round-robin policy.
- V_R_POL_SELECT
- A default policy is selected based on the current configuration of the volume. If the volume
has two or more active plexes, and exactly one of those plexes is striped, then the
striped plex is preferred; otherwise, the round-robin read policy is used.
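The V_R_POL_SELECT rule above can be sketched as a small function. This is an illustration only: the enum values are stand-ins for those in <sys/vol.h>, and struct plex_info and resolve_select are hypothetical names that do not appear in the Volume Manager headers.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the enum declared in <sys/vol.h>; numeric values are assumed. */
enum vol_r_pol { V_ROUND, V_PREFER, V_R_POL_SELECT };

/* Hypothetical per-plex summary, for illustration only. */
struct plex_info {
    int active;    /* nonzero if the plex is active */
    int striped;   /* nonzero if the plex is striped */
};

/*
 * Resolve V_R_POL_SELECT to an effective policy.  If the volume has two
 * or more active plexes and exactly one of them is striped, prefer that
 * plex and store its index in *preferred; otherwise fall back to
 * round-robin and store -1.
 */
static enum vol_r_pol
resolve_select(const struct plex_info *pl, int n, int *preferred)
{
    int i, active = 0, striped = 0, idx = -1;

    assert(pl != NULL && preferred != NULL);
    for (i = 0; i < n; i++) {
        if (!pl[i].active)
            continue;
        active++;
        if (pl[i].striped) {
            striped++;
            idx = i;
        }
    }
    if (active >= 2 && striped == 1) {
        *preferred = idx;
        return V_PREFER;
    }
    *preferred = -1;
    return V_ROUND;
}
```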
- v_minor
- The minor number of the block and character volume devices associated with the volume record.
The volume minor number is assigned when the volume is created. This is a read-only
field. Conditions may force the actual volume device minor number to differ from the
v_minor field. This can happen in disk groups other than rootdg, if a conflict occurs. This
can also happen in the rootdg disk group if the V_PFLAG_FORCEMINOR flag is used
to force a particular value for v_minor, even if the indicated number is unavailable.
- v_uid, v_gid, and v_mode
- The user ID, group ID, and permission modes for the volume's block and character device nodes,
and for the device nodes for the associated plexes.
- v_pflag
- Flags associated with the volume that are preserved across reboots. The set of persistent flags that
can be set is:
- V_PFLAG_WRITEBACK
- The write-back-on-read-failure flag. If set, then an attempt is made to fix a read error from
a participating plex (i.e., one without the noerror flag). The method used to fix
the read error is to read from another plex associated with the volume and write
back to the plex with the read error. The read operation is then retried to verify
that the operation is fixed. This requires at least two associated, enabled,
participating, read-mode plexes.
- This is an effective way of handling device drivers that can revector blocks on write
failures, and can be used to handle the majority of media failures on many disk
drives. For this operation to be effective, the underlying device driver must not
revector blocks on read errors.
- V_PFLAG_WRITECOPY
- If set (vxmake and vxassist set this by default), then some writes to mirrored volumes that
use dirty region logging will be copied into an allocated kernel buffer before being
written to disk. The reason for doing such a copy is that write requests given to the
volume device driver can point to pages of memory that are still undergoing
change. Without doing a copy, the blocks written to each plex might be different.
If you are sure that your application does not modify pages while they are written,
or if you are certain that mirrors with differing contents do not represent a
problem, then you can turn off this flag.
- V_PFLAG_ACTIVE
- This flag is set on a reboot if the volume was open at the time of a system crash, and the
volume had been written at least once. This implies that the volume, if it is
mirrored, requires recovery to ensure consistency between plexes.
- V_PFLAG_FORCEMINOR
- If this flag is set, the v_minor value specified when the volume record is created is forced.
If this flag is not set, v_minor might be remapped to an unused value. This flag
is required to set minor numbers less than 5. Setting the flag does not guarantee that the
actual volume device node will have the indicated minor number. However, if the
volume is in rootdg, then the volume will be given that minor number (if no other
volume in the disk group has that minor number) after a reboot.
- V_PFLAG_LOGTYPE
- This is a bit-mask that specifies bits in the v_pflag field that indicate the logging type for
the volume. The bits masked out by this macro can have one of the following
values:
- V_PFLAG_LOGUNDEF
- The logging type is undefined. Volumes that were created in Release 1.0 of the
Volume Manager have this type. This value is effectively identical to
V_PFLAG_LOGNONE except that utilities are able to use the
V_PFLAG_LOGUNDEF flag as a license to default the logging type
to something else.
- V_PFLAG_LOGNONE
- No logging is performed for the volume. Even if a logging subdisk is defined for
a plex, the logging subdisk is not used.
- V_PFLAG_LOGDRL
- A dirty region log is written periodically to each log subdisk associated with an
associated, enabled plex. This log keeps track of the regions that have
changed due to I/O writes to a mirrored volume. For any write operation
to the volume, before writing the data, the regions being written are
marked dirty in the log. If a write causes a region to become dirty when
it was previously clean, the log is written to disk before the write
operation can occur.
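Because V_PFLAG_LOGTYPE is a mask rather than a single flag, the logging type is read by masking v_pflag and comparing the result against one of the type values. The sketch below shows the idiom. The numeric bit values are placeholders chosen for illustration (the real values come from <sys/vol.h> and are not given here), and the helper names vol_log_type and vol_set_log_type are hypothetical.

```c
#include <assert.h>

/*
 * Placeholder bit values, assumed for this example only.  The real
 * definitions are supplied by <sys/vol.h>.
 */
#define V_PFLAG_LOGTYPE   0x0300UL   /* mask covering the log-type bits */
#define V_PFLAG_LOGUNDEF  0x0000UL
#define V_PFLAG_LOGNONE   0x0100UL
#define V_PFLAG_LOGDRL    0x0200UL

/* Return only the log-type bits of a v_pflag value. */
static unsigned long
vol_log_type(unsigned long v_pflag)
{
    return v_pflag & V_PFLAG_LOGTYPE;
}

/* Return v_pflag with its log type replaced; all other bits are kept. */
static unsigned long
vol_set_log_type(unsigned long v_pflag, unsigned long type)
{
    assert((type & ~V_PFLAG_LOGTYPE) == 0);   /* type must fit in the mask */
    return (v_pflag & ~V_PFLAG_LOGTYPE) | type;
}
```

Testing a log type with a plain bitwise AND against V_PFLAG_LOGDRL would be unreliable if the type values share bits, which is why the comparison is made after masking.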
- v_pl_num
- The number of plexes associated with the volume.
- v_update_tid
- The transaction ID of the last update to this record. This field is assigned when changes to a disk
group are committed.
- v_len
- The length of the volume. This can be set arbitrarily, even if it is longer or shorter than some or all
of the associated plexes. This value is in sectors.
- v_log_len
- The length for a volume log. For the block-change-logging log type, this value must always be 1.
However, future logging types may support larger log lengths. The length for all subdisk
logs associated with the volume must be at least this long. This value is in sectors.
- v_pref_plex_rid
- The record ID of the preferred plex for the volume. This field is used only if v_read_pol is
set to V_PREFER.
- v_tutil
- An array of three null-terminated strings that can be used as scratch pads by utilities. These fields
are cleared on reboot. By convention, the first field is reserved for usage types; the second
field for higher-level applications, such as OA&M scripts and the VERITAS Visual
Administrator; and the third field for local site administrators.
- v_rec_lock
- A boolean value that is 1 if the volume record is locked in the caller's current transaction, and 0
otherwise. This is a read-only field.
- v_data_lock
- A boolean value that is 1 if the volume is data-locked in the caller's current transaction, and 0
otherwise. This is a read-only field.
- v_kstate
- The accessibility of the volume. This field can have one of the following values:
- V_ENABLED
- The volume block device can be used, and reads and writes to the block or character volume
device are accepted.
- V_DETACHED
- The volume block device cannot be used, and reads or writes to the character device are
rejected. Volume ioctls are still usable, and the plex devices for associated plexes
can be used, within the bounds of the plex pl_kstate fields.
- V_DISABLED
- The volume cannot be used for any operations, and neither can the plex devices for any of
the associated plexes.
- This field is set to V_DISABLED after a reboot.
- v_r_all, v_r_some, v_w_all, and v_w_some
- Exception policies for the volume. Each field sets the policy for one of the following exception types:
- v_r_all
- Read failure on all plexes
- v_r_some
- Read failure on some plexes
- v_w_all
- Write failure on all plexes
- v_w_some
- Write failure on some plexes
- If one of these exception conditions is encountered, then the corresponding action is taken. The
possible actions are:
- V_NO_OP
- Takes no action. However, if the operation fails for all candidate plexes, then the operation
still fails.
- V_FAIL_OP
- Fails the operation, but takes no further action.
- V_DET_PL
- Detaches the plex with the failure. The operation fails only if the operation fails for all
candidate plexes.
- V_FAIL_DET_PL
- Detaches the plex with the failure and returns a failure for the operation, even if the
operation can be satisfied by another plex.
- V_DET_VOL
- Detaches the volume but does not fail the operation.
- V_FAIL_DET_VOL
- Detaches the volume and fails the operation.
- V_GEN_DET
- A higher-level error policy which detaches failing plexes. However, if detaching a
complete plex would result in no complete plexes remaining, then V_GEN_DET
detaches the volume rather than detaching the failing plexes. A complete plex is
one that has the PL_TFLAG_COMPLETE flag set in the plex pl_tflag field.
- V_GEN_DET_SPARSE
- A higher-level error policy which detaches failing plexes. However, if detaching a plex
results in no complete plexes remaining, then V_GEN_DET_SPARSE leaves
exactly one complete plex enabled, and detaches all incomplete plexes that have
volume blocks mapped to subdisks in the region of the failure. This policy allows
the volume to continue operating on a failing plex, and does not disable mirrored
regions that are unaffected by the failing operation.
- In the case of a logging volume, the volume is detached if a write failure occurs to all
enabled log subdisks associated with the volume.
- V_GEN_FAIL
- Detaches the failing plexes, and the volume, and returns a failure for the operation. This
policy can be used by applications that wish to make decisions about changing the
Volume Manager configuration based on failures. The detached state of a plex
can be used as an indication of which plexes failed, and making the volume
detached prevents future I/Os from succeeding until the problem is resolved.
- V_GEN_DET2
- This operates exactly like the V_GEN_DET error policy, except that it detaches the
volume if the number of complete plexes would drop below two. This ensures that
a volume is either mirrored to at least two plexes, or is non-operational until the
situation is repaired.
- Not all plexes are taken into account in the exception policy selection or actions. A plex is ignored
under any of the following conditions:
- The plex is not enabled.
- The plex does not have a read or write mode appropriate for the operation.
- The plex has the PL_PFLAG_NOERROR flag set.
- The plex does not have mapped subdisk blocks that are appropriate for the range of the requested operation.
- The exception policies are normally set implicitly by the operational utilities. The utilities provided by VERITAS set all the exception policies to V_GEN_DET_SPARSE and do not provide a means for changing the policies to something else.
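The four conditions under which a plex is ignored can be read as a single predicate. The sketch below is an illustration under simplifying assumptions: struct plex_view, enum io_dir, and plex_counts are hypothetical names, each plex is modeled as having one contiguous mapped range, and "appropriate for the range" is interpreted as any overlap with the requested range.

```c
#include <assert.h>
#include <stddef.h>

enum io_dir { IO_READ, IO_WRITE };   /* hypothetical */

/* Hypothetical summary of the plex attributes that matter here. */
struct plex_view {
    int enabled;               /* plex is enabled */
    int readable;              /* plex mode allows reads */
    int writable;              /* plex mode allows writes */
    int noerror;               /* PL_PFLAG_NOERROR is set */
    unsigned long map_start;   /* first mapped sector (assumed single range) */
    unsigned long map_len;     /* number of mapped sectors */
};

/*
 * Return 1 if the plex takes part in exception policy selection for an
 * operation on sectors [start, start + len), or 0 if it is ignored.
 */
static int
plex_counts(const struct plex_view *p, enum io_dir dir,
            unsigned long start, unsigned long len)
{
    assert(p != NULL);
    if (!p->enabled)
        return 0;
    if (dir == IO_READ ? !p->readable : !p->writable)
        return 0;
    if (p->noerror)
        return 0;
    /* No overlap between the request and the mapped range. */
    if (start >= p->map_start + p->map_len || start + len <= p->map_start)
        return 0;
    return 1;
}
```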
- v_lasterr
- A sequence number for the last I/O error to be encountered on the volume. This is a read-only field.
- v_tflag
- A bitmask of flags that is cleared after a reboot. Flags defined in this field are:
- V_TFLAG_RWBACK
- A flag that can be turned on to request read/writeback mode. In read/writeback mode, a
read request for a mirrored volume will write back to all other plexes the resulting
data from the read. The operation is affected by the v_rwback_offset field. This
mode is intended for volume recovery operations.
- V_TFLAG_KRWBACK
- This is a status flag which indicates that the read/writeback mode operation is still in effect.
This flag is set when V_TFLAG_RWBACK is set. If the read/writeback offset
(see v_rwback_offset) reaches the end of the volume, then the kernel will turn off
this flag.
- VK_OPEN
- A status flag that indicates that the volume device that corresponds to the volume record is
open or mounted as a file system.
- V_TFLAG_LOGGING
- A status flag which indicates the volume has a logging type of
VOL_PFLAG_LOGBLKNO, is enabled, and has at least one enabled,
associated plex with an enabled log subdisk. This flag is not cleared when
exception policies are invoked that detach a volume or its plexes.
- V_TFLAG_INVALID
- An error has rendered the volume unusable. The volume cannot be started.
- v_log_serial_lo and v_log_serial_hi
- These values, taken together, yield a unique monotonically increasing value that is changed for
every log write that occurs to a volume with logging enabled. These two numbers are
cleared by a reboot, but are normally set explicitly by a vxvol start operation. The value
in v_log_serial_lo is incremented by one for every log write. When the value would
surpass LONG_MAX (normally 2147483647 for 32-bit machines), v_log_serial_lo is set
to zero and v_log_serial_hi is incremented by one. Thus, on 32-bit machines,
v_log_serial_lo and v_log_serial_hi represent the low 31 bits and the high 31 bits,
respectively, of a 62-bit number.
- Unlike all other fields, the values of the log serial number fields cannot always be trusted within a
transaction. The reason for this is that data-locks are not obtained by
vxconfigd until after a utility has completely described a transaction for vxconfigd to
transmit to the kernel. Other fields that can be changed by the kernel are checked at the
time of a vol_commit to ensure that the fields haven't changed, and if any kernel-
modifiable fields have changed since the corresponding vol_trans call, then the utility is
asked to retry the transaction.
- However, a volume with significant I/O activity is likely to change the value of the serial number
fields often enough that transactions on such volumes might have to be retried an unacceptable
number of times, so these fields are not checked.
- Utilities must be prepared to ensure that volume logs are in a quiescent state (normally by setting the
volume to V_DETACHED or by disabling logging) before using the value of a log serial number
within a transaction. The existing utility set uses the log serial number fields only to set the serial
number for a volume.
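The rollover described for v_log_serial_lo and v_log_serial_hi can be sketched as follows. This models the 32-bit case by using int32_t in place of long, so that the limit is 2147483647 regardless of the machine the example is compiled on. The struct and function names are hypothetical, and both parts are assumed to be non-negative.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two v_tmp fields on a 32-bit machine. */
struct log_serial {
    int32_t lo;   /* models v_log_serial_lo */
    int32_t hi;   /* models v_log_serial_hi */
};

/* Advance the serial number by one log write. */
static void
log_serial_next(struct log_serial *s)
{
    if (s->lo == INT32_MAX) {
        assert(s->hi < INT32_MAX);   /* the 62-bit space is exhausted otherwise */
        s->lo = 0;
        s->hi++;
    } else {
        s->lo++;
    }
}

/* Combine the low 31 bits and the high 31 bits into one 62-bit value. */
static uint64_t
log_serial_value(const struct log_serial *s)
{
    return ((uint64_t)s->hi << 31) | (uint64_t)s->lo;
}
```

Note that the low part wraps after 2147483647 rather than after a power of two minus one of the full word, so the combined value uses a shift of 31 and stays monotonically increasing across the rollover.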
- v_bdev, v_cdev
- The device numbers for the volume block and character device nodes. Normally, these are computed
from the v_minor number. However, in cases of collision, they may have different minor numbers.
- v_iosize
- The largest sector size of any disk associated (through a subdisk) with the volume. At the present
time, only one sector size (normally 512 bytes) is supported, so this field will always match
the single system sector size.
- v_rwback_offset
- When read/writeback mode is turned on, this field is loaded into the kernel as the current
read/writeback offset pointer. Reads that occur before this offset into the volume do not
invoke read/writeback recovery. If a read occurs on the boundary, then the kernel
increases the pointer to the end of that read after a successful result from the operation. This
automatically increasing pointer causes the degradation from read/writeback mode to
decrease as volume recovery progresses.
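The behavior of the read/writeback offset can be modeled with a short function. This is a sketch of the description above and not of the kernel implementation: the name rwback_read is hypothetical, unsigned long long stands in for voff_t, a read "on the boundary" is assumed to mean one that starts at or before the offset and ends after it, and reads that lie entirely past the offset are assumed to invoke write-back without moving the pointer.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long long voff_t;   /* stand-in for the <sys/vol.h> type */

/*
 * Model one read of [start, start + len) while read/writeback mode is on.
 * Returns 1 if the read invokes write-back to the other plexes, else 0.
 * On a successful boundary read, *offset advances to the end of the read.
 */
static int
rwback_read(voff_t *offset, voff_t start, voff_t len, int ok)
{
    voff_t end;

    assert(offset != NULL);
    end = start + len;
    if (end <= *offset)
        return 0;            /* entirely inside the recovered region */
    if (start <= *offset && ok)
        *offset = end;       /* boundary read succeeded: advance pointer */
    return 1;
}
```

A recovery utility that reads the volume sequentially from the start therefore moves the offset forward with every successful read, and ordinary reads behind it proceed at normal cost.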
References
vxconfigd(1M),
vxintro(1M),
vxiod(1M),
vxmake(1M),
vxvol(1M),
plexrec(4),
sdrec(4)
© 1997 The Santa Cruz Operation, Inc. All rights reserved.