Multiple controllers of the same type (except RB730); for example, more than one MBA or RK611 can be used on the system
Multiple disk drives per controller (the exact number depends on the controller)
Different types of disk drives on a single controller
Static dual porting (MBA drives only)
Overlapped seeks (except RL02, RX01, RX02, and TU58)
Data checks on a per-request, per-file, or per-volume basis (except RX01 and RX02)
Full recovery from power failure for online disk drives with volumes mounted
Extensive error recovery algorithms, such as error correction code (ECC) and offset (except RB02, RL02, RX01, RX02, and TU58); for DSA disks, these algorithms are implemented in the controller
Dynamic, as well as static, bad block handling
Logging of device errors in a file that can be displayed by field service personnel or customer personnel
Online diagnostic support for drive-level diagnostics
Multiple-block, noncontiguous, virtual I/O operations at the driver level
Logical-to-physical sector translation (RX01 and RX02 only)
The following sections describe these features
in greater detail.
2.1.1 Data Check
A data check is made after successful completion of a
read or write operation and, except for the TU58, compares the data
in memory with the data on disk to make sure they match.
Disk drivers support data checks at the following
levels:
Per request—You
can specify the data check function modifier (IO$M_DATACHECK) on a read logical
block, write logical block, read virtual block, write virtual block,
read physical block, or write physical block operation. IO$M_DATACHECK
is not supported for the RX01 and RX02 drivers.
Per volume—You
can specify the characteristics “data check all reads”
and “data check all writes” when the volume is mounted.
The HP OpenVMS DCL Dictionary describes volume mounting and dismounting.
The HP OpenVMS System Services Reference Manual describes the Mount
Volume ($MOUNT) and Dismount Volume ($DISMOUNT) system services.
Per file—You can
specify the file access attributes “data check on read”
and “data check on write.” File access attributes are specified when the file is
accessed. Chapter 1 of this
manual and the OpenVMS Record Management Services Reference Manual
describe file access.
Offset recovery
is performed during a data check, but error correction code (ECC)
correction is not (see “Error Recovery”). For example, if a read operation
completes only after an ECC correction is applied, the data check
fails even though the data in memory is correct, because the uncorrected
data on disk does not match it. In this case, the
driver returns a status code indicating that the operation completed
successfully but that the data check could not be performed because of
an ECC correction.
This condition is extremely rare on read operations.
When it occurs, you can accept the data as is, treat the ECC correction
as an error, or accept the data but immediately move it to another
area on the disk volume.
A data check operation directed to a TU58 does
not compare the data in memory with the data on tape. Instead,
either a read check or a write check operation is performed (see “Read” and “Write”).
2.1.2 Effects of a Failure During an I/O Write Operation
The operating system ensures that when an I/O
write operation returns a successful completion status, the data is
available on the disk or tape media. Applications that must guarantee
the successful completion of a write operation can verify that the
data is on the media by specifying the data check function modifier
IO$M_DATACHECK. Note that the IO$M_DATACHECK data check function,
which compares the data in memory with the data on disk, affects performance
because the function incurs the overhead of an additional read operation
to the media.
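The cost of the extra read can be illustrated with a user-level analogue. The following is a minimal sketch, not the driver's implementation: it writes one 512-byte block to an ordinary file, forces it out with `os.fsync`, and then reads it back for comparison. The function name `write_block_with_check` is hypothetical, and on most systems the read-back may be satisfied from an operating system cache rather than from the media, so this only mirrors the shape of the operation that IO$M_DATACHECK requests from the driver.

```python
import os
import tempfile

BLOCK_SIZE = 512  # bytes in one disk block


def write_block_with_check(f, block_number, data):
    """Write one block to a binary file object, then read it back and compare.

    Returns True if the data read back matches the buffer in memory.
    This is a user-level analogue only; it is not how the driver works.
    """
    if len(data) != BLOCK_SIZE:
        raise ValueError("data must be exactly one block long")
    offset = block_number * BLOCK_SIZE
    f.seek(offset)
    f.write(data)
    f.flush()
    os.fsync(f.fileno())      # push the write toward the device
    f.seek(offset)            # the additional read that costs performance
    return f.read(BLOCK_SIZE) == data


if __name__ == "__main__":
    with tempfile.TemporaryFile() as f:
        ok = write_block_with_check(f, 0, b"\x5a" * BLOCK_SIZE)
        print("data check passed" if ok else "data check failed")
```

Every checked write in this sketch performs one write and one read of the same block, which is the overhead the paragraph above describes.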
If a system failure occurs while a multiple-block
write operation is in progress, the operating system does not guarantee
the successful completion of the write operation. (OpenVMS does guarantee
single-block write operations to DSA drives.) When a failure interrupts
a write operation, the data may be left in any one of the following
conditions:
The new data is written
completely to the disk blocks on the media, but a completion status
was not returned before the failure.
The new data is partially
written to the media so that some of the disk blocks involved in the
I/O contain the data from the write operation in progress, and the
remainder of the blocks contain the data that was present before the
write operation.
The new data was never
written to the disk blocks on the media.
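A recovery routine that still knows both the previous and the intended contents can tell these three conditions apart by inspecting the blocks. The sketch below is illustrative only; the function name `classify_interrupted_write` and the result strings are hypothetical, and it assumes the caller has kept copies of the old and new block contents.

```python
def classify_interrupted_write(old_blocks, new_blocks, on_disk):
    """Classify the state of a block range after an interrupted write.

    Each argument is a list of block contents of the same length.
    Returns "complete", "not_written", "partial", or "unknown".
    """
    if not (len(old_blocks) == len(new_blocks) == len(on_disk)):
        raise ValueError("all three block lists must have the same length")
    if on_disk == new_blocks:
        return "complete"       # all new data reached the media
    if on_disk == old_blocks:
        return "not_written"    # none of the new data reached the media
    # Any mixture of old and new blocks counts as a partial write;
    # the new blocks need not form a contiguous prefix.
    if all(d in (o, n) for d, o, n in zip(on_disk, old_blocks, new_blocks)):
        return "partial"
    return "unknown"            # contents match neither version
```

Note that a "complete" result cannot distinguish a write that finished before the failure from one whose completion status was simply never returned.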
To guarantee that a write operation either finishes
successfully or (in the event of failure) is redone or rolled back
as if it were never started, use additional techniques to ensure data
correctness and recovery. For example, using database journaling and
recovery techniques allows applications to recover automatically from
failures such as the following:
Permanent loss of the
path between a CPU data buffer containing the data being written and
the disk being written to during a multiple-block I/O operation. Communication
path loss can occur due to node or controller failure or a failure
of node-to-node communications.
Failure of a CPU (such
as a system failure, system halt, power failure, or system shutdown)
during a multiple-block write operation.
Mistaken deletion of a
file.
Corruption of file system
pointers.
File corruption due to
a software error or incomplete bucket write operation to an indexed
file.
Cancellation of an in-progress
multiple-block write operation.
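The redo form of journaling can be sketched with a simulated disk. This is a minimal illustration under stated assumptions, not a description of any OpenVMS journaling product: the class `JournaledDisk`, the exception `CrashError`, and the `crash_after` parameter are all hypothetical, and the sketch assumes that appending a journal entry is itself atomic and survives the failure.

```python
class CrashError(Exception):
    """Simulated system failure in the middle of a multiple-block write."""


class JournaledDisk:
    def __init__(self, block_count):
        self.blocks = [b""] * block_count  # simulated media
        self.journal = []                  # simulated stable journal

    def write_blocks(self, start, new_blocks, crash_after=None):
        """Write blocks, recording the intent in the journal first.

        If crash_after is given, a failure is simulated after that many
        blocks have reached the media.
        """
        entry = {"start": start, "data": list(new_blocks), "done": False}
        self.journal.append(entry)         # 1. record intent before touching data
        for i, data in enumerate(new_blocks):
            if crash_after is not None and i == crash_after:
                raise CrashError(f"failed after {i} block(s)")
            self.blocks[start + i] = data  # 2. apply to the media
        entry["done"] = True               # 3. mark the write complete

    def recover(self):
        """Redo every journaled write that was not marked complete."""
        redone = 0
        for entry in self.journal:
            if not entry["done"]:
                for i, data in enumerate(entry["data"]):
                    self.blocks[entry["start"] + i] = data
                entry["done"] = True
                redone += 1
        return redone
```

Because redoing a write simply stores the same contents again, recovery is safe whether the interrupted write reached none, some, or all of its blocks.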
2.1.3 Error Recovery
Error recovery in the
operating system is aimed at performing all possible operations to
complete an I/O operation successfully. Error recovery operations
fall into the following categories:
Handling special conditions
such as power failure and interrupt timeout.
Retrying nonfatal controller
and drive errors. For DSA and SCSI disks, this function is implemented
by the controller.
Applying error correction
information (not applicable for RB02, RL02, RX01, RX02, and TU58 drives).
For DSA and SCSI disks, error correction is implemented by the controller.
Offsetting read heads
to try to obtain a stronger recorded signal (not applicable for RB02,
RL02, RB80, RM80, RX01, RX02, and TU58 drives). For DSA and SCSI disks,
this function is implemented by the controller.
The error recovery algorithm uses a combination
of these four types of error recovery operations to complete an I/O
operation:
Power failure recovery
consists of waiting for mounted drives to spin up and come on line,
followed by reexecution of the I/O operation that was in progress
at the time of the power failure.
Device timeout is treated
as a nonfatal error. The operation that was in progress when the timeout
occurred is reexecuted up to eight times before a timeout error is
returned.
Operations that fail with nonfatal controller/drive
errors are reexecuted up to eight times before a fatal error is returned.
All normal error recovery
procedures (nonspecial conditions) can be inhibited by specifying the inhibit retry function modifier (IO$M_INHRETRY).
If any error occurs and this modifier is specified, the virtual, logical,
or physical I/O operation is immediately terminated, and a failure
status is returned. This modifier has no effect on power recovery
and timeout recovery.
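The retry rules above can be modeled in a few lines. The sketch below is a simplified simulation, not driver code: the names `run_with_recovery`, `scripted`, and the result strings are hypothetical, the `inhibit_retry` argument stands in for IO$M_INHRETRY, and a single counter is shared between timeouts and device errors even though the text gives each its own limit of eight reexecutions. Power failure recovery is not modeled.

```python
MAX_REEXECUTIONS = 8  # reexecution limit stated in the text


def run_with_recovery(operation, inhibit_retry=False):
    """Run operation() until it succeeds or the retry rules say stop.

    operation() returns "ok", "timeout", or "device_error".
    Returns (final_result, number_of_reexecutions).
    """
    reexecutions = 0
    while True:
        result = operation()
        if result == "ok":
            return "success", reexecutions
        # Inhibiting retries ends the request on the first device error,
        # but has no effect on timeout recovery.
        if result == "device_error" and inhibit_retry:
            return result, reexecutions
        if reexecutions == MAX_REEXECUTIONS:
            return result, reexecutions
        reexecutions += 1


def scripted(results):
    """Build an operation that returns the given results in order,
    repeating the last one once the list is exhausted."""
    it = iter(results)
    last = results[-1]

    def operation():
        nonlocal last
        last = next(it, last)
        return last

    return operation
```

For example, `run_with_recovery(scripted(["device_error", "ok"]))` succeeds after one reexecution, while the same call with `inhibit_retry=True` returns the device error immediately.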
2.1.4 SCSI Disk Class Driver
Although SCSI disks
do not conform to DSA, they do support the following error recovery
features:
Static and dynamic bad
block replacement (BBR)
Error correction code
(ECC)
Reexecution of read or
write operations within the SCSI drive
Reexecution of read or
write operations by the SCSI disk class driver
All SCSI disks supplied by HP implement the REASSIGN
BLOCKS command, which relocates data for a specific logical block
to a different physical location on the disk. The SCSI disk class
driver reassigns the block in the following instances: (1) when the
retry threshold is exceeded during an attempt to read or write a block
of data on the disk or (2) when an irrecoverable error occurs during
a write operation.
Unlike DSA, there is no forced error flag in SCSI.
To prevent undetected loss of user data, blocks that produce
irrecoverable errors during read operations are not reassigned. Instead,
the SCSI disk class driver returns the SS$_PARITY status whenever
a read operation results in an irrecoverable error.
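The reassignment policy described in this section reduces to a small decision table. The following sketch restates it as code; the function name `scsi_error_action`, its parameters, and the returned action strings are hypothetical, and the actual retry threshold used by the driver is not given in the text.

```python
def scsi_error_action(operation, irrecoverable, retries, retry_threshold):
    """Return the action taken for a block, per the policy described above.

    operation        -- "read" or "write"
    irrecoverable    -- True if the operation failed despite all recovery
    retries          -- number of retries the operation needed
    retry_threshold  -- hypothetical limit; the real value is not stated

    Returns "reassign", "return_parity_status", or "none".
    """
    if operation not in ("read", "write"):
        raise ValueError("operation must be 'read' or 'write'")
    if irrecoverable:
        if operation == "write":
            return "reassign"              # new data is still in memory
        return "return_parity_status"      # SS$_PARITY; block is not reassigned
    if retries > retry_threshold:
        return "reassign"                  # data recovered, but block is suspect
    return "none"
```

The asymmetry is deliberate: after a failed write the correct data is still available to place in a new location, whereas reassigning after a failed read would silently replace data that could not be recovered.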
2.1.5 Audio Extensions to the SCSI Disk Class Driver
The operating system provides audio functionality through the
SCSI disk class driver, which provides an interface
for issuing audio commands to SCSI devices. These commands
can be issued through the QIO function call. This functionality is
available for devices that have audio capability, such as CD-ROM drives.
The IO$_AUDIO function code allows the SCSI disk class driver
to process the SCSI audio commands. An Audio Control Block (AUCB)
must be defined for a specific SCSI audio command. This AUCB provides
the SCSI disk class driver with command-specific arguments and control
information. An application program must use the IO$_AUDIO function
code and provide the AUCB for the SCSI driver to process the audio
commands.