
Using Mount Verification for Recovery   



Mount verification is a recovery mechanism for disk and tape operations. If a device goes off line or is write-locked while mount verification is enabled, you can correct the problem and continue the operation.

Without mount verification, a write lock or offline error causes a volume to be dismounted immediately. All outstanding I/O to the volume is canceled, and all open files on the volume are closed. Any data not yet written to the volume is lost.

You can also use mount verification to switch paths on multipath Fibre Channel or SCSI disk or tape devices. See "How OpenVMS Performs Multipath Failover" in Guidelines for OpenVMS Cluster Configurations.

Understanding Mount Verification  

When the system or a user attempts to access a device after it has gone off line, mount verification is initiated. Usually a device goes off line as the result of a hardware or user error. Once a device is off line, the hardware (and for some disks, the software) marks the disk or tape as "invalid," and I/O requests for that device fail.

As long as mount verification is enabled, the following operations occur:

  1. The software marks the volume to indicate that it is undergoing mount verification.
  2. The software stalls all I/O operations to the disk or tape until the problem is corrected.
  3. The operator communication manager (OPCOM) issues a message to operators enabled for DISKS and DEVICES or TAPES and DEVICES. The message announces the unavailability of the disk or tape in the following format:
    %%%%%%%%%%% OPCOM, <dd-mmm-yyyy hh:mm:ss.cc> %%%%%%%%%%%
    Device <device-name> is offline.               
    Mount verification in progress.

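When a device goes off line or is write-locked, mount verification sends two messages: one through OPCOM, and a second, carrying the %SYSTEM-I-MOUNTVER prefix, that does not depend on OPCOM.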

The second message is a form of insurance in cases in which OPCOM is unavailable. For example, if the system disk undergoes mount verification or if OPCOM is not present on a system, you at least receive the messages with the %SYSTEM-I-MOUNTVER prefix. Under normal circumstances, the operator terminal receives both messages, with the %SYSTEM-I-MOUNTVER message arriving first.

These messages notify you of the problem, and allow you to correct the problem and recover the operation. When a pending mount verification is canceled by timing out, OPCOM prints a message in the following format:

    %%%%%%%%%%% OPCOM, <dd-mmm-yyyy hh:mm:ss.cc> %%%%%%%%%%%
    Mount verification aborted for device <device-name>.

After a mount verification times out, all pending and future I/O requests to the volume fail. You must dismount and remount the disk before users can access it again.

Note: Mount verification caused by a write-lock error does not time out.
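
The behavior described in this section can be pictured as a small state model: I/O is stalled while verification is pending, resumes when verification completes, and fails once verification is aborted. The following Python sketch is illustrative only. The class, its method names, and the three states are assumptions made for this example and do not correspond to OpenVMS data structures or routines.

```python
from dataclasses import dataclass, field
from enum import Enum


class VolumeState(Enum):
    MOUNTED = "mounted"
    VERIFYING = "mount verification in progress"
    ABORTED = "mount verification aborted"


@dataclass
class Volume:
    """Toy model of a mounted volume; not an OpenVMS data structure."""
    device: str
    state: VolumeState = VolumeState.MOUNTED
    stalled: list = field(default_factory=list)    # I/O held during verification
    completed: list = field(default_factory=list)  # I/O that reached the device
    failed: list = field(default_factory=list)     # I/O rejected after an abort

    def device_lost(self):
        """The device went off line: mark the volume and announce it."""
        if self.state is not VolumeState.MOUNTED:
            return None
        self.state = VolumeState.VERIFYING
        return (f"Device {self.device} is offline.\n"
                "Mount verification in progress.")

    def submit(self, request):
        """Route one I/O request according to the current state."""
        if self.state is VolumeState.MOUNTED:
            self.completed.append(request)
        elif self.state is VolumeState.VERIFYING:
            self.stalled.append(request)   # held until the problem is corrected
        else:
            self.failed.append(request)    # future I/O fails after an abort

    def verification_completed(self):
        """The problem was corrected: release the stalled I/O."""
        if self.state is not VolumeState.VERIFYING:
            return None
        self.state = VolumeState.MOUNTED
        self.completed.extend(self.stalled)
        self.stalled.clear()
        return f"Mount verification completed for device {self.device}."

    def verification_aborted(self):
        """Verification timed out or was canceled: pending I/O fails."""
        if self.state is not VolumeState.VERIFYING:
            return None
        self.state = VolumeState.ABORTED
        self.failed.extend(self.stalled)
        self.stalled.clear()
        return f"Mount verification aborted for device {self.device}."
```

In this model, a request submitted between device_lost() and verification_completed() is held and then completes, whereas a request submitted after verification_aborted() fails until the volume is dismounted and remounted.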

Mount Verification and Write-Locking

Suppose, for example, that a volume is mounted on a drive with write-lock off, and someone toggles the WRITE LOCK switch. If mount verification is enabled for the volume, the volume enters mount verification, and all I/O operations to the volume are suspended until you recover the operation, as explained in Recovering from Write-Lock Errors.

At mount time, if the system detects that the caches were not written back the last time the volume was used, the system automatically rebuilds the file information by scanning the contents of the volume. However, files being written at the time of the improper dismount might be partially or entirely lost. See Using the Analyze/Disk_Structure Utility to Check and Repair Disks for details about analyzing and repairing these problems.

With the mount verification feature of disk and tape handling, users are generally unaware that a mounted disk or tape has gone off line and returned on line, or has otherwise become unreachable and then been restored.

Using Mount Verification  

The following sections explain how to perform these tasks:

Task                                                    Section
Enable and disable mount verification                   Enabling Mount Verification
Control timeout periods for mount verification          Controlling Timeout Periods for Mount Verification
Recover from offline errors                             Recovering from Offline Errors
Recover from write-lock errors                          Recovering from Write-Lock Errors
Cancel mount verification using the DISMOUNT command    Canceling Mount Verification
Control the number of mount verification messages       Controlling Mount Verification Messages

Enabling Mount Verification  

Mount verification is enabled by default when you mount a disk or tape. To disable mount verification, you must specify /NOMOUNT_VERIFICATION when you mount a disk or tape.

Note that this feature applies to standard mounted tapes, foreign mounted tapes, and Files-11 disks.

Controlling Timeout Periods for Mount Verification  

You can control how long a pending mount verification is allowed to take before it is automatically canceled. The MVTIMEOUT system parameter sets this limit, in seconds, for disks; the TAPE_MVTIMEOUT system parameter sets it for tapes.

The default time limit for tapes is 600 seconds (10 minutes); for disks, it is 3600 seconds (1 hour). (Refer to the HP OpenVMS System Management Utilities Reference Manual for more information about system parameters.)

Always set either parameter to a reasonable value for the typical operations at your site. Note that resetting the value of the parameter does not affect a mount verification that is currently in progress.
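
As a worked illustration of the timeout rules, the following Python sketch computes the moment at which a pending mount verification would be canceled. It uses the default values quoted above and the rule that a write-lock verification does not time out. The function name, its arguments, and the "disk", "tape", "offline", and "write-lock" labels are assumptions made for this example; they are not OpenVMS interfaces.

```python
from datetime import datetime, timedelta
from typing import Optional

DEFAULT_MVTIMEOUT = 3600       # seconds, disks (default quoted in the text)
DEFAULT_TAPE_MVTIMEOUT = 600   # seconds, tapes (default quoted in the text)


def verification_deadline(started: datetime,
                          device_kind: str,
                          cause: str,
                          mvtimeout: int = DEFAULT_MVTIMEOUT,
                          tape_mvtimeout: int = DEFAULT_TAPE_MVTIMEOUT
                          ) -> Optional[datetime]:
    """Return when the verification would be canceled, or None if never.

    device_kind is "disk" or "tape"; cause is "offline" or "write-lock".
    These labels are hypothetical and exist only for this sketch.
    """
    if cause == "write-lock":
        return None                      # write-lock verification does not time out
    if device_kind == "disk":
        limit = mvtimeout
    elif device_kind == "tape":
        limit = tape_mvtimeout
    else:
        raise ValueError(f"unknown device kind: {device_kind!r}")
    return started + timedelta(seconds=limit)
```

With the defaults, a disk that goes off line at 11:54:54 would have until 12:54:54 to recover, and a tape until 12:04:54.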

Recovering from Offline Errors  

When a mounted disk or tape volume goes off line while mount verification is enabled, you can try to recover, or you can terminate the mount request. The following options are available:

If you successfully put the device back on line, the mount verification software that polls the disk or tape drive begins verification in the following sequence of steps:

  1. The system checks to see that the currently mounted disk or tape has the same identification as the previously mounted volume. In this way, mount verification confirms that this is the same disk or tape that was previously mounted and no switching has occurred.

    If the drive contains the wrong volume, OPCOM issues a message in this format:
    %%%%%%%%%%% OPCOM, <dd-mmm-yyyy hh:mm:ss.cc> %%%%%%%%%%%
    Device <device-name> contains the wrong volume.
    Mount verification in progress.
  2. Once mount verification completes, the disk is marked as valid, and OPCOM issues a message in the following format:
    %%%%%%%%%%% OPCOM, <dd-mmm-yyyy hh:mm:ss.cc> %%%%%%%%%%%
    Mount verification completed for device <device-name>.
  3. I/O operations to the disk or tape proceed, as shown in the following example:
    %%%%%%%%%%% OPCOM, 28-MAY-2000 11:54:54.12 %%%%%%%%%%%
    Device DUA0: is offline.
    Mount verification in progress.
     
    %%%%%%%%%%% OPCOM, 28-MAY-2000 11:57:34.22 %%%%%%%%%%%
    Mount verification completed for device DUA0:.

    In this example, the message from OPCOM informs the operator that device DUA0: went off line and mount verification was initiated. The operator finds that the drive was accidentally powered down and successfully powers it up again.

    The last message in the example indicates that mount verification is satisfied that the same volume is on the drive as before the error. All I/O operations to the volume resume.
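
If you collect operator messages in a text file, the formats shown in this section are regular enough to classify with a short script. The following Python sketch recognizes only the message texts quoted in this section; the function name and the event labels ("offline", "completed", and so on) are assumptions made for this example, and real operator logs may contain other messages or layouts.

```python
import re
from datetime import datetime

MONTHS = {name: number for number, name in enumerate(
    "JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC".split(), start=1)}

# Banner line, for example: %%%%%%%%%%% OPCOM, 28-MAY-2000 11:54:54.12 %%%%%%%%%%%
BANNER = re.compile(
    r"^%+\s*OPCOM,?\s+(\d{1,2})-([A-Z]{3})-(\d{4}) "
    r"(\d{2}):(\d{2}):(\d{2})\.(\d{2})\s*%+$")

# Message texts quoted in this section, paired with hypothetical labels.
EVENTS = [
    (re.compile(r"^Device (\S+) is offline\.$"), "offline"),
    (re.compile(r"^Device (\S+) has been write-locked\.$"), "write-locked"),
    (re.compile(r"^Device (\S+) contains the wrong volume\.$"), "wrong-volume"),
    (re.compile(r"^Mount verification completed for device (\S+)\.$"), "completed"),
    (re.compile(r"^Mount verification aborted for device (\S+)\.$"), "aborted"),
]


def parse_opcom(text):
    """Return a list of (timestamp, device, event) tuples found in text."""
    events = []
    stamp = None
    for raw in text.splitlines():
        line = raw.strip()
        banner = BANNER.match(line)
        if banner:
            day, mon, year, hh, mm, ss, cc = banner.groups()
            # The trailing field is hundredths of a second.
            stamp = datetime(int(year), MONTHS[mon], int(day),
                             int(hh), int(mm), int(ss), int(cc) * 10000)
            continue
        for pattern, kind in EVENTS:
            found = pattern.match(line)
            if found:
                events.append((stamp, found.group(1), kind))
                break
    return events
```

Applied to the DUA0: example above, the sketch yields an "offline" event followed by a "completed" event, and subtracting the two timestamps gives an outage of 2 minutes 40.10 seconds.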

Recovering from Write-Lock Errors  

Devices become write-locked when a hardware or user error occurs while a disk or a tape volume is mounted for a write operation. For example, if a disk is write-locked or a tape is missing a write ring, the hardware generates an error. As soon as the software discovers that the disk or tape is write-locked (for example, when an I/O operation fails with a write-lock error), mount verification begins.

OPCOM issues a message in the following format to the operators enabled for DISKS and DEVICES or TAPES and DEVICES, announcing the unavailability of the disk or tape:

    %%%%%%%%%%% OPCOM, <dd-mmm-yyyy hh:mm:ss.cc> %%%%%%%%%%%
    Device <device-name> has been write-locked.
    Mount verification in progress.

You can either recover the operation or terminate mount verification. Your options include the following ones:

Once the mount verification software determines that the volume is in a write-enabled state, I/O operations to the tape or disk resume with no further messages.

Canceling Mount Verification  

You can cancel a mount verification request by using the DISMOUNT command, by allowing the mount verification to time out, or by using IPC.

The following section describes the first method, using the DISMOUNT command, in more detail. See Canceling Mount Verification under Using Interrupt Priority Level C (IPC) for details about using the last method, IPC, to cancel mount verification.

Using the DISMOUNT Command

To dismount a volume:

  1. Log in at another terminal, or use any logged-in terminal that has access to the volume. (It does not need to be an operator terminal.)
  2. Enter the DISMOUNT/ABORT command for the volume. (To use the /ABORT qualifier with a volume that is not mounted group or system, you must have volume ownership or the user privilege VOLPRO.)

    If your system is in an OpenVMS Cluster environment, also specify the /CLUSTER qualifier.

    When you cancel a pending mount verification by dismounting the volume, OPCOM issues a message in the following format:
    %%%%%%%%%%% OPCOM, <dd-mmm-yyyy hh:mm:ss.cc> %%%%%%%%%%%
    Mount verification aborted for device <device-name>.

    If you do not have access to the volume, you receive an error message. You can try again if you can find an appropriate process to use. If your process hangs, the system file ACP is hung, and you cannot use this technique to cancel mount verification.
  3. When the cancellation is complete, remove the volume from the drive.

Controlling Mount Verification Messages  

In a Storage Area Network (SAN), mount verification takes place for a variety of reasons, including:

Mount verification now suppresses the messages that were previously displayed for mount verification events from which devices immediately recovered. These messages unduly alarmed some customers.

The number of messages logged to the operator's log is now controlled by two system parameters:

MVSUPMSG_NUM, which specifies a number of mount verification messages

MVSUPMSG_INTVL, which specifies a duration in seconds

If the number of mount verification messages that have been suppressed for a given device meets or exceeds the number specified by MVSUPMSG_NUM within the time specified by MVSUPMSG_INTVL, then an OPCOM message is displayed, as shown in the following examples:

    %SYSTEM-I-MOUNTVER, $1$DGA9999: 5  Mount verification messages have been suppressed in past 51 seconds.

    %%%%%%%%%%%  OPCOM  18-MAY-2003 13:50:09.72  %%%%%%%%%%%
    $1$DGA9999: 5 Mount verification messages have been suppressed in past 51 seconds.

    %SYSTEM-I-MOUNTVER, $1$DGA9999: 5  Mount verification messages have been suppressed in past 3 seconds.

    %%%%%%%%%%%  OPCOM  18-MAY-2003 13:50:13.17  %%%%%%%%%%%
    $1$DGA9999: 5 Mount verification messages have been suppressed in past 3 seconds.
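
One way to read the threshold rule is as a per-device counter over a time window. The following Python sketch models that reading: it counts suppressed messages for each device, reports once the count reaches the threshold within the interval, and then starts counting again. This is an interpretation of the description above, not the OpenVMS implementation. The class and method names are hypothetical, and the restart-after-report behavior is an assumption chosen to match the two sample messages.

```python
from collections import defaultdict


class SuppressionCounter:
    """Illustrative model only; the real logic lives inside OpenVMS."""

    def __init__(self, mvsupmsg_num, mvsupmsg_intvl):
        self.num = mvsupmsg_num        # threshold count (MVSUPMSG_NUM)
        self.intvl = mvsupmsg_intvl    # window in seconds (MVSUPMSG_INTVL)
        self._times = defaultdict(list)

    def suppress(self, device, now):
        """Record one suppressed message at time now (in seconds).

        Return a summary line if the threshold has been reached within
        the interval; otherwise return None.
        """
        times = self._times[device]
        times.append(now)
        # Forget suppressed messages that fall outside the interval.
        times[:] = [t for t in times if now - t <= self.intvl]
        if len(times) >= self.num:
            count = len(times)
            elapsed = int(now - times[0])
            times.clear()              # assumption: counting restarts after a report
            return (f"{device} {count} Mount verification messages "
                    f"have been suppressed in past {elapsed} seconds.")
        return None
```

With a threshold of 5 (an arbitrary value chosen for this illustration), five suppressed messages for $1$DGA9999: spread over 51 seconds produce the first summary shown above, and five more arriving within 3 seconds produce the second.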

Customers who prefer the prior behavior, or who want to increase or decrease the number of messages that are logged, can adjust these system parameter settings.

For more information about these new system parameters, refer to the HP OpenVMS System Management Utilities Reference Manual.

