HP OpenVMS Systems Documentation

HP Volume Shadowing for OpenVMS: OpenVMS Version 8.4 > Chapter 6 Ensuring Shadow Set Consistency

Merge Operations

The purpose of either a full merge or a minimerge operation is to compare data on shadow set members and to ensure that all of them contain identical data on every logical block (each block is identified by its logical block number [LBN]). A full merge or minimerge operation is initiated if either of the following events occurs:

  • A system failure results in the possibility of incomplete writes.

    For example, if a write request is made to a shadow set but the system fails before a completion status is returned from all the shadow set members, it is possible that:

    • All members might contain the new data.

    • All members might contain the old data.

    • Some members might contain new data and others might contain old data.

    The exact timing of the failure during the original write request defines which of these three scenarios results. When the system recovers, Volume Shadowing for OpenVMS ensures that corresponding LBNs on each shadow set member contain the same data (old or new). It is the responsibility of the application to determine if the data is consistent from its point of view. The volume might contain the data from the last write request or it might not, depending on when the failure occurred. The application should be designed to function properly in both cases.

  • A shadow set enters mount verification with outstanding write I/O in the driver’s internal queue, and the problem is not corrected before mount verification times out. The systems on which the timeout occurred then require the other systems that have the shadow set mounted to put the shadow set into a merge transient state.

    For example, if the shadow set were mounted on eight systems and mount verification timed out on two of them, those two systems would each check their internal queue for write I/O. If either system found outstanding write I/O, the shadow set would enter a merge transient state.
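
The three outcomes listed above for an interrupted write can be illustrated with a small Python sketch. It assumes, purely for illustration, that the member writes complete one after another and that the system fails after some number of them have finished. The function name and its parameters are hypothetical and are not part of Volume Shadowing for OpenVMS.

```python
from typing import List


def interrupted_write(members: List[str], new_data: str, completed: int) -> str:
    """Apply new_data to the first `completed` members, then "fail".

    Returns which of the three scenarios resulted. Modeling the failure
    as a count of completed member writes is an assumption of this sketch.
    """
    total = len(members)
    for i in range(min(completed, total)):
        members[i] = new_data  # this member's write finished before the failure
    if completed <= 0:
        return "all old"
    if completed >= total:
        return "all new"
    return "mixed"


if __name__ == "__main__":
    for failed_after in range(4):
        members = ["old", "old", "old"]
        outcome = interrupted_write(members, "new", failed_after)
        print(failed_after, outcome, members)
```

Only the "mixed" outcome leaves the members inconsistent, but because the surviving systems cannot tell which outcome occurred, a merge is required in every case.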

The merge operation is managed by one of the OpenVMS systems that has the shadow set mounted. The members of a shadow set are physically compared to each other to ensure that they contain the same data. This is done by performing a block-by-block comparison of the entire volume. As the merge proceeds, any blocks that are different are made the same (either both old or both new) by means of a copy operation. Because the shadowing software does not know which member contains newer data, any full member can be the source member of the merge operation.
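
The block-by-block comparison described above can be sketched in Python as follows. This is a simplified model, not the shadowing driver's implementation: members are in-memory byte arrays of equal size, the 512-byte block size is an assumption, and `full_merge` and its parameters are hypothetical names.

```python
from typing import List


def full_merge(members: List[bytearray], block_size: int = 512, source: int = 0) -> int:
    """Compare every block on every member against the source member.

    Any block that differs on at least one member is overwritten on all
    members with the source member's copy. Returns the number of blocks
    that had to be repaired.
    """
    src = members[source]
    total_blocks = len(src) // block_size
    repaired = 0
    for lbn in range(total_blocks):
        start, end = lbn * block_size, (lbn + 1) * block_size
        src_block = bytes(src[start:end])
        if any(bytes(m[start:end]) != src_block for m in members):
            for m in members:
                m[start:end] = src_block  # copy operation makes the block identical
            repaired += 1
    return repaired


if __name__ == "__main__":
    a = bytearray(b"A" * 2048)       # four 512-byte blocks
    b = bytearray(a)
    b[1024:1536] = b"B" * 512        # block 2 differs on the second member
    print(full_merge([a, b]), a == b)
```

Every block is read on every member even when nothing differs, which is why the cost of a full merge scales with the volume size rather than with the amount of outstanding write activity.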

A full merge operation can be a very lengthy procedure. During the operation, application I/O continues but at a slower rate.

A minimerge operation can be significantly faster. By using information about write operations that were logged in volatile controller storage, the minimerge is able to merge only those areas of the shadow set where write activity was known to have occurred. This avoids the need for the entire volume scan that is required by full merge operations, thus reducing consumption of system I/O resources.

The shadowing software always selects one member as a logical master for any merge operation, across the OpenVMS Cluster. Any difference in data is resolved by a propagation of the information from the merge master to all the other members.

The system responsible for the merge operation on a given shadow set updates the merge fence for that shadow set after each range of LBNs is reconciled. This fence “proceeds” across the disk and separates the merged and unmerged portions of the shadow set.

Application read I/O requests to the merged side of the fence can be satisfied by any source member of the shadow set. Application read I/O requests to the unmerged side of the fence are also satisfied by any source member of the shadow set; however, any potential data differences (discovered by doing a data compare operation) are corrected on all members of the shadow set before returning the data to the user or application that requested it.

This method of dynamic correction of data inconsistencies during read requests allows a shadow set member to fail at any point during the merge operation without impacting data availability.
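
The interaction between the merge fence and application reads can be modeled with a short Python sketch. The class and method names are hypothetical, each member is a list with one entry per LBN, and differences are resolved in favor of a designated master member, which is an assumption of the model.

```python
from typing import List


class MergingShadowSet:
    """Toy model of a shadow set with a merge in progress."""

    def __init__(self, members: List[List[bytes]], master: int = 0):
        self.members = members
        self.master = master
        self.fence = 0  # LBNs below the fence have already been merged

    def _reconcile(self, lbn: int) -> bytes:
        """Propagate the master's copy of one block to every member."""
        data = self.members[self.master][lbn]
        for m in self.members:
            m[lbn] = data
        return data

    def read(self, lbn: int, member: int = 0) -> bytes:
        """Satisfy an application read from the given member."""
        if lbn < self.fence:
            return self.members[member][lbn]  # merged side: any member will do
        # Unmerged side: compare the block and correct it before returning data.
        first = self.members[0][lbn]
        if any(m[lbn] != first for m in self.members):
            return self._reconcile(lbn)
        return self.members[member][lbn]

    def advance_fence(self, count: int) -> None:
        """Reconcile the next `count` LBNs, then move the fence past them."""
        end = min(self.fence + count, len(self.members[0]))
        for lbn in range(self.fence, end):
            self._reconcile(lbn)
        self.fence = end


if __name__ == "__main__":
    s = MergingShadowSet([[b"a", b"b", b"c"], [b"a", b"X", b"Y"]])
    print(s.read(2, member=1), s.members[1][2])  # block 2 is corrected on read
    s.advance_fence(2)
    print(s.fence, s.members[1][1])
```

Because a read on the unmerged side repairs the block before returning it, the application never observes two different answers for the same LBN, regardless of which member services the read.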

Volume Shadowing for OpenVMS supports both assisted and unassisted merge operations in the same cluster. Whenever you create a shadow set, add members to an existing shadow set, or boot a system, the shadowing software reevaluates each device in the changed configuration to determine whether it is capable of supporting the merge assist.

Unassisted Merge Operations

For systems running software earlier than OpenVMS Version 5.5–2, the merge operation is performed by the system and is known as an unassisted merge operation.

To ensure minimal impact on user I/O requests, volume shadowing implements a mechanism that causes the merge operation to give priority to user and application I/O requests.

The shadow server process performs merge operations as a background process, ensuring that when failures occur, they minimally impact user I/O. A side effect of this is that unassisted merge operations can often take an extended period of time to complete, depending on user I/O rates. Also, if another node fails before a merge completes, the current merge is abandoned and a new one is initiated from the beginning.

Note that data availability and integrity are fully preserved during merge operations regardless of their duration. All shadow set members contain equally valid data.

Assisted Merge Operations (Alpha)

Starting with OpenVMS Version 5.5–2, the merge operation includes enhancements for shadow set members that are configured on controllers that implement assisted merge capabilities. The assisted merge operation is also referred to as a minimerge. The minimerge feature significantly reduces the amount of time needed to perform merge operations. Usually, the minimerge completes in a few minutes. HSC and HSJ controllers support minimerge. Host-based minimerge is supported on OpenVMS Alpha Version 7.3-2 and on OpenVMS Version 8.2 for OpenVMS Integrity servers and for OpenVMS Alpha. For more information, see Chapter 8.

By using information about write operations that were logged in controller memory, the minimerge is able to merge only those areas of the shadow set where write activity was known to have been in progress. This avoids the need for the total read and compare scans required by unassisted merge operations, thus reducing consumption of system I/O resources.

Controller-based write logs contain information about exactly which LBNs in the shadow set had write I/O requests outstanding (from a failed node). The node that performs the assisted merge operation uses the write logs to merge those LBNs that may be inconsistent across the shadow set. No controller-based write logs are maintained for a one-member shadow set. No controller-based write logs are maintained if only one OpenVMS system has the shadow set mounted.
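
The difference between a minimerge and a full merge can be sketched in Python as follows. The write log is modeled as a plain collection of LBNs, which is an assumption; the real controller log format is not described here, and `minimerge` is a hypothetical name.

```python
from typing import Iterable, List


def minimerge(members: List[List[bytes]], write_log: Iterable[int], source: int = 0) -> List[int]:
    """Reconcile only the LBNs recorded in the write log.

    Each member is a list with one entry per LBN. Returns the LBNs that
    actually differed and were overwritten with the source member's data.
    """
    src = members[source]
    repaired = []
    for lbn in sorted(set(write_log)):  # visit each logged LBN once
        if any(m[lbn] != src[lbn] for m in members):
            for m in members:
                m[lbn] = src[lbn]
            repaired.append(lbn)
    return repaired


if __name__ == "__main__":
    a = [b"old"] * 1000
    b = list(a)
    a[5] = b"new"  # the write reached one member before the node failed
    print(minimerge([a, b], write_log=[5, 7]), a == b)
```

In this example only two of the 1000 blocks are examined, whereas a full merge would read and compare all of them.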

NOTE: The shadowing software does not automatically enable a minimerge on a system disk because of the requirement to consolidate crash dump files on a nonsystem disk.

Dump off system disk (DOSD) is supported on OpenVMS Integrity servers starting with OpenVMS Version 8.2 and on OpenVMS Alpha starting with OpenVMS Alpha Version 7.1. If DOSD is enabled, the system disk can be minimerged.

The minimerge operation is enabled on nodes running OpenVMS Version 5.5–2 or later. Volume shadowing automatically enables the minimerge if the controllers involved in accessing the physical members of the shadow set support it. See the HP Volume Shadowing for OpenVMS Software Product Description (SPD 27.29.xx) for a list of supported controllers. Note that minimerge operations are possible even when shadow set members are connected to different controllers. This is because write log entries are maintained on a per-controller basis for each shadow set member.

Volume Shadowing for OpenVMS automatically disables minimerges if:

  • The shadow set is mounted on a cluster node that is running an OpenVMS release earlier than Version 5.5–2.

  • A shadow set member is mounted on a controller running a version of firmware that does not support minimerge.

  • A shadow set member is mounted on a controller that has performance assists disabled.

  • Any node in the cluster that has the shadow set mounted is running a version of Volume Shadowing that has minimerge disabled.

  • The shadow set is mounted on a standalone system. (Minimerge operations are not enabled on standalone systems.)

  • The shadow set is mounted on only one node in the OpenVMS Cluster.
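
The conditions in the list above can be summarized as a single check, sketched here in Python. This is an illustration only: the `Node` and `Member` classes, their fields, and the tuple encoding of version numbers (for example, Version 5.5–2 as `(5, 5, 2)`) are all hypothetical and do not correspond to any OpenVMS interface.

```python
from dataclasses import dataclass
from typing import List, Tuple

MIN_VERSION = (5, 5, 2)  # Version 5.5-2, encoded as a tuple for comparison


@dataclass
class Node:
    name: str
    version: Tuple[int, int, int]
    minimerge_enabled: bool = True


@dataclass
class Member:
    device: str
    firmware_supports_minimerge: bool = True
    performance_assists_enabled: bool = True


def minimerge_disabled_reasons(nodes: List[Node], members: List[Member],
                               standalone: bool = False) -> List[str]:
    """Return every reason minimerge would be disabled (empty list if none).

    `nodes` are the systems that have the shadow set mounted.
    """
    reasons = []
    if standalone:
        reasons.append("shadow set is mounted on a standalone system")
    elif len(nodes) < 2:
        reasons.append("shadow set is mounted on only one cluster node")
    for n in nodes:
        if n.version < MIN_VERSION:
            reasons.append(f"{n.name}: OpenVMS release earlier than Version 5.5-2")
        if not n.minimerge_enabled:
            reasons.append(f"{n.name}: minimerge disabled in Volume Shadowing")
    for m in members:
        if not m.firmware_supports_minimerge:
            reasons.append(f"{m.device}: controller firmware lacks minimerge support")
        if not m.performance_assists_enabled:
            reasons.append(f"{m.device}: controller performance assists disabled")
    return reasons


if __name__ == "__main__":
    nodes = [Node("NODE_A", (8, 4, 0)), Node("NODE_B", (5, 5, 1))]
    members = [Member("DISK_1"), Member("DISK_2", performance_assists_enabled=False)]
    for reason in minimerge_disabled_reasons(nodes, members):
        print(reason)
```

Any single reason is enough to disable minimerge, so the shadow set is eligible only when the returned list is empty.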

The following transient conditions can also cause a minimerge operation to be disabled:

  • An unassisted merge operation is already in progress when a node fails.

    In this situation, the shadowing software cannot interrupt the unassisted merge operation with a minimerge.

  • Not enough write log entries are available in the controllers.

    The number of write log entries available is determined by controller capacity. The shadowing software dynamically determines when there are enough entries to maintain write I/O information successfully. If the number of available write log entries is too low, shadowing temporarily disables logging for that shadow set, and it returns existing available entries on this and every node in the cluster. After some time has passed, shadowing attempts to re-enable write logging on this shadow set.

    A controller retains a write log entry for each write I/O request until that entry is deleted by shadowing, or the controller is restarted.

    A multiple-unit controller shares its write log entries among multiple disks. This pool of write log entries is managed by the shadowing software. If a controller runs out of write log entries, shadowing disables minimerges and performs an unassisted merge operation, should a node leave the cluster without first dismounting the shadow set. Note that write log exhaustion does not typically occur with disks on which the write logs are not shared.

  • The controller write logs become inaccessible for one of the following reasons, in which case a minimerge operation is not possible:

    • Controller failure causes write logs to be lost or deleted.

    • A device that is dual ported to multiple controllers fails over to its secondary controller. (If the secondary controller is capable of maintaining write logs, the minimerge operations are reestablished quickly.)