HP OpenVMS Systems Documentation

Guidelines for OpenVMS Cluster Configurations


6.7.6 Path Polling

When SCSI multipath support is in effect, the system periodically polls every I/O path from each host adapter to each HSZ or HSG controller or MDR to determine the status of that path. If the system detects a change in the status of a path, it writes a message similar to one of the following to the console and to the operator's log:


All multipath devices on path PKB0.5 are either disabled or not reachable.

or


At least one multipath device on path PKB0.5 is enabled and reachable.

If all the devices on a path are removed, a path failure is reported. The path from the host to the HSx controller may still function, but this cannot be determined when there are no devices to poll.

You can turn polling on or off with the following command:


SET DEVICE device/[NO]POLL/PATH=path-identifier

Turning off polling for a path that will be out of service for a prolonged period is useful because it can reduce system overhead.

6.7.7 Switching Current Paths Manually

You can switch a device's current path manually using the SET DEVICE command with the /SWITCH qualifier. The most common reason for doing this is to balance the aggregate I/O load across multiple HSx controller modules, MDRs, and buses.

The command syntax for switching the current path is:


SET DEVICE device-name/SWITCH/PATH=path-identifier

This command requires the OPER privilege. Additionally, if the device is currently allocated by another process, as tape devices often are, the SHARE privilege is needed.

The following command switches the path of device $2$DKA502 to an MSCP served path.


$ SET DEVICE $2$DKA502/SWITCH/PATH=MSCP

Note that this command initiates the process of switching the path and then returns to the DCL prompt immediately. A delay may occur between when the DCL prompt reappears and when the path switch is complete.

A manual path switch of a mounted device takes place within mount verification, which is triggered by the path switch command. It is accompanied by the usual mount verification messages and a path switch message, as shown in Example 6-1.

Example 6-1 Messages Resulting from Manual Path Switch

 %%%%%%%%%%%  OPCOM  15-JUN-2001 09:04:23.05  %%%%%%%%%%%
 Device $1$DGA23: (H2OFRD PGA) is offline.
 Mount verification is in progress.

 %%%%%%%%%%%  OPCOM  15-JUN-2001 09:04:25.76  %%%%%%%%%%%
 09:04:25.76 Multipath access to device $1$DGA23: has been manually switched
 from path PGA0.5000-1FE1-0000-0D11 to path PGA0.5000-1FE1-0000-0D14

 %%%%%%%%%%%  OPCOM  15-JUN-2001 09:04:25.79  %%%%%%%%%%%
 Mount verification has completed for device $1$DGA23: (H2OFRD PGA)

You can check for completion of a path switch by issuing the SHOW DEVICE/FULL command, or the SHOW DEVICE/MULTIPATH command.

Note that if the path that is designated in a manual path switch fails during the switch operation, then automatic path switching takes over. This can result in a switch to a path different from the one designated in the command.

If a manual path switch causes a logical unit to switch from one HSG80 controller to another controller, then the command can affect other nodes in the cluster. These nodes will experience a mount verification on their current path, causing an automatic switch to a path on the other HSG80 controller. Example 6-2 shows the messages that indicate this event.

Example 6-2 Messages Displayed When Other Nodes Detect a Path Switch

 %%%%%%%%%%%  OPCOM  15-JUN-2001 09:04:26.48  %%%%%%%%%%%
 Device $1$DGA23: (WILD8 PGA, H20FRD) is offline.
 Mount verification is in progress.

 %%%%%%%%%%%  OPCOM  15-JUN-2001 09:04:26.91  %%%%%%%%%%%
 09:04:26.91 Multipath access to device $1$DGA23: has been auto switched from
 path PGA0.5000-1FE1-0000-0D12 (WILD8) to path PGA0.5000-1FE1-0000-0D13 (WILD8)

 %%%%%%%%%%%  OPCOM  15-JUN-2001 09:04:27.12  %%%%%%%%%%%
 Mount verification has completed for device $1$DGA23: (WILD8 PGA, H20FRD)

The WILD8 node name is displayed for each path because each path is a direct path on node WILD8. The node name field in the mount verification in progress and completed messages shows both a local path and an MSCP alternative. In this example, the WILD8 PGA, H20FRD name shows that a local PGA path on WILD8 is being used and that an MSCP path via node H20FRD is an alternative.
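When you scan an operator log for path switches, the message layout shown in Examples 6-1 and 6-2 can be matched with a regular expression. The following Python sketch is illustrative only: it assumes the message wording is exactly as shown in those two examples, and the function name `parse_path_switch` is hypothetical, not part of OpenVMS.

```python
import re

# Pattern inferred from Examples 6-1 and 6-2 only (an assumption); other
# OpenVMS versions may word or wrap the message differently.
SWITCH = re.compile(
    r"Multipath access to device (?P<device>\S+?): has been "
    r"(?P<kind>manually|auto) switched\s+from\s+path (?P<old>\S+)"
    r"(?: \((?P<old_node>\w+)\))?\s+to\s+path (?P<new>\S+)"
    r"(?: \((?P<new_node>\w+)\))?"
)

def parse_path_switch(message):
    """Return a dict describing a path switch message, or None if no match."""
    match = SWITCH.search(message)
    return match.groupdict() if match else None

example = (
    " 09:04:25.76 Multipath access to device $1$DGA23: has been manually switched\n"
    " from path PGA0.5000-1FE1-0000-0D11 to path PGA0.5000-1FE1-0000-0D14\n"
)
info = parse_path_switch(example)
print(info["kind"], info["old"], "->", info["new"])
```

The optional node name in parentheses, as in Example 6-2, is captured in `old_node` and `new_node` when present.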

6.7.8 Path Selection by OpenVMS

The selection of the current path to a multipath device is determined by the device type as well as by the event that triggered path selection.

Path Selection for Initial Configuration at System Startup

When a new path to a multipath disk (DG, DK) or tape device (MG) is configured, the path chosen automatically as the current path is the direct path with the fewest devices. No operator messages are displayed when this occurs. (This type of path selection was introduced in OpenVMS Alpha Version 7.3-1.) A DG, DK, or MG device remains eligible for this type of path selection until the device's first use after a system boot or until a manual path switch is performed by means of the SET DEVICE/SWITCH command.

When a new path to a generic multipath SCSI device (GG, GK) is configured, the path chosen automatically as the current path is the first path discovered, which is also known as the primary path. For GG and GK devices, the primary path remains the current path even as new paths are configured. GG and GK devices are typically the console LUNs for HSG or HSV controller LUNs or for tape media robots.
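The two startup rules above can be modeled in a few lines. This Python sketch is a simplified illustration, not OpenVMS code: the `Path` record and its fields are hypothetical, "fewest devices" is modeled as the number of devices already using a path as their current path, ties go to the earlier-discovered path, and the case of no direct path (which the text does not cover) returns `None`.

```python
from dataclasses import dataclass

@dataclass
class Path:
    name: str          # illustrative label, for example "PKB0.5"
    is_direct: bool    # False for an MSCP served path
    current_for: int   # devices already using this path as current (assumed metric)

def initial_current_path(device_prefix, paths):
    """Pick the initial current path; `paths` is in discovery order."""
    if not paths:
        return None
    if device_prefix in ("GG", "GK"):
        # Generic SCSI devices keep the first path discovered (the primary path).
        return paths[0]
    direct = [p for p in paths if p.is_direct]
    if not direct:
        return None  # not covered by the text; an assumption of this sketch
    # DG, DK, and MG devices: direct path with the fewest devices.
    return min(direct, key=lambda p: p.current_for)

paths = [Path("PKB0.5", True, 3), Path("PKC0.5", True, 1), Path("MSCP", False, 0)]
print(initial_current_path("DK", paths).name)
print(initial_current_path("GK", paths).name)
```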

Path Selection When Mounting a Disk Device

The current path to a multipath disk device can change as a result of a MOUNT command. The I/O performed by the MOUNT command triggers a search for a direct path that does not require the disk device to fail over from one HSx controller to another.

The path selection initiated by a MOUNT command on a DG or DK disk device proceeds as follows:

  1. If the current path is a direct path and access to the device on this path does not require a controller failover, the current path is used.
  2. If the current path is an MSCP path and it was selected by a manual path switch command, the current path is used.
  3. All direct paths are checked, starting with the path that is the current path for the fewest devices. The direct paths are considered in order of increasing use as a current path for other devices. If a path is found on which access to the device does not require a controller failover, that path is selected. If the selected path is not the current path, an automatic path switch is performed and an OPCOM message is issued.
  4. All direct paths are tried, starting with the path that is the current path for the fewest devices. The direct paths are considered in order of increasing use as a current path for other devices. If necessary, an attempt is made to fail over the device to the HSx controller on the selected path. If the selected path is not the current path, an automatic path switch is performed and an OPCOM message is issued.
  5. The MSCP served path is tried. If the MSCP path is not the current path, an automatic path switch is performed and an OPCOM message is issued.

The MOUNT utility might trigger this path selection algorithm a number of times until a working path is found. The exact number of retries depends on both the time elapsed for the prior attempts and the qualifiers specified with the MOUNT command.
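The five steps above can be sketched as a single selection function. This is a simplified Python model under stated assumptions, not the OpenVMS implementation: the `Path` fields `load`, `needs_failover`, and `usable` are hypothetical stand-ins for state the system discovers by issuing I/O, and the controller failover in step 4 is assumed to succeed whenever the path is usable.

```python
from dataclasses import dataclass

@dataclass
class Path:
    name: str
    is_direct: bool               # False for the MSCP served path
    load: int                     # devices for which this path is the current path
    needs_failover: bool = False  # access here would require a controller failover
    usable: bool = True           # hypothetical flag: the path responds at all

def select_path_for_disk_mount(paths, current, manually_switched=False):
    """Return the path MOUNT would settle on, or None if no path works."""
    # Step 1: keep a direct current path that needs no controller failover.
    if current.is_direct and current.usable and not current.needs_failover:
        return current
    # Step 2: keep an MSCP current path that was chosen by a manual switch.
    if not current.is_direct and manually_switched and current.usable:
        return current
    # Direct paths in order of increasing use as a current path.
    direct = sorted((p for p in paths if p.is_direct), key=lambda p: p.load)
    # Step 3: first direct path that needs no controller failover.
    for p in direct:
        if p.usable and not p.needs_failover:
            return p
    # Step 4: first direct path that works, failing the device over if needed.
    for p in direct:
        if p.usable:
            return p
    # Step 5: fall back to the MSCP served path.
    for p in paths:
        if not p.is_direct and p.usable:
            return p
    return None
```

If the returned path differs from `current`, the real system performs an automatic path switch and issues an OPCOM message; a caller of this sketch can detect that case by comparing the result with `current`.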

This path selection process, introduced in OpenVMS Alpha Version 7.3-1, has the following benefits:

  • Minimizes the disruption on other hosts in the cluster.
  • Tends to preserve any static load balancing that has been manually set up on other nodes in the cluster.
  • Enables the use of HSx console commands to set up an initial default distribution of devices between the two HSx controllers.
  • Tends to balance the use of available paths from this host to the disk devices.
  • Prefers direct paths over MSCP served paths.

Note that this selection process allows devices to be distributed between the two HSx controllers. You can accomplish this by using HSx console commands, such as the following:


HSG> SET UNIT PREFERRED_PATH=THIS_CONTROLLER

HSG> SET UNIT PREFERRED_PATH=OTHER_CONTROLLER

In addition, you can use the DCL commands for manual path switching, described in Section 6.7.7, to select a different host bus adapter or a different port on the same HSx controller, or to force the device to fail over to a different HSx controller.

Path Selection When Mounting a Tape Drive Device

Support for multipath tape drives and this type of path selection was introduced in OpenVMS Alpha Version 7.3-1. Path selection when the MOUNT command is issued differs somewhat between multipath tape drives and disk devices for several reasons:

  • Tape drives are not concurrently shareable by multiple hosts.
  • Tape drives do not present the same concerns as disks do for disrupting I/O being performed by another host.
  • There is no failover between direct and MSCP served paths to multipath tape devices.

The path selection initiated by a MOUNT command on an MG tape drive device proceeds as follows:

  1. The current path is used if possible, even if a controller failover is required.
  2. The direct paths are checked, starting with the path that is the current path for the fewest devices. The direct paths are considered in order of increasing use as a current path for other devices. If a path is found on which access to the device does not require a controller failover, that path is selected. If the selected path is not the current path, an automatic path switch is performed and an OPCOM message is issued.
  3. The direct paths are checked again, starting with the path that is the current path for the fewest devices. The direct paths are considered in order of increasing use as a current path for other devices. If the selected path is usable and is not the current path, an automatic path switch is performed and an OPCOM message is issued.
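The three steps above differ from the disk case mainly in that the current path is kept whenever it works and no MSCP served path is considered. A simplified Python model follows; the `Path` fields `load`, `needs_failover`, and `usable` are hypothetical stand-ins for state the system discovers by issuing I/O, and the function is a sketch rather than the OpenVMS implementation.

```python
from dataclasses import dataclass

@dataclass
class Path:
    name: str
    is_direct: bool               # tape selection considers direct paths only
    load: int                     # devices for which this path is the current path
    needs_failover: bool = False  # access here would require a controller failover
    usable: bool = True           # hypothetical flag: the path responds at all

def select_path_for_tape_mount(paths, current):
    """Return the path MOUNT would settle on for an MG device, or None."""
    # Step 1: use the current path if possible, even if a failover is required.
    if current.usable:
        return current
    # Direct paths in order of increasing use as a current path.
    direct = sorted((p for p in paths if p.is_direct), key=lambda p: p.load)
    # Step 2: first direct path that needs no controller failover.
    for p in direct:
        if p.usable and not p.needs_failover:
            return p
    # Step 3: first direct path that is usable at all.
    for p in direct:
        if p.usable:
            return p
    return None
```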

6.7.9 How OpenVMS Performs Multipath Failover

When an I/O operation fails on a device that is subject to mount verification, and the failure status suggests that retries are warranted, mount verification is invoked. If the device is a multipath device or a shadow set that includes multipath devices, alternate paths to the device are automatically tried during mount verification. This allows the system to recover transparently from cases in which the device has failed over from one HSx controller or MDR to another, and to recover transparently from failures in the path to the device.

The following devices are subject to mount verification:

  • Disk devices that are mounted as Files-11 volumes, including host-based volume shadowing sets
  • Tape devices that are mounted as ANSI tape volumes
  • Tape devices that are mounted as foreign volumes

Note that foreign mounted disk volumes and generic SCSI devices (GG and GK) are not subject to mount verification and, therefore, are not eligible for automatic multipath failover.

Path selection during mount verification proceeds as follows:

  1. If the current path is a direct path and access to the device on this path does not require controller failover, the current path is tried.
  2. If the current path is an MSCP path and it was selected by a manual path switch command, the current path is tried (disks only).
  3. All direct paths are checked, starting with the path that is the current path for the fewest devices. The direct paths are considered in order of increasing use as a current path for other devices. If a path is found on which access to the device does not require a controller failover, that path is selected.
  4. Step 3 is repeated the number of times specified by the MPDEV_LCRETRIES system parameter. This provides additional bias toward selection of a path that does not require an HSx or MDR controller failover. The default value for MPDEV_LCRETRIES is 1.
  5. All direct paths are tried, starting with the path that is the current path for the fewest devices. The direct paths are considered in order of increasing use as a current path for other devices. If necessary, an attempt is made to fail over the device to the HSx or MDR controller on the selected path.
  6. If present, the MSCP served path is tried (disks only).

Steps 1 through 6 are repeated until either a working path is found or mount verification times out. If a working path is found and the path is different, the current path is automatically switched to the new path and an OPCOM message is issued. Mount verification then completes, the failed I/O is restarted, and new I/O is allowed to proceed. This path selection procedure attempts to avoid unnecessary failover of devices from one HSx controller to another because:

  • Failover from one HSx controller module to another causes a delay of approximately 1 to 15 seconds, depending on the amount of cached data that needs to be synchronized.
  • Other nodes that share access to the device must reestablish communication using an alternate path.

This path selection procedure prefers direct paths over MSCP served paths because the use of an MSCP served path imposes additional CPU and I/O overhead on the server system. In contrast, the use of a direct path imposes no additional CPU or I/O overhead on the MSCP-server system. This procedure selects an MSCP served path only if none of the available direct paths are working. Furthermore, this path selection procedure tends to balance the use of available direct paths, subject to the constraint of avoiding unnecessary controller failovers.
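The six steps and the outer retry loop can be modeled as follows. This Python sketch rests on several assumptions that the text does not settle: `probe` is a hypothetical callback that reports "ok", "needs_failover", or "down" for a path; step 4 is read as running step 3 once plus MPDEV_LCRETRIES repeats; the mount verification timeout is modeled as a fixed number of passes (`max_passes`); and a controller failover in step 5 is assumed to succeed.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Path:
    name: str
    is_direct: bool   # False for the MSCP served path
    load: int         # devices for which this path is the current path

def mount_verification_pass(paths, current, probe, manually_switched=False,
                            is_disk=True, lcretries=1):
    """One pass through steps 1 to 6; returns a working path or None."""
    # Step 1: direct current path that needs no controller failover.
    if current.is_direct and probe(current) == "ok":
        return current
    # Step 2 (disks only): manually selected MSCP current path.
    if is_disk and not current.is_direct and manually_switched \
            and probe(current) == "ok":
        return current
    direct = sorted((p for p in paths if p.is_direct), key=lambda p: p.load)
    # Steps 3 and 4: look for a direct path needing no failover, with retries.
    for _ in range(1 + lcretries):
        for p in direct:
            if probe(p) == "ok":
                return p
    # Step 5: accept a direct path even if the device must fail over to it.
    for p in direct:
        if probe(p) in ("ok", "needs_failover"):
            return p
    # Step 6 (disks only): the MSCP served path, if present.
    if is_disk:
        for p in paths:
            if not p.is_direct and probe(p) == "ok":
                return p
    return None

def mount_verification(paths, current, probe, max_passes, **options):
    """Repeat the pass until a path works or the (modeled) timeout expires."""
    for _ in range(max_passes):
        chosen = mount_verification_pass(paths, current, probe, **options)
        if chosen is not None:
            return chosen
    return None
```

The ordering makes the stated preferences visible: a failover-free direct path is tried first, a direct path that requires failover second, and the MSCP served path last.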

6.7.10 Automatic Failback to a Direct Path (Disks Only)

Multipath failover, as described in Section 6.7.9, applies to MSCP served paths as well. That is, if the current path is via an MSCP served path and the served path fails, mount verification can trigger an automatic failover back to a working direct path.

However, an I/O error on the MSCP path is required to trigger the failback to a direct path. Consider the following sequence of events:

  • A direct path is being used to a device.
  • All direct paths fail and the next I/O to the device fails.
  • Mount verification provokes an automatic path switch to the MSCP served path.
  • Sometime later, a direct path to the device is restored.

In this case, the system would continue to use the MSCP served path to the device, even though the direct path is preferable. This is because no error occurred on the MSCP served path to provoke the path selection procedure.

The automatic failback feature is designed to address this situation. Multipath polling attempts to fail back the device to a direct path when it detects that all of the following conditions apply:

  • A direct path to a device is responsive.
  • The current path to the device is an MSCP served path.
  • The current path was not selected by a manual path switch.
  • An automatic failback has not been attempted on this device within the last MPDEV_AFB_INTVL seconds.

The automatic failback is attempted by triggering mount verification and, as a result, the automatic failover procedure on the device.
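The four conditions above amount to a simple predicate that the poller evaluates for each device. The Python sketch below is illustrative only: the function and argument names are hypothetical, the treatment of a device with no previous attempt (`None`) is an assumption, and so is the choice to allow a new attempt once exactly MPDEV_AFB_INTVL seconds have elapsed. The default of 300 seconds and the meaning of 0 (failback disabled) come from the parameter description in this section.

```python
def should_attempt_failback(direct_path_responsive, current_is_mscp,
                            manually_switched, seconds_since_last_attempt,
                            mpdev_afb_intvl=300):
    """Return True if the poller should try to fail back to a direct path."""
    if mpdev_afb_intvl == 0:
        # A value of 0 disables automatic failback on this system.
        return False
    # None means no failback has been attempted yet (an assumption).
    interval_elapsed = (seconds_since_last_attempt is None
                        or seconds_since_last_attempt >= mpdev_afb_intvl)
    return (direct_path_responsive
            and current_is_mscp
            and not manually_switched
            and interval_elapsed)

# A direct path came back 10 minutes after the last failback attempt.
print(should_attempt_failback(True, True, False, 600))
```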

The main purpose of multipath polling is to test the status of unused paths in order to avoid situations such as the following:

  • A system has paths A and B to a device.
  • The system is using path A.
  • If path B becomes inoperative, it goes unnoticed.
  • Much later, and independently, path A breaks.
  • The system attempts to fail over to path B but finds that it is broken.

The poller would detect the failure of path B within 1 minute of its failure and would issue an OPCOM message. An alert system manager can initiate corrective action immediately.

Note that a device might successfully respond to the SCSI INQUIRY commands that are issued by the path poller but might fail to complete a path switch or mount verification successfully on that path.

A system manager or operator can control automatic failback in three ways:

  1. Specify the minimum interval between automatic failback attempts on any given device by the MPDEV_AFB_INTVL system parameter. The default value is 300 seconds.
  2. Prevent automatic failback by setting the value of MPDEV_AFB_INTVL to 0. If set to 0, no automatic failback is attempted on any device on this system.
  3. Temporarily disable automatic failback on a specific device by manually switching the device to the MSCP served path. You can do this even if the current path is an MSCP served path.

Because of the path selection procedure, the automatic failover procedure, and the automatic failback feature, the current path to a mounted device is usually a direct path when there are both direct and MSCP served paths to that device. The primary exceptions to this are when the path has been manually switched to the MSCP served path or when there are no working direct paths.

6.7.11 Enabling or Disabling Paths as Path Switch Candidates

By default, all paths are candidates for path switching. You can disable or re-enable a path as a switch candidate by using the SET DEVICE command with the /[NO]ENABLE qualifier. The reasons you might want to do this include the following:

  • You know a specific path is broken, or that a failover to that path will cause some members of the cluster to lose access.
  • To prevent automatic switching to a selected path while it is being serviced.

Note that the current path cannot be disabled.

The command syntax for enabling or disabling a path is:


SET DEVICE device-name/[NO]ENABLE/PATH=path-identifier

The following command enables the MSCP served path of device $2$DKA502.


$ SET DEVICE $2$DKA502/ENABLE/PATH=MSCP

The following command disables a local path of device $2$DKA502.


$ SET DEVICE $2$DKA502/NOENABLE/PATH=PKC0.5

Be careful when disabling paths. Avoid creating an invalid configuration, such as the one shown in Figure 6-21.

6.7.12 Performance Considerations

The presence of an MSCP served path in a disk multipath set has no measurable effect on steady-state I/O performance when the MSCP path is not the current path.

Because direct paths are tried first, the presence of an MSCP served path in a multipath set normally does not affect recovery time. In certain unusual failure cases, however, it might increase the time it takes to find a working path during mount verification.

However, the ability to switch dynamically from a direct path to an MSCP served path might significantly increase the I/O serving load on a given MSCP server system with a direct path to the multipath disk storage. Because served I/O takes precedence over almost all other activity on the MSCP server, failover to an MSCP served path can affect the responsiveness of other applications on that MSCP server, depending on the capacity of the server and the increased rate of served I/O requests.

For example, a given OpenVMS Cluster configuration might have sufficient CPU and I/O bandwidth to handle an application work load when all the shared SCSI storage is accessed by direct SCSI paths. Such a configuration might be able to work acceptably as failures force a limited number of devices to switch over to MSCP served paths. However, as more failures occur, the load on the MSCP served paths could approach the capacity of the cluster and cause the performance of the application to degrade to an unacceptable level.

The MSCP_BUFFER and MSCP_CREDITS system parameters allow the system manager to control the resources allocated to MSCP serving. If the MSCP server does not have enough resources to serve all incoming I/O requests, performance will degrade on systems that are accessing devices on the MSCP path on this MSCP server.

You can use the MONITOR MSCP command to determine whether the MSCP server is short of resources. If the Buffer Wait Rate is nonzero, the MSCP server has had to stall some I/O while waiting for resources.

It is not possible to recommend correct values for these parameters. However, note that, starting with OpenVMS Alpha Version 7.2-1, the default value for MSCP_BUFFER has been increased from 128 to 1024.

As noted in the online help for the SYSGEN utility, MSCP_BUFFER specifies the number of pagelets to be allocated to the MSCP server's local buffer area, and MSCP_CREDITS specifies the number of outstanding I/O requests that can be active from one client system. For example, a system with many disks being served to several OpenVMS systems might have MSCP_BUFFER set to a value of 4000 or higher and MSCP_CREDITS set to 128 or higher.

For information about modifying system parameters, see the HP OpenVMS System Manager's Manual.

HP recommends that you test configurations that rely on failover to MSCP served paths at the worst-case load level for MSCP served paths. If you are configuring a multiple-site disaster-tolerant cluster that uses a multiple-site SAN, consider the possible failures that can partition the SAN and force the use of MSCP served paths. In a symmetric dual-site configuration, HP recommends that you provide capacity for 50 percent of the SAN storage to be accessed by an MSCP served path.

You can test the capacity of your configuration by using manual path switching to force the use of MSCP served paths.

6.7.13 Console Considerations

This section describes how to use the console with parallel SCSI multipath disk devices. See Section 7.6 for information on using the console with FC multipath devices.

The console uses traditional, path-dependent, SCSI device names. For example, the device name format for disks is DK, followed by a letter indicating the host adapter, followed by the SCSI target ID, and the LUN.

This means that a multipath device will have multiple names, one for each host adapter it is accessible through. In the following sample output of a console show device command, the console device name is in the left column. The middle column and the right column provide additional information, specific to the device type.

Notice, for example, that the devices dkb100 and dkc100 are really two paths to the same device. The name dkb100 is for the path through adapter PKB0, and the name dkc100 is for the path through adapter PKC0. This can be determined by referring to the middle column, where the informational name includes the HSZ allocation class. The HSZ allocation class allows you to determine which console "devices" are really paths to the same HSZ device.

Note

The console may not recognize a change in the HSZ allocation class value until after you issue a console INIT command.


>>>sho dev
dkb0.0.0.12.0              $55$DKB0                       HSZ70CCL  XB26
dkb100.1.0.12.0            $55$DKB100                        HSZ70  XB26
dkb104.1.0.12.0            $55$DKB104                        HSZ70  XB26
dkb1300.13.0.12.0          $55$DKB1300                       HSZ70  XB26
dkb1307.13.0.12.0          $55$DKB1307                       HSZ70  XB26
dkb1400.14.0.12.0          $55$DKB1400                       HSZ70  XB26
dkb1500.15.0.12.0          $55$DKB1500                       HSZ70  XB26
dkb200.2.0.12.0            $55$DKB200                        HSZ70  XB26
dkb205.2.0.12.0            $55$DKB205                        HSZ70  XB26
dkb300.3.0.12.0            $55$DKB300                        HSZ70  XB26
dkb400.4.0.12.0            $55$DKB400                        HSZ70  XB26
dkc0.0.0.13.0              $55$DKC0                       HSZ70CCL  XB26
dkc100.1.0.13.0            $55$DKC100                        HSZ70  XB26
dkc104.1.0.13.0            $55$DKC104                        HSZ70  XB26
dkc1300.13.0.13.0          $55$DKC1300                       HSZ70  XB26
dkc1307.13.0.13.0          $55$DKC1307                       HSZ70  XB26
dkc1400.14.0.13.0          $55$DKC1400                       HSZ70  XB26
dkc1500.15.0.13.0          $55$DKC1500                       HSZ70  XB26
dkc200.2.0.13.0            $55$DKC200                        HSZ70  XB26
dkc205.2.0.13.0            $55$DKC205                        HSZ70  XB26
dkc300.3.0.13.0            $55$DKC300                        HSZ70  XB26
dkc400.4.0.13.0            $55$DKC400                        HSZ70  XB26
dva0.0.0.1000.0            DVA0
ewa0.0.0.11.0              EWA0              08-00-2B-E4-CF-0B
pka0.7.0.6.0               PKA0                  SCSI Bus ID 7
pkb0.7.0.12.0              PKB0                  SCSI Bus ID 7  5.54
pkc0.7.0.13.0              PKC0                  SCSI Bus ID 7  5.54
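The matching of console names to underlying devices can also be done mechanically. The following Python sketch groups console device names by the informational name in the middle column, ignoring the adapter letter. It assumes the column layout of the listing above and an informational name of the form `$class$DKxnnn`; the function name `group_console_paths` is hypothetical and is not a console or OpenVMS facility.

```python
import re
from collections import defaultdict

# $<allocation class>$<2-letter device type><adapter letter><unit number>
# (format inferred from the listing above; an assumption)
INFO_NAME = re.compile(r"^\$(\d+)\$([A-Z]{2})([A-Z])(\d+)$")

def group_console_paths(show_dev_output):
    """Map (allocation class, device type, unit) to its console path names."""
    groups = defaultdict(list)
    for line in show_dev_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        match = INFO_NAME.match(fields[1])
        if not match:
            continue  # adapters, network devices, and other entries
        alloc, dev_type, _adapter, unit = match.groups()
        groups[(int(alloc), dev_type, int(unit))].append(fields[0])
    return dict(groups)

listing = """\
dkb100.1.0.12.0            $55$DKB100                        HSZ70  XB26
dkc100.1.0.13.0            $55$DKC100                        HSZ70  XB26
pkb0.7.0.12.0              PKB0                  SCSI Bus ID 7  5.54
"""
for device, console_names in group_console_paths(listing).items():
    print(device, console_names)
```

Any key with more than one console name identifies a multipath device, and its names are the candidates for a console device list.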

The console does not automatically attempt to use an alternate path to a device if I/O fails on the current path. For many console commands, however, it is possible to specify a list of devices that the console will attempt to access in order. In a multipath configuration, you can specify a list of console device names that correspond to the multiple paths of a device. For example, a boot command, such as the following, will cause the console to attempt to boot the multipath device through the DKB100 path first, and if that fails, it will attempt to boot through the DKC100 path:


BOOT DKB100, DKC100
