Guidelines for OpenVMS Cluster Configurations

8.8.2 Advantages

This configuration offers the following advantages:

  • All nodes have direct access to all storage.
  • SCSI storage provides low-cost, commodity hardware with good performance.
  • The MEMORY CHANNEL interconnect provides high-performance, node-to-node communication at a low price. The SCSI interconnect complements MEMORY CHANNEL by providing low-cost, commodity storage communication.

8.8.3 Disadvantages

This configuration has the following disadvantage:

  • The fast-wide differential SCSI bus is a single point of failure. One solution is to add a second, fast-wide differential SCSI bus so that if one fails, the nodes can fail over to the other. To use this functionality, the systems must be running OpenVMS Version 7.2 or higher and have multipath support enabled.
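
As an illustrative sketch of the failover option above (the MPDEV_* system parameters and AUTOGEN are standard OpenVMS facilities, but the values and the device name $1$DKA100 are assumptions for this example), multipath support might be enabled as follows:

   ! SYS$SYSTEM:MODPARAMS.DAT additions on each node (illustrative values)
   MPDEV_ENABLE = 1          ! allow multipath sets to form for devices reachable on both buses
   MPDEV_POLLER = 1          ! poll inactive paths so failover targets stay verified

   $ @SYS$UPDATE:AUTOGEN GETDATA SETPARAMS NOFEEDBACK    ! apply the parameter changes
   $ SHOW DEVICE/FULL $1$DKA100:                         ! displays the paths of a multipath device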

8.8.4 Key Availability Strategies

The configuration in Figure 8-5 incorporates the following strategies, which are critical to its success:

  • Redundant MEMORY CHANNEL hubs and HSZ controllers prevent a single point of hub or controller failure.
  • Volume shadowing provides multiple copies of essential disks across separate HSZ controllers.
  • All nodes have shared, direct access to all storage.
  • At least three nodes are used for quorum, so the OpenVMS Cluster continues if any one node fails.
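
To make the quorum arithmetic concrete: with three voting nodes, each holding one vote, quorum is two, so any single node can fail without stalling the cluster. The following MODPARAMS.DAT fragment is a minimal sketch for each of the three nodes (illustrative values; run AUTOGEN after editing):

   VOTES = 1              ! each server node contributes one vote
   EXPECTED_VOTES = 3     ! total votes when all members are present
   ! Quorum is (EXPECTED_VOTES + 2) / 2 = 2 (integer division),
   ! so the cluster continues after the loss of any one node.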

8.9 Availability in an OpenVMS Cluster with Satellites

Satellites are systems that do not have direct access to a system disk and other OpenVMS Cluster storage. Satellites are usually workstations, but they can be any OpenVMS Cluster node that is served storage by other nodes in the cluster.

Because satellite nodes are highly dependent on server nodes for availability, the sample configurations presented earlier in this chapter do not include satellite nodes. However, because satellite/server configurations provide important advantages, you may decide to trade off some availability to include satellite nodes in your configuration.

Figure 8-6 shows an optimal configuration for an OpenVMS Cluster system with satellites. Figure 8-6 is followed by an analysis of the configuration that includes:

  • Analysis of its components
  • Advantages and disadvantages
  • Key availability strategies implemented

Figure 8-6 OpenVMS Cluster with Satellites


8.9.1 Components

The satellite/server configuration in Figure 8-6 has the following components:

Part Description
1 Base configuration.

The base configuration performs server functions for satellites.

2 Three to 16 OpenVMS server nodes.

Rationale: At least three nodes are recommended to maintain quorum. More than 16 nodes introduces excessive complexity.

3 Two Ethernet segments between base server nodes and satellites.

Rationale: Provides high availability.

4 Two Ethernet segments attached to each critical satellite through two Ethernet adapters. Each of these critical satellites has its own system disk.

Rationale: Having their own boot disks increases the availability of the critical satellites.

5 For noncritical satellites, place a boot server on the Ethernet segment.

Rationale: Noncritical satellites do not need their own boot disks.

6 Limit the satellites to 15 per segment.

Rationale: More than 15 satellites on a segment may cause I/O congestion.

8.9.2 Advantages

This configuration provides the following advantages:

  • A large number of nodes can be served in one OpenVMS Cluster.
  • You can spread a large number of nodes over a greater distance.

8.9.3 Disadvantages

This configuration has the following disadvantages:

  • Satellites with single LAN adapters have a single point of failure that causes cluster transitions if the adapter fails.
  • Highly available satellites require costly LAN connectivity (two Ethernet adapters and two segments each).

8.9.4 Key Availability Strategies

The configuration in Figure 8-6 incorporates the following strategies, which are critical to its success:

  • This configuration has no single point of failure.
  • All shared storage is MSCP served from the base configuration, which is appropriately configured to serve a large number of nodes.
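
A minimal sketch of the serving side of this strategy, assuming the base server nodes use the standard MSCP/TMSCP parameters shown (values are illustrative, not tuned recommendations):

   ! SYS$SYSTEM:MODPARAMS.DAT on each base server node (illustrative)
   MSCP_LOAD = 1           ! load the MSCP disk server at boot time
   MSCP_SERVE_ALL = 1      ! serve all locally accessible disks to the cluster
   TMSCP_LOAD = 1          ! load the TMSCP tape server
   TMSCP_SERVE_ALL = 1     ! serve all locally accessible tapes
   NISCS_LOAD_PEA0 = 1     ! load PEDRIVER for cluster communication over the LAN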

8.10 Multiple-Site OpenVMS Cluster System

Multiple-site OpenVMS Cluster configurations contain nodes that are located at geographically separated sites. Depending on the technology used, the distances between sites can be as great as 150 miles. FDDI and DS3 are used to connect these separated sites to form one large cluster. Available from most common telephone service carriers, FDDI and DS3 services provide long-distance, point-to-point communications for multiple-site clusters.

Figure 8-7 shows a typical configuration for a multiple-site OpenVMS Cluster system. Figure 8-7 is followed by an analysis of the configuration that includes:

  • Analysis of components
  • Advantages

Figure 8-7 Multiple-Site OpenVMS Cluster Configuration Connected by WAN Link


8.10.1 Components

Although Figure 8-7 does not show all possible configuration combinations, a multiple-site OpenVMS Cluster can include:

  • Two data centers with an intersite link (FDDI, DS3) connected to a DECconcentrator or GIGAswitch crossbar switch.
  • Intersite link performance that is compatible with the applications that are shared by the two sites.
  • Up to 96 Integrity server and Alpha nodes (combined total). In general, the rules that apply to OpenVMS LAN and extended LAN (ELAN) clusters also apply to multiple-site clusters.
    Reference: For LAN configuration guidelines, see Section 4.10.2. For ELAN configuration guidelines, see Section 9.3.8.

8.10.2 Advantages

The benefits of a multiple-site OpenVMS Cluster system include the following:

  • A few systems can be remotely located at a secondary site and can benefit from centralized system management and other resources at the primary site. For example, a main office data center could be linked to a warehouse or a small manufacturing site that could have a few local nodes with directly attached, site-specific devices. Alternatively, some engineering workstations could be installed in an office park across the city from the primary business site.
  • Multiple sites can readily share devices such as high-capacity computers, tape libraries, disk archives, or phototypesetters.
  • Backups can be made to archival media at any site in the cluster. A common example would be to use disk or tape at a single site to back up the data for all sites in the multiple-site OpenVMS Cluster. Backups of data from remote sites can be made transparently (that is, without any intervention required at the remote site).
  • In general, a multiple-site OpenVMS Cluster provides all of the availability advantages of a LAN OpenVMS Cluster. Additionally, by connecting multiple, geographically separate sites, multiple-site OpenVMS Cluster configurations can increase the availability of a system or elements of a system in a variety of ways:
    • Logical volume/data availability---Volume shadowing or redundant arrays of independent disks (RAID) can be used to create logical volumes with members at both sites. If one of the sites becomes unavailable, data can remain available at the other site.
    • Site failover---By adjusting the VOTES system parameter, you can select a preferred site to continue automatically if the other site fails or if communications with the other site are lost.
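
As an illustrative sketch of the site-failover item above (the node counts and vote values are hypothetical), votes can be weighted so that the preferred site retains quorum by itself. For example, with three nodes at each site:

   ! MODPARAMS.DAT on each node at the preferred site:
   VOTES = 2
   ! MODPARAMS.DAT on each node at the secondary site:
   VOTES = 1
   ! On every node:
   EXPECTED_VOTES = 9      ! (3 x 2) + (3 x 1); quorum = (9 + 2) / 2 = 5

With this weighting, the preferred site alone holds 6 votes and continues automatically if the intersite link or the other site is lost, while the secondary site alone (3 votes) hangs rather than continuing as a partitioned cluster.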

Reference: For additional information about multiple-site clusters, see HP OpenVMS Cluster Systems.

8.11 Disaster-Tolerant OpenVMS Cluster Configurations

Disaster-tolerant OpenVMS Cluster configurations make use of Volume Shadowing for OpenVMS, high-speed networks, and specialized management software.

Disaster-tolerant OpenVMS Cluster configurations enable systems at two different geographic sites to be combined into a single, manageable OpenVMS Cluster system. As in the multiple-site cluster discussed in the previous section, the physically separate data centers are connected by FDDI or by a combination of FDDI and T3 or E3.

The OpenVMS disaster-tolerant product was formerly named the Business Recovery Server (BRS). BRS has been subsumed by a services offering named Disaster Tolerant Cluster Services, which is a system management and software service package. For more information about Disaster Tolerant Cluster Services, contact your HP Services representative.


Chapter 9
Configuring OpenVMS Clusters for Scalability

This chapter explains how to maximize scalability in many different kinds of OpenVMS Clusters.

9.1 What Is Scalability?

Scalability is the ability to expand an OpenVMS Cluster system in any dimension (system, storage, or interconnect) while continuing to make full use of the initial configuration equipment. Your OpenVMS Cluster system can grow in many dimensions, as shown in Figure 9-1. Each dimension also enables your applications to expand.

Figure 9-1 OpenVMS Cluster Growth Dimensions


9.1.1 Scalable Dimensions

Table 9-1 describes the growth dimensions for systems, storage, and interconnects in OpenVMS Clusters.

Table 9-1 Scalable Dimensions in OpenVMS Clusters
This Dimension Grows by...
Systems
CPU Implementing SMP within a system.
Adding systems to a cluster.
Accommodating various processor sizes in a cluster.
Adding a bigger system to a cluster.
Migrating from Alpha systems to Integrity servers.
Memory Adding memory to a system.
I/O Adding interconnects and adapters to a system.
Adding MEMORY CHANNEL to a cluster to offload the I/O interconnect.
OpenVMS Tuning system parameters.
Moving to OpenVMS Integrity servers.
Adapter Adding storage adapters to a system.
Adding LAN adapters to a system.
Storage
Media Adding disks to a cluster.
Adding tapes and CD-ROMs to a cluster.
Volume shadowing Increasing availability by shadowing disks.
Shadowing disks across controllers.
Shadowing disks across systems.
I/O Adding solid-state or DECram disks to a cluster.
Adding disks and controllers with caches to a cluster.
Adding RAID disks to a cluster.
Controller and array Moving disks and tapes from systems to controllers.
Combining disks and tapes in arrays.
Adding more controllers and arrays to a cluster.
Interconnect
LAN Adding Ethernet segments.
Upgrading from Ethernet to Fast Ethernet, Gigabit Ethernet, or 10 Gigabit Ethernet.
Adding redundant segments and bridging segments.
Fibre Channel, SCSI, SAS, and MEMORY CHANNEL Adding Fibre Channel, SCSI, SAS, and MEMORY CHANNEL interconnects to a cluster or adding redundant interconnects to a cluster.
I/O Adding faster interconnects for capacity.
Adding redundant interconnects for capacity and availability.
Distance Expanding a cluster inside a room or a building.
Expanding a cluster across a town or several buildings.
Expanding a cluster between two sites (spanning 40 km).

The ability to add to the components listed in Table 9-1 in any way that you choose is an important feature that OpenVMS Clusters provide. You can add hardware and software in a wide variety of combinations by carefully following the suggestions and guidelines offered in this chapter and in the products' documentation. When you choose to expand your OpenVMS Cluster in a specific dimension, be aware of the advantages and tradeoffs with regard to the other dimensions. Table 9-2 describes strategies that promote OpenVMS Cluster scalability. Understanding these scalability strategies can help you maintain a higher level of performance and availability as your OpenVMS Cluster grows.

9.2 Strategies for Configuring a Highly Scalable OpenVMS Cluster

The hardware that you choose and the way that you configure it has a significant impact on the scalability of your OpenVMS Cluster. This section presents strategies for designing an OpenVMS Cluster configuration that promotes scalability.

9.2.1 Scalability Strategies

Table 9-2 lists strategies in order of importance that ensure scalability. This chapter contains many figures that show how these strategies are implemented.

Table 9-2 Scalability Strategies
Strategy Description
Capacity planning Running a system above 80% capacity (near performance saturation) limits the amount of future growth possible.

Understand whether your business and applications will grow. Try to anticipate future requirements for processor, memory, and I/O.

Shared, direct access to all storage The ability to scale compute and I/O performance is heavily dependent on whether all of the systems have shared, direct access to all storage.

The FC and LAN OpenVMS Cluster illustrations that follow show many examples of shared, direct access to storage, with no MSCP overhead.

Reference: For more information about MSCP overhead, see Section 9.5.1.

Limit node count to between 3 and 16 Smaller OpenVMS Clusters are simpler to manage and tune for performance and require less OpenVMS Cluster communication overhead than do large OpenVMS Clusters. You can limit node count by upgrading to a more powerful processor and by taking advantage of OpenVMS SMP capability.

If your server is becoming a compute bottleneck because it is overloaded, consider whether your application can be split across nodes. If so, add a node; if not, add a processor (SMP).

Remove system bottlenecks To maximize the capacity of any OpenVMS Cluster function, consider the hardware and software components required to complete the function. Any component that is a bottleneck may prevent other components from achieving their full potential. Identifying bottlenecks and reducing their effects increases the capacity of an OpenVMS Cluster.
Enable the MSCP server The MSCP server enables you to add satellites to your OpenVMS Cluster so that all nodes can share access to all storage. In addition, the MSCP server provides failover for access to shared storage when an interconnect fails.
Reduce interdependencies and simplify configurations An OpenVMS Cluster system with one system disk is completely dependent on that disk for the OpenVMS Cluster to continue. If the disk, the node serving the disk, or the interconnects between nodes fail, the entire OpenVMS Cluster system may fail.
Ensure sufficient serving resources If a small disk server has to serve a large number of disks to many satellites, the capacity of the entire OpenVMS Cluster is limited. Do not overload a server; it becomes a bottleneck and cannot handle failover recovery effectively.
Configure resources and consumers close to each other Place servers (resources) and satellites (consumers) close to each other. If you need to increase the number of nodes in your OpenVMS Cluster, consider dividing it. See Section 10.2.4 for more information.
Set adequate system parameters If your OpenVMS Cluster is growing rapidly, important system parameters may be out of date. Run AUTOGEN, which automatically calculates significant system parameters and resizes page, swap, and dump files.
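
Two of the strategies in Table 9-2, removing bottlenecks and keeping system parameters current, map directly to standard commands. The MONITOR classes and AUTOGEN phases shown below are standard, but which classes matter depends on where the suspected bottleneck is; treat this as a sketch rather than a checklist:

   $ MONITOR SYSTEM, DISK, DLOCK, SCS    ! CPU, disk I/O, distributed locking, and SCS traffic
   $ MONITOR MSCP_SERVER                 ! MSCP-serving load on server nodes
   $ ! Recalculate and apply significant system parameters using AUTOGEN feedback:
   $ @SYS$UPDATE:AUTOGEN SAVPARAMS SETPARAMS FEEDBACK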

9.2.2 Three-Node Fast-Wide SCSI Cluster

In Figure 9-2, three nodes are connected by two 25-m, fast-wide differential (FWD) SCSI interconnects. Multiple storage shelves are contained in each HSZ controller, and more storage is contained in the BA356 at the top of the figure.

Figure 9-2 Three-Node Fast-Wide SCSI Cluster


The advantages and disadvantages of the configuration shown in Figure 9-2 include:

Advantages

  • Combines the advantages of the two-node fast-wide SCSI cluster and the two-node fast-wide SCSI cluster with HSZ storage configurations:
    • Significant (25 m) bus distance and scalability.
    • Includes cache in the HSZ, which also provides RAID 0, 1, and 5 technologies. The HSZ contains multiple storage shelves.
    • FWD bus provides 20 MB/s throughput.
    • With the BA356 cabinet, you can use a narrow (8-bit) or wide (16-bit) SCSI bus.

Disadvantage

  • This configuration is more expensive than those shown in previous figures.

9.2.3 Four-Node Ultra SCSI Hub Configuration

Figure 9-3 shows four nodes connected by a SCSI hub. The SCSI hub obtains power and cooling from the storage cabinet, such as the BA356. The SCSI hub does not connect to the SCSI bus of the storage cabinet.

Figure 9-3 Four-Node Ultra SCSI Hub Configuration


The advantages and disadvantages of the configuration shown in Figure 9-3 include:

Advantages

  • Provides significantly more bus distance and scalability.
  • The SCSI hub provides fair arbitration on the SCSI bus, which results in more uniform, predictable system behavior. Four CPUs are allowed only when fair arbitration is enabled.
  • Up to two dual HSZ controllers can be daisy-chained to the storage port of the hub.
  • Two power supplies in the BA356 (one for backup).
  • Cache in the HSZs, which also provides RAID 0, 1, and 5 technologies.
  • Ultra SCSI bus provides 40 MB/s throughput.

Disadvantages

  • You cannot add CPUs to this configuration by daisy-chaining a SCSI interconnect from a CPU or HSZ to another CPU.
  • This configuration is more expensive than the two-node fast-wide SCSI cluster and the two-node fast-wide SCSI cluster with HSZ storage.
  • Only HSZ storage can be connected. You cannot attach a storage shelf with disk drives directly to the SCSI hub.

9.3 Scalability in OpenVMS Clusters with Satellites

The number of satellites in an OpenVMS Cluster and the amount of storage that is MSCP served determine the need for the quantity and capacity of the servers. Satellites are systems that do not have direct access to a system disk and other OpenVMS Cluster storage. Satellites are usually workstations, but they can be any OpenVMS Cluster node that is served storage by other nodes in the OpenVMS Cluster.

Each Ethernet LAN segment should have only 10 to 20 satellite nodes attached. Figure 9-4, Figure 9-5, Figure 9-6, and Figure 9-7 show a progression from a 6-satellite LAN to a 45-satellite LAN.

9.3.1 Six-Satellite OpenVMS Cluster

In Figure 9-4, six satellites and a boot server are connected by Ethernet.

Figure 9-4 Six-Satellite LAN OpenVMS Cluster


The advantages and disadvantages of the configuration shown in Figure 9-4 include:

Advantages

  • The MSCP server is enabled, which allows satellites to be added and gives all nodes access to more storage.
  • With one system disk, system management is relatively simple.
    Reference: For information about managing system disks, see Section 10.2.

Disadvantage

  • The Ethernet is a potential bottleneck and a single point of failure.

If the boot server in Figure 9-4 became a bottleneck, a configuration like the one shown in Figure 9-5 would be required.

