Chapter 10 | Maintaining an OpenVMS Cluster System
10.1 | Backing Up Data and Files
10.2 | Updating the OpenVMS Operating System
10.2.1 | Rolling Upgrades
10.3 | LAN Network Failure Analysis
10.4 | Recording Configuration Data
10.4.1 | Record Information
10.4.2 | Satellite Network Data
10.5 | Controlling OPCOM Messages
10.5.1 | Overriding OPCOM Defaults
10.5.2 | Example
10.6 | Shutting Down a Cluster
10.6.1 | The NONE Option
10.6.2 | The REMOVE_NODE Option
10.6.3 | The CLUSTER_SHUTDOWN Option
10.6.4 | The REBOOT_CHECK Option
10.6.5 | The SAVE_FEEDBACK Option
10.6.6 | Shutting Down TCP/IP
10.7 | Dump Files
10.7.1 | Controlling Size and Creation
10.7.2 | Sharing Dump Files
10.8 | Maintaining the Integrity of OpenVMS Cluster Membership
10.8.1 | Cluster Group Data
10.8.2 | Example
10.9 | Adjusting Packet Size for LAN or IP Configurations
10.9.1 | System Parameter Settings for LANs and IPs
10.9.2 | How to Use NISCS_MAX_PKTSZ
10.9.3 | How to Use NISCS_UDP_PKTSZ
10.9.4 | Editing Parameter Files
10.10 | Determining Process Quotas
10.10.1 | Quota Values
10.10.2 | PQL Parameters
10.10.3 | Examples
10.11 | Restoring Cluster Quorum
10.11.1 | Restoring Votes
10.11.2 | Reducing Cluster Quorum Value
10.12 | Cluster Performance
10.12.1 | Using the SHOW Commands
10.12.2 | Using the Monitor Utility
10.12.3 | Using HP Availability Manager
10.12.4 | Monitoring LAN Activity
10.12.5 | LAN or PEDRIVER Fast Path Settings
Appendix A | Cluster System Parameters
A.1 | Values
Appendix B | Building Common Files
B.1 | Building a Common SYSUAF.DAT File
B.2 | Merging RIGHTSLIST.DAT Files
Appendix C | Cluster Troubleshooting
C.1 | Diagnosing Computer Failures
C.1.1 | Preliminary Checklist
C.1.2 | Sequence of Booting Events
C.2 | Satellite Fails to Boot
C.2.1 | Displaying Connection Messages
C.2.2 | General OpenVMS Cluster Satellite-Boot Troubleshooting
C.2.3 | MOP Server Troubleshooting
C.2.4 | Disk Server Troubleshooting
C.2.5 | Satellite Booting Troubleshooting
C.2.6 | Alpha Booting Messages (Alpha Only)
C.3 | Computer Fails to Join the Cluster
C.3.1 | Verifying OpenVMS Cluster Software Load
C.3.2 | Verifying Boot Disk and Root
C.3.3 | Verifying SCSNODE and SCSSYSTEMID Parameters
C.3.4 | Verifying Cluster Security Information
C.4 | Startup Procedures Fail to Complete
C.5 | Diagnosing LAN Component Failures
C.6 | Diagnosing Cluster Hangs
C.6.1 | Cluster Quorum is Lost
C.6.2 | Inaccessible Cluster Resource
C.7 | Diagnosing CLUEXIT Bugchecks
C.7.1 | Conditions Causing Bugchecks
C.8 | Port Communications
C.8.1 | LAN Communications
C.8.2 | System Communications Services (SCS) Connections
C.9 | Diagnosing Port Failures
C.9.1 | Hierarchy of Communication Paths
C.9.2 | Where Failures Occur
C.9.3 | Verifying Virtual Circuits
C.9.4 | Verifying LAN Connections
C.10 | Analyzing Error-Log Entries for Port Devices
C.10.1 | Examine the Error Log
C.10.2 | Formats
C.10.3 | LAN Device-Attention Entries
C.10.4 | Logged Message Entries
C.10.5 | Error-Log Entry Descriptions
C.11 | OPA0 Error-Message Logging and Broadcasting
C.11.1 | OPA0 Error Messages
C.12 | Integrity server Satellite Booting Messages
Appendix D | Sample Programs for LAN Control
D.1 | Purpose of Programs
D.2 | Starting the NISCA Protocol
D.2.1 | Start the Protocol
D.3 | Stopping the NISCA Protocol
D.3.1 | Stop the Protocol
D.3.2 | Verify Successful Execution
D.4 | Analyzing Network Failures
D.4.1 | Failure Analysis
D.4.2 | How the LAVC$FAILURE_ANALYSIS Program Works
D.5 | Using the Network Failure Analysis Program
D.5.1 | Create a Network Diagram
D.5.2 | Edit the Source File
D.5.3 | Assemble and Link the Program
D.5.4 | Modify Startup Files
D.5.5 | Execute the Program
D.5.6 | Modify MODPARAMS.DAT
D.5.7 | Test the Program
D.5.8 | Display Suspect Components