HP Availability Manager - Version 3.0-2
Release Notes

February 2009

______________________ IMPORTANT ______________________

In these Release Notes, the term Combined kit refers to the
Data Analyzer/Data Server kit. These Release Notes document
additions and changes for Version 3.0-2 of the Combined kit
and Version 3.0-2A of the Data Collector kit.

Be sure to read the following sections:

o  Section 1.1 contains information about WAN support and
   security measures for communication between the Data
   Analyzer and Data Server.

o  Section 1.3 and Section 1.4. Section 1.3 describes the
   requirement that the installation disk be an ODS-5
   formatted disk on OpenVMS systems using the Combined kit.
   Section 1.4 describes how to install the Combined kit on
   any ODS-5 formatted disk.

o  Section 1.2 explains that the AVAIL_MAN_ANA kit has been
   renamed to AVAIL_MAN_ANA_SRVR.

______________________________________________________

The following notes address late-breaking information and
known problems for the HP Availability Manager Version 3.0-2.
These notes are in the following categories:

o  New and changed features
o  Installation notes
o  Problems corrected
o  Operation notes
o  Display notes

1 New and Changed Features

The following sections discuss new and changed features
introduced in Version 3.0-2 of the Availability Manager.

1.1 Data Collection Over a Wide-Area Network (WAN)

The Availability Manager can now collect data over a WAN. This
is possible because a new software component has been added to
the Availability Manager software: the Data Server. The Data
Server resides on the same local-area network (LAN) as the
OpenVMS systems that you want to monitor. The Data Analyzer
connects to the Data Server over the WAN. Using the IP protocol
suite, data packets are transmitted between the Data Analyzer
and the Data Server. This method of transmitting data packets
allows the Data Analyzer to reside anywhere on the WAN.
Data Server setup and usage details are in the HP OpenVMS
Availability Manager User's Guide, Sections 1.1 and 1.2, as
well as Sections 2.2 through 2.6.

The Availability Manager uses Transport Layer Security (TLS)
for secure WAN communication between the Data Analyzer and
Data Server. See Chapter 2 of the HP OpenVMS Availability
Manager User's Guide for the steps that you must perform to
set up this security mechanism.

1.2 Data Analyzer Kit Renamed to AVAIL_MAN_ANA_SRVR

The AVAIL_MAN_ANA kit has been renamed to AVAIL_MAN_ANA_SRVR
because the new kit contains both the Data Analyzer and Data
Server.

________________________ Note ________________________

You need to uninstall the AVAIL_MAN_ANA kit before installing
the new AVAIL_MAN_ANA_SRVR kit.

______________________________________________________

1.3 Use of Java Version 5.0 for the Combined Kit

The Availability Manager Combined kit uses the 5.0 version of
the Java Virtual Machine. (Java Version 5.0 is also known as
1.5.) The following changes are the result of this Java
upgrade:

o  For OpenVMS systems using the Data Analyzer or Data Server,
   the installation disk must be an ODS-5 disk.

o  Performance on OpenVMS has improved.

1.4 Ability to Install the Combined Kit on a Non-system Disk

In Version 3.0-2, you can install the Combined kit,
AVAIL_MAN_ANA_SRVR, on a non-system ODS-5 formatted disk. This
kit removes the requirement that the kit be installed on the
system disk.

To install on a non-system ODS-5 formatted disk, use the
/DESTINATION qualifier of the $ PRODUCT INSTALL command to
specify the installation disk and directory; for example:

   $ PRODUCT INSTALL AVAIL_MAN_ANA_SRVR /DESTINATION=DISK$SYS_V83:[SOFTWARE]

The manual HP Availability Manager Installation Instructions
details how to do the installation.

1.5 New Data Columns in System Overview and Related Event
    Additions and Changes

Version 3.0-2 introduces three new data columns in the System
Overview Window.

Table 1  New Data Columns
___________________________________________________________
Data
Column    Description
___________________________________________________________
PFLTS     Total page fault rate and hard fault rate
PFW/COM   Number of processes in PFW and COM states
DC        Data Collector capability version and Managed
          Object registration status
___________________________________________________________

Version 3.0-2 also changed the CPU Qs column. The states summed
in this column are now COLPG, COMO, FPG, and MWAIT. The column
also displays the individual state counts if the total is
greater than zero. The COM and PFW state counts that used to be
included in the CPU Qs count have been moved to the new PFW/COM
column.

New events in Version 3.0-2 follow. The last three result from
the new PFW/COM column and changes to the CPU Qs column.

Table 2  New Events
___________________________________________________________
Event     Description
___________________________________________________________
HIALNR    High alignment fault rate
HICMOQ    Many processes waiting in COMO
HIPFWQ    Many processes waiting in PFW state
PRCCMO    Process waiting in COMO
___________________________________________________________

These events have changed as a result of the column changes.

Table 3  Changed Events
___________________________________________________________
Event     Description
___________________________________________________________
HICOMQ    Many processes waiting in COM
HIPWTQ    Many processes waiting in COLPG or FPG
PRCCOM    Process waiting in COM
___________________________________________________________

1.6 Data Collection for Logical Disk (LDcn:) Devices

The Data Analyzer now collects data on Logical Disks. These
devices show up in the Disk Status and Disk Volume displays.

1.7 System Memory and Alignment Faults in Mem Column Tooltip

The Mem column in the System Overview Window now shows the
total amount of memory on the system, and the number of page
faults by mode.

1.8 Event Log Enhancements

With the advent of WAN support, there is an Event Log for each
connection that the Data Analyzer is using, instead of one
Event Log for the Data Analyzer session.

The Event Log now contains an entry when a threshold event has
been cancelled. This new entry allows you to determine how long
a threshold has been exceeded.

Along with the new entry for cancelled threshold events, three
new columns have been added to the Event Log. These columns are
as follows:

Table 4  New Event Columns
___________________________________________________________
Event
Column    Description
___________________________________________________________
Status    The value describes the status of the event.
          Values are as follows:
          _______________________________________________
          Status
          Value     Description
          _______________________________________________
          INFO      This event is informational.
          BEGIN     The event entry marks the beginning of
                    the interval when the values for an
                    event have exceeded the threshold.
          END       The event entry marks the end of the
                    interval when the values for an event
                    have exceeded the threshold.
          CANCELD   The event entry marks when the event
                    was removed because the data used to
                    evaluate the event is no longer being
                    collected.
          EXPIRED   The event entry marks when the event
                    has expired.
          _______________________________________________
EventKey  A hex value identifying an event for a node. For
          instance, all HINTER events for a node have the
          same value. Each time the HINTER event is signaled
          for a node, the value will be the same, making it
          easy to search for all the HINTER events for a
          node.
EventID   A hex value identifying an individual event. For
          instance, if the HICOMQ event on node SAM is
          signaled, the BEGIN and END/CANCELD/EXPIRED entries
          that mark when the event was signaled and cancelled
          will have the same value. The next time the HICOMQ
          event is signaled on node SAM, the hex value will
          be different.
          This value makes it easy to find the entry that
          signals when the event has been cancelled.
___________________________________________________________

1.9 New STATUS Command Option in SYS$STARTUP:AMDS$STARTUP
    Command Procedure

The STATUS command option shows the current state of the Data
Collector; for example, RMDRIVER STOPPED and RMDRIVER STARTED.
To display help on the output of this new option, enter the
following command:

   $ @SYS$STARTUP:AMDS$STARTUP HELP STATUS

An example of a STATUS command and its output messages follows:

   $ @SYS$STARTUP:AMDS$STARTUP STATUS
   Current status of AM/DECamds Data Collector device RMA0:
   RMA0: is started and is ready to accept requests for data and can accept a connection from a Data Analyzer
   RMA0: is set to log data requests generating security violations to OPCOM (AMDS$RM_OPCOM_READ is set to TRUE)
   RMA0: is set to log fixes generating security violations to OPCOM (AMDS$RM_OPCOM_WRITE is set to TRUE)
   RMA0: is using VCI for network communications

1.10 New LOG Command Option in SYS$STARTUP:AMDS$STARTUP
     Command Procedure for START and RESTART

The START and RESTART command options support the LOG option.
This option displays the configuration data that is loaded into
the Data Collector, which is useful for confirming the actual
data gathered and loaded. An example of using the LOG option
follows:

   $ @SYS$STARTUP:AMDS$STARTUP START LOG

1.11 Windows Availability Manager Software Can Use an FDDI
     Network Adapter

You can now use an FDDI network adapter to communicate with the
network on a Windows version of the Availability Manager Data
Analyzer. (Note that the OpenVMS Availability Manager software
can already use an FDDI network adapter.)

1.12 Lock Contention Filter Page Now Allows Entry in Hex

The Lock Contention Filter page has been enhanced to display
any entry in hex and to allow entry of hex values. There are 16
hex fields for the first 16 bytes of a resource name. A blank
field indicates that there is no value for that byte.
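
As an illustration of the field layout described above, the following Python sketch maps the first 16 bytes of a resource name onto 16 hex fields and leaves a field blank where the name has no byte. The function name and the two-digit uppercase formatting are assumptions made for this example; they are not taken from the Data Analyzer.

```python
from typing import List

FIELD_COUNT = 16  # the filter page covers the first 16 bytes of a resource name


def resource_name_hex_fields(name: bytes) -> List[str]:
    """Return 16 strings: two hex digits per byte present, "" where absent."""
    # Hypothetical helper: uppercase, zero-padded hex is an assumed format.
    fields = [f"{byte:02X}" for byte in name[:FIELD_COUNT]]
    # Pad with blank fields for the bytes that the name does not have.
    return fields + [""] * (FIELD_COUNT - len(fields))


print(resource_name_hex_fields(b"RMS$"))
```

A name longer than 16 bytes is truncated to the first 16 in this sketch, which mirrors the current limit of the filter page.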

The ability to edit all 31 characters of a resource name will
be added in a future release.

1.13 Fix to Force a Disk Volume out of Mount Verify State

A new fix has been added to take a disk volume that is in a
mount verify state and force it into a mount verify timeout
state. This fix is available by right-clicking on any disk
entry in the Disk Status Summary or Disk Volume Summary of a
node pane.

1.14 Fix to Force a Shadow Set Member out of a Shadow Set

A new fix has been added to take a shadow set member in a mount
verify state out of a shadow set. This has the same effect as
the following DCL command:

   $ SET SHADOW/FORCE_REMOVAL ddcu:

This fix is available by right-clicking on any shadow set
member entry in the Disk Status Summary or Disk Volume Summary
of a node pane.

Note: To show shadow set members in these two summaries, you
need to bring up the Disk Status filter and set the Transaction
count to zero.

1.15 Data Analyzer Supports Change of MAC Address on Data
     Collector System

The Data Analyzer now supports a change of MAC address on an
OpenVMS system. The MAC address of a system may change if
certain network protocol stacks are started or modified, and
the change is usually seen if the
$ @SYS$STARTUP:AMDS$STARTUP START command is executed early in
the system boot sequence. When this occurred in previous
releases of the Availability Manager, the usual behavior was to
display the system twice in the System Overview window, with
one entry color-coded black and the other yellow, and to
collect no data for the system.

The following events are affected by this change.

Table 5  Events Affected by MAC Address Feature
___________________________________________________________
Event     Description
___________________________________________________________
CFGDON    Current MAC address from system is added to the
          event text
PTHLST    Last known MAC address from system is added to the
          event text
CHGMAC    New event showing the current MAC address the
          system is using
NEWMAC    New event showing that a new MAC address is
          recorded for the system
___________________________________________________________

1.16 Changed Disk Status Filter Mount Count to Display Shadow
     Set Members

The default mount count in the Disk Status filter has been
changed from one to zero. This setting allows shadow set
members to be displayed in the Disk Status Summary display.
This was done in conjunction with the shadow set member fix in
Section 1.14.

1.17 Changed I/O Filter Open File Count Default

The default open file count in the I/O filter has been changed
from three to one. The previous default of three open files
came from the DECamds application, but it often filtered out
processes doing I/O simply because only one or two files were
open for the process.

1.18 Starting the Data Analyzer or Data Server on OpenVMS
     Checks for Necessary Logicals

A check for all the necessary logical names has been added to
the startup of the Data Analyzer or Data Server. This ensures
that the Data Analyzer and Data Server can access various files
needed for a correct startup. If an error message appears
saying that not all the logical names are defined, compare your
SYS$MANAGER:AMDS$LOGICALS.COM file with the file
SYS$MANAGER:AMDS$LOGICALS.TEMPLATE.

1.19 Data Analyzer Logs Now Incorporate Date and Time in File
     Name

Data Analyzer and Data Server log files now incorporate the
date and time in the file name. This helps to show when the
file was created. In addition, on Windows systems, where the
file system does not save versions of a file, the older log
files are no longer deleted when newer versions are created.
File names are now in the form *_YYYYMMDD-HHMM.*.
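
To make the *_YYYYMMDD-HHMM.* naming form concrete, the following Python sketch builds a file name in that form and recovers the timestamp from an existing name, which can help when sorting or pruning old logs. The prefix "Example" and the extension "log" are placeholders chosen for this illustration; the actual prefixes and extensions used by the Data Analyzer and Data Server are not specified in these notes.

```python
import re
from datetime import datetime
from typing import Optional

STAMP_FORMAT = "%Y%m%d-%H%M"  # corresponds to YYYYMMDD-HHMM
# Matches "_YYYYMMDD-HHMM." followed by a final extension.
STAMP_PATTERN = re.compile(r"_(\d{8}-\d{4})\.[^.]*$")


def log_file_name(prefix: str, extension: str, when: datetime) -> str:
    """Build a name of the form <prefix>_YYYYMMDD-HHMM.<extension>."""
    return f"{prefix}_{when.strftime(STAMP_FORMAT)}.{extension}"


def log_file_time(name: str) -> Optional[datetime]:
    """Return the timestamp embedded in a log file name, or None."""
    match = STAMP_PATTERN.search(name)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), STAMP_FORMAT)
    except ValueError:
        # Digits were present but do not form a valid date and time.
        return None


# "Example" and "log" are hypothetical placeholders.
print(log_file_name("Example", "log", datetime(2009, 2, 14, 9, 5)))
```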

1.20 Default Hello Multicast Interval Changes

The default and secondary intervals in AMDS$LOGICALS.COM have
been changed from 30 and 90 seconds to 10 and 15 seconds,
respectively. The older defaults were appropriate for the
slower networks of earlier years. The new values, suited to
more modern networks, allow the Data Analyzer to discover
OpenVMS systems more quickly.

This change is made in the file
SYS$MANAGER:AMDS$LOGICALS.TEMPLATE. Carry this change into your
SYS$MANAGER:AMDS$LOGICALS.COM file to change the Data Collector
behavior.

1.21 Option to Increase OpenVMS Data Analyzer Performance When
     Displayed on a Remote System

The logical AMDS$AM_DISABLE_OFFSCREEN_PIXMAP_SUPPORT controls
whether or not the Java Virtual Machine uses offscreen pixmap
support. Disabling this support can increase performance,
especially when the Data Analyzer display is on a remote system
using TCP/IP. If the OpenVMS Data Analyzer is slow in this
situation, try setting this logical to TRUE to see whether
performance improves.

This change is made in the file
SYS$MANAGER:AMDS$LOGICALS.TEMPLATE. Carry this change into your
SYS$MANAGER:AMDS$LOGICALS.COM file to enable use of the
AMDS$AM_DISABLE_OFFSCREEN_PIXMAP_SUPPORT logical.

2 Installation Notes

The notes in this section are related to the preinstallation
and installation of the Availability Manager software.

2.1 Check Date of AMNDIS50.SYS on Windows Systems After
    Installing Version 3.0-2

After installing the Version 3.0-2 Windows kit, check the date
of the AMNDIS50.SYS file. This file is on the Windows system
disk, usually in C:\WINDOWS\system32\drivers. The date should
be November 28, 2006. If the date is earlier than this, you
need to uninstall the Availability Manager software, and then
reinstall it. Make sure to reboot your system when you are
prompted to do so. This usually fixes the problem. The problem
is being investigated.
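
The date check described above can be scripted. The following Python sketch compares the file's date against November 28, 2006. It assumes that "the date" means the file's last-modified timestamp, which these notes do not state explicitly, and it uses the usual path given above; the function name is hypothetical.

```python
import os
from datetime import date

REQUIRED_DATE = date(2006, 11, 28)  # date given in these notes
# Usual location per these notes; adjust if Windows is installed elsewhere.
DRIVER_PATH = r"C:\WINDOWS\system32\drivers\AMNDIS50.SYS"


def driver_is_current(path: str, required: date = REQUIRED_DATE) -> bool:
    """Return True if the file's modification date is not earlier than required."""
    # Assumption: the relevant date is the last-modified timestamp.
    modified = date.fromtimestamp(os.path.getmtime(path))
    return modified >= required


if __name__ == "__main__":
    if not os.path.exists(DRIVER_PATH):
        print("Driver file not found at", DRIVER_PATH)
    elif driver_is_current(DRIVER_PATH):
        print("AMNDIS50.SYS date is acceptable")
    else:
        print("AMNDIS50.SYS is older than expected; uninstall and reinstall")
```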

It is important to have the latest AMNDIS50.SYS file because it
contains fixes that prevent some system crashes when two Data
Analyzers are started at the same time.

2.2 Uninstall Data Analyzer Kits Before Installing the Combined
    Kit for Version 3.0-2

Before installing the Version 3.0-2 AVAIL_MAN_ANA_SRVR kit, you
must uninstall the old AVAIL_MAN_ANA kit. You cannot install
the new kit over the old one because of restrictions in the
$ PRODUCT INSTALL command. (Note that the Data Analyzer kit,
AVAIL_MAN_ANA in previous versions, has been renamed to
AVAIL_MAN_ANA_SRVR in Version 3.0-2; this new name better
reflects the Data Analyzer and Data Server combination.)

2.3 Uninstall Versions of the Availability Manager Prior to 2.4
    Before Installing the Version 3.0-2 Kit

On both Windows and OpenVMS systems, check the following list
to see if any item applies to you. If so, follow the
instructions in the appropriate item before installing Version
3.0-2:

o  On Windows systems, you must first uninstall kits for
   Version 2.3 and lower. With Version 2.4 and higher, you can
   install new kits over prior kits.

o  On OpenVMS systems, perform one of the following steps:

   -  If you have never installed the Availability Manager on
      your system, you can install Version 3.0-2 directly.

   -  If you have installed a version of the Availability
      Manager prior to Version 2.4 and you are running OpenVMS
      Version 6.2 through Version 7.3-1 or its variants, you
      must perform the following steps:

      a. Uninstall the previous version of the Availability
         Manager.
      b. Install the Availability Manager Data Collector
         Version 2.4 kit.
      c. Install the Version 3.0-2 Combined kit.

   -  If you have installed a version of the Availability
      Manager prior to Version 2.4 and you are running OpenVMS
      Version 7.3 or higher, you must perform the following
      steps:

      a. Uninstall the previous version of the Availability
         Manager.
      b. Install the Version 3.0-2 Combined kit.

These requirements are explained in the Version 3.0-2
installation instructions.

2.4 Installing from an ODS-5 Disk

If you install the Version 3.0-2 kit from an ODS-5 disk, the
file name for the kit must be in all-capital letters for the
kit to be installed correctly.

2.5 Copy Your AVAILMAN.INI File

Prior to installation, you might want to make a copy of your
AVAILMAN.INI file to save your customizations such as event
threshold settings and the groups you usually monitor.

On Windows systems, also delete any desktop shortcuts for
previous versions of the Availability Manager because they will
be invalid once the new version is installed.

3 Problems Corrected

The following sections discuss problems corrected in Version
3.0-2 of the Availability Manager.

3.1 Alignment Faults in the Data Collector Have Been Fixed

The remaining alignment faults in the Data Collector
(SYS$RMDRIVER.EXE) have been found and fixed.

3.2 Password Encryption Now Enabled by Default

The password encryption feature introduced in Version 2.5 was
not enabled as expected. This meant that the password for the
Data Collector was passed in plain text. This problem has now
been corrected.

3.3 Problem Starting Data Analyzer When DCL Extended Parsing Is
    On

On OpenVMS systems, when the DCL parsing style was set to
extended ($ SET PROCESS/PARSE_STYLE=EXTENDED), starting the
Data Analyzer resulted in an unrecognized option error:

   Unrecognized option: -CP
   Could not create the Java virtual machine.

This problem has been corrected.

3.4 Sizing of Displays Corrected

The updated Data Analyzer now sizes and displays tables and
sections of a window correctly, according to the size of the
font. The Lock Contention and Cluster displays are now more
readable as well. These changes are especially evident on
OpenVMS systems.

3.5 Data Analyzer Performance Improvements

The reworking of the Data Analyzer to initiate and use
connections to the Data Server also made processing of data
more efficient.
The result has been improved performance in the Data Analyzer.

3.6 Lock Contention Data Collection Performance Improvements

For systems with large resource hash tables, the data
collection for lock contention has been improved and results in
less network traffic.

3.7 Starting the Version 2.6 Data Analyzer Would Fail on Some
    I64 Systems

Starting the Data Analyzer Version 2.6 would result in a Java
VM stack dump on some I64 systems that use certain versions of
the Montecito chip set. The output looks similar to the
following example:

   I64VMS> avail/avail
   %SYSTEM-F-ACCVIO, access violation, reason mask=00, virtual address=0000000000000454, PC=00000000002975D0, PS=0000001B
   %TRACE-F-TRACEBACK, symbolic stack dump follows
   image             module      routine                  line   rel PC            abs PC
   JAVA$HOTSPOT_SHR  IA64_SPECS  ia64_get_issue_port      9950   0000000000000060  00000000002975D0
   JAVA$HOTSPOT_SHR  IPACK       ipack_generate_nop_bits  22580  0000000000000EA0  0000000000298C00
   JAVA$HOTSPOT_SHR  TL_HS       ipack_init_nops          22215  0000000000000C10  00000000002A2220
   ...

This problem was in earlier 1.4 versions of the Java Virtual
Machine. It no longer occurs because Version 3.0-2 of the
Availability Manager uses a 1.5 version of the Java Virtual
Machine.

3.8 System Uptime Value Wraparound Fixed

For systems up over 500 days, the system uptime value would
wrap around and continue counting from zero. This has been
fixed.

3.9 HIHRDP Event Changed to Use Hard Page Fault Rate

The documentation for the HIHRDP event states that the hard
page fault rate is compared to the threshold. However, the read
I/O rate was used instead. The event now uses the hard page
fault rate, as stated in the documentation.

3.10 Customization of Threshold Values Would Fail with "ERR"
     Status

On various occasions, trying to change a threshold value for an
event would result in a red "ERR" status in the field. This has
been fixed.

3.11 Single Process Pane Now Displays Process Type Correctly

In previous versions of the Data Analyzer, the "(DETACHED)"
process type was displayed for all processes, regardless of
type. This has been fixed.

4 Operation Notes/Restrictions

The following sections contain notes pertaining to the
operation of the Availability Manager Version 3.0-2. The
sections are divided into notes for Version 3.0-2 and a final
section that contains notes applying to all versions.

4.1 Notes for Version 3.0-2

The following sections contain notes that apply to the
Availability Manager Version 3.0-2.

4.1.1 Changing the Number of the Port the Data Server Uses

The default port number for the Data Server is 9819. If you
want to use a different port number, do one of the following:

o  For Windows systems, modify the Data Server startup
   shortcut. Right-click the shortcut, and choose Properties.
   In the "Target:" field, find the number 9819. Change this
   number to the port you want.

o  For OpenVMS systems, use the /PORT_NUMBER qualifier to
   specify a different port number; for example:

      $ AVAIL/SERVER/PORT_NUMBER=5498

4.1.2 Changing the Network Adapter the Data Server Uses on
      Windows Systems

On Windows systems, the Data Server picks up the first
available network adapter for data communications. If it
selects an undesirable network adapter (for example, a wireless
adapter on a laptop), you can disable this selection by
unchecking "HP Availability Manager NDIS 5.0 Protocol Driver"
in the Network Properties for the network adapter.

4.1.3 Stopping the Data Server

You can stop the Data Server in one of the following ways:

o  On Windows systems, close the Windows console window where
   the Data Server is running.

o  On OpenVMS systems, press Ctrl/Y to stop a Data Server that
   is running in an interactive process. For a Data Server
   running in a batch process, stop the process.
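
After stopping or restarting the Data Server, it can be useful to confirm whether anything is still listening on its port. The following Python sketch attempts a TCP connection to the port (9819 by default, as described in Section 4.1.1). It assumes that the Data Server listens on TCP, which its use of TLS suggests but these notes do not state; the function name and the host name in the example are hypothetical.

```python
import socket

DEFAULT_PORT = 9819  # default Data Server port per Section 4.1.1


def data_server_reachable(host: str, port: int = DEFAULT_PORT,
                          timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # Only tests that the port accepts connections; no TLS handshake
        # or Availability Manager protocol exchange is attempted.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # "dataserver.example.com" is a placeholder host name.
    print(data_server_reachable("dataserver.example.com"))
```

A True result shows only that some process accepts connections on that port, not that the process is the Data Server.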

4.1.4 Data Server Uses Only First Network Adapter

This version of the Data Server on Windows selects the first
network adapter that it discovers when it starts up. The
network adapter it chooses is displayed in a log message on the
screen. Plans for a future release include the ability to
select the network adapter of your choice.

4.1.5 Starting the Data Server Might Trigger Windows Security
      Alert

On Windows systems, a security alert from the Windows firewall
or third-party firewall might be displayed for the Java 2
Platform Standard Edition binary. The Data Server needs to
accept connections as part of its normal operation. Tell the
firewall to "Unblock" the program.

4.2 Notes for All Versions

The following sections contain notes that apply to all versions
of the Availability Manager.

4.2.1 Running Reflective Memory by GE Fanuc and Availability
      Manager

The Reflective Memory product by GE Fanuc sets up the device
RMA0: as part of its normal operation. Because the Availability
Manager Data Collector also creates the device RMA0:, both
products cannot run on the same node at the same time.

4.2.2 Administrator Account Required to Run the Availability
      Manager

On Windows 2000 and Windows XP platforms, you must run the Data
Analyzer or Data Server from an account in the Administrator
group. This restriction will be removed in a future release of
the Availability Manager.

4.2.3 Problem Displaying Large Numbers of Processes or Disks

Very busy networks can sometimes interfere with the transfer of
data between the Data Analyzer and the Data Collector. This
problem is noticeable when you display large numbers of disks
or processes. The number of disks or processes might change
temporarily because of a lost data message. This problem will
be corrected in a future release.

4.2.4 Local Administrator Account Required for Windows
      Installation

To install the Availability Manager on a Windows system, you
must use the local Administrator account.
Some users have had problems when they use a Windows domain
account that has Administrator privileges instead. For example,
a failure message might appear saying "Failure to install
AMNDIS50" after most of the installation is complete. This
problem will be corrected in a future release.

5 Display Notes

The following sections contain notes pertaining to the display
of Data Analyzer data on all platforms and on OpenVMS systems.

5.1 Problems Using the Data Analyzer on All Platforms

The following sections contain notes pertaining to the display
of the Data Analyzer on Windows and on OpenVMS platforms in
Version 3.0-2.

5.1.1 Events Sometimes Displayed After Background Collection
      Stops

The Data Analyzer sometimes displays events after users
customize their systems to stop collecting a particular kind of
data. This is most likely to occur when the Data Analyzer is
monitoring many nodes. Under these conditions, a data handler
sometimes clears events before all pending packets have been
processed. The events based on the data in these packets are
displayed even though users have requested that this data not
be collected.

5.1.2 Truncated LAN Channel Summary Display

On versions of OpenVMS prior to Version 7.3-1, the LAN Channel
Summary display might be disabled for some OpenVMS nodes if
there are more than seven channels for that virtual circuit.
This problem results from a restriction in the OpenVMS Version
7.3 PEDRIVER. For this condition, the following error message
is displayed:

   Error retrieving ChSumLAN data, error code=0x85
   (Continuation data disallowed for request)

This problem was corrected in the OpenVMS Version 7.3-1
PEDRIVER.

5.2 Problem Using the Data Analyzer on OpenVMS Systems: Long
    Runs Exhaust XLIB Resource ID

On older versions of DECwindows Motif, a resource ID allocation
scheme works poorly with the Motif support in Java for OpenVMS.
As a result, long-running Availability Manager sessions might
stop updating the display at a time that depends on the speed
of the OpenVMS machine. For example, a session running on a
dual-processor 275 MHz system reported the following after 14
hours:

   Xlib: resource ID allocation space exhausted!

On faster machines, this message was reported after only 8
hours. This problem appears to be corrected in DECwindows Motif
Version 1.3-1.