Compaq Availability Manager Version 1.3 Release Notes

The following notes address late-breaking information and known problems for the Availability Manager Version 1.3. These notes appear in the following categories:

o Configuration, Setup, and Installation Notes
o Startup and Shutdown Notes
o Operation Notes
o Display Notes

1 Configuration, Setup, and Installation Notes

The following notes pertain to configuring, setting up, and installing the Availability Manager.

1.1 Recommended Hardware Configurations

There are no minimum hardware requirements for the Data Collector. Compaq recommends using, at a minimum, one of the following hardware configurations on systems running the Data Analyzer:

___________________________________________________________
System        Hardware
___________________________________________________________
Windows NT    300 MHz Intel Pentium processor with 96 MB of memory
Windows NT    500 MHz Alpha processor with 128 MB of memory
OpenVMS       500 MHz Alpha processor with 128 MB of memory
___________________________________________________________

1.2 Enabling and Disabling Kernel Multithreading

On multiple-CPU OpenVMS systems, the logical name AMDS$AM_MULTITHREADING controls whether or not the Availability Manager runs on multiple CPUs (that is, uses kernel multithreading). This logical name is defined in the SYS$MANAGER:AMDS$LOGICALS.COM file.

Setting AMDS$AM_MULTITHREADING to TRUE can improve application performance, but at the cost of application stability. See Section 3.2.3 for an example of one stability problem.

Setting the logical name to FALSE (the default) forces the application to run on a single CPU. For the current set of patches available on OpenVMS, this approach offers the greatest stability.

Enabling and Disabling Commands

To enable kernel multithreading, set the logical to TRUE:

   $ AMDS$DEF AMDS$AM_MULTITHREADING TRUE

To disable kernel multithreading, set the logical to FALSE:

   $ AMDS$DEF AMDS$AM_MULTITHREADING FALSE

1.3 PCSI Installation Messages

If you install DECamds Version 7.2-1A after installing Availability Manager Version 1.3, you might see any of the following PCSI messages:

   %PCSI-I-RETAIN, file [SYS$LDR]SYS$RMDRIVER.EXE was not replaced
      because file from kit does not have higher generation number
   %PCSI-I-RETAIN, file [SYS$LDR]SYS$RMDRIVER.STB was not replaced
      because file from kit does not have higher generation number
   %PCSI-I-RETAIN, file [SYS$STARTUP]AMDS$STARTUP.COM was not replaced
      because file from kit does not have higher generation number
   %PCSI-I-RETAIN, file [SYS$STARTUP]AMDS$STARTUP.TEMPLATE was not replaced
      because file from kit does not have higher generation number
   %PCSI-I-RETAIN, file [SYSEXE]AMDS$RMCP.EXE was not replaced
      because file from kit does not have higher generation number
   %PCSI-I-RETAIN, module AVAIL was not replaced
      because module from kit does not have higher generation number
   %PCSI-I-RETAIN, file [SYSMGR]AMDS$DRIVER_ACCESS.DAT was not replaced
      because file from kit does not have higher generation number
   %PCSI-I-RETAIN, file [SYSMGR]AMDS$DRIVER_ACCESS.TEMPLATE was not replaced
      because file from kit does not have higher generation number
   %PCSI-I-RETAIN, file [SYSMGR]AMDS$LOGICALS.COM was not replaced
      because file from kit does not have higher generation number
   %PCSI-I-RETAIN, file [SYSMGR]AMDS$LOGICALS.TEMPLATE was not replaced
      because file from kit does not have higher generation number

These messages are to be expected because DECamds and the Availability Manager share all the files cited.

1.4 Notes on Installing the Data Analyzer on Windows NT Systems

The following notes pertain to the installation of the Availability Manager Data Analyzer on Windows NT systems.

1.4.1 Windows 2000 Not Supported in Version 1.3

You cannot install or run the Availability Manager Version 1.3 on a Windows 2000 system. We expect to release a version that does run on Windows 2000 in the near future.

1.4.2 Running the Self-Extracting .EXE Multiple Times

The Availability Manager software for Windows NT systems is packaged in a self-extracting executable (.EXE). On Alpha systems, if you run multiple installations of Availability Manager Version 1.3, the .EXE unpacks the installation in the same temporary folder. As a result of a duplicate installation, the system displays a message box entitled Overwrite Protection, which contains a message that "the following file is already installed on your system... Do you wish to overwrite the file?"

You can ignore these messages. Click Yes to All.

1.4.3 Registry Subkey Message

In some situations during an installation, the system displays the message "Registry Service Subkey already exists." You can ignore this message.

1.4.4 Self-Extracting Executable Does Not Exit

In some situations, the self-extracting executable extracts the installation package but does not exit and start the installation. When this occurs, the system displays the "Unpacking" progress bar, and then nothing happens. Windows Task Manager shows the self-extracting executable as an active process, but it appears to be stalled.

To activate the Availability Manager installation, press Ctrl+Alt+Delete, and then choose Cancel. The InstallShield[R] progress bar then appears, and the installation continues normally.

2 Startup and Shutdown Notes

The following notes pertain to starting up and shutting down the Availability Manager.

2.1 Avoid Using Multiple Data Analyzers on the Same System

If the Availability Manager is shut down improperly or abruptly on a Windows NT system, the AM_SESSION.LOCK file might not be deleted, thereby preventing subsequent sessions from starting.
In this situation, when you try to start the Data Analyzer, you will see the following warning:

   Could not establish session lock! Another AM session may be running.

Either one of the following situations might exist:

o Two sessions of the Availability Manager have overlapped, and the later session has detected the lock from the earlier session. Either use the earlier session, or shut down the old session before you start the new session.

o A Data Analyzer session terminated abnormally. If this occurs, follow these steps:

   1. Delete the file AM_SESSION.LOCK in your installation directory.
   2. Try to restart the Data Analyzer.
   3. If the Data Analyzer fails to start, restart your system to clear any possible driver confusion.

2.2 Restarting After an Uninstall Operation on a Windows NT System

To uninstall the Availability Manager from a Windows NT system using Add/Remove Programs on the Windows NT Control Panel, follow these steps:

1. Uninstall the software.
2. Restart the system. This step completes the removal of the network bindings.
3. Optionally, reinstall the software.

If you omit step 2, starting the Availability Manager could cause the system to fail. To recover from this situation, restart the system and then reinstall the Availability Manager (uninstalling the software again is not necessary). Finally, restart your system at the end of the installation. The Availability Manager should run properly.

3 Operation Notes

Availability Manager operation notes fall into the following categories:

o General information
o Known problems

3.1 General Information

The notes in this section contain information about the general operation of the Availability Manager.

3.1.1 Some DECamds Features Not Yet Implemented

The Availability Manager is, in most respects, a Java[R] implementation of the DECamds availability management software product. With each release, more features of DECamds are being added to the Availability Manager.
However, not all features have yet been implemented in the Availability Manager. These features are planned to be added in future releases.

3.1.2 Data Collection and Events on OpenVMS Nodes

Node summary data is the only data that is collected by default. The Availability Manager looks for events only in data that is being collected.

You can collect additional data in either of the following ways:

o Opening any display page that contains node-specific data (for example, CPU, memory, I/O) automatically starts foreground data collection and event analysis. The exceptions are Lock Contention and Cluster Summary information: you must select these tabs individually to start foreground data collection. Collection and evaluation continue as long as a page with node-specific data is displayed. Refer to the nodes chapter in the manual for details.

o Selecting a check mark on the Data Collection page of the Customize OpenVMS... menu enables background collection of that type of data. Data is collected and events are analyzed continuously until you remove the check mark. Refer to the overview and customization chapters in the manual for details.

3.1.3 Limit Your Background Collection of Detailed Data

By default, the only data collected on OpenVMS nodes is node summary data. You can collect this data on many nodes without incurring performance problems.

If you do not have a high-performance workstation, and you have many nodes configured, be careful about enabling more data collection on the customization Data Collection page. This is especially true when you run the Data Analyzer on OpenVMS systems.

A new feature in Version 1.3 might help satisfy your data collection needs: when you open a node-specific data page, all types of data are automatically collected for that node.

3.2 Known Problems

The notes in this section discuss known problems with this version of the Availability Manager.

3.2.1 Windows NT Data Collector Does Not Recognize New Disk Configurations

If you change the logical disk configuration on a running Windows NT node, the Data Collector does not recognize the modified disk configuration and continues to report the previous configuration to the Data Analyzer. For the Data Collector to recognize the new disk configuration, you must stop and restart the Data Collector (PerfServ).

3.2.2 Problem with Daylight Saving Time Changes

For some time zones, especially European ones, the time-zone logic in the Java software libraries that the client uses might disagree with the Windows NT operating system about when the shift to daylight saving time occurs. For a two-week period in early April and late October, you might see a one-hour discrepancy between the time shown in the Availability Manager client and the time of day shown by the system and the Date-Time Control panel.

Also, Sun's Java classes disagree with Windows NT about whether daylight saving time even exists for Asian time zones. The Windows Date-Time Control panel usually indicates that daylight saving time is not possible for these zones; time strings generated from the calendar classes in Java appear to recognize a daylight saving time shift. Therefore, for all time zones between eastern Europe, going east to Alaska, a one-hour discrepancy is likely from April through October. This discrepancy occurs for months at a time.

For OpenVMS systems, make sure that the time zone differential logical name SYS$TIMEZONE_DIFFERENTIAL is defined correctly.

3.2.3 Occasional Application Crash for Data Analyzer on OpenVMS Systems

If you are running the Data Analyzer on a multiprocessor OpenVMS system, you might encounter a "SIGBUS 10" application error. In this application error, your output window displays several hundred lines of low-level thread state. Compaq has seen this only when kernel multithreading was enabled for the process.
Section 1.2 contains instructions for disabling kernel multithreading. Future patch kits for the kernel-threads subsystem on OpenVMS might solve this problem.

Note that disabling kernel multithreading for the Data Analyzer does not disable application-level multithreading within the Java Virtual Machine or affect kernel multithreading for other applications on the OpenVMS system.

3.2.4 Incorrect Values Displayed for Some Fields on NISCA Pages

The following fields show incorrect values on the NISCA data displays:

o Packets Discarded Data Page:
   - Rcv Short Msg
   - Ill Seq Msg
   - Bad Checksum
   - TR DFQ Empty
   - TR MFQ Empty
   - CC MFQ Empty

o Receive Data Page:
   - Illegal Ack

o VC Closures Data Page:
   - SeqMsg TMO
   - CC DFQ Empty

3.2.5 Event Reporting Problems

The following list contains known event reporting problems:

o Unimplemented threshold events:
   LOSTVC
   NOPROC

o Event reporting irregularities:
   - Some posted events may not be canceled promptly when the condition goes away.
   - LOVOTE and LOVLSP events are posted for every node in the cluster rather than once per cluster.

3.2.6 Data Timeout Message on Page/Swap Files Page

When you display page/swap information on the Page/Swap Files page, the message "Data timeout in retrieving file name" might be displayed instead of the page/swap file name. This usually occurs when the Data Analyzer is overloading your system.

3.2.7 Problem with Single Process Display on OpenVMS Nodes

If a process terminates while you are displaying data for it on one of the single process pages, the display does not recognize the process deletion and remains on the screen. You must manually close the single process page.

3.2.8 Spurious LOVLSP Events Generated

The test for the LOVLSP event (low percentage of free blocks) computes the percent free value for volume n as the free blocks on volume n divided by the total blocks on the largest volume seen by the node.
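
The effect of this computation can be illustrated with a short Python sketch. The volume names, block counts, and the 5 percent threshold are hypothetical values chosen for illustration; they are not taken from the Availability Manager, and the functions shown are not part of the product.

```python
def percent_free_as_tested(free_blocks: int, largest_volume_blocks: int) -> float:
    """Percent free as the LOVLSP test computes it: free blocks on the
    volume divided by the total blocks on the largest volume."""
    return 100.0 * free_blocks / largest_volume_blocks


def percent_free_per_volume(free_blocks: int, total_blocks: int) -> float:
    """Percent free relative to the volume's own size."""
    return 100.0 * free_blocks / total_blocks


# Hypothetical volumes on one node: (name, total blocks, free blocks).
volumes = [
    ("BIG", 100_000_000, 60_000_000),
    ("SMALL", 2_000_000, 1_800_000),
]
ASSUMED_THRESHOLD = 5.0  # hypothetical event threshold, in percent free

largest = max(total for _, total, _ in volumes)
results = {}
for name, total, free in volumes:
    tested = percent_free_as_tested(free, largest)
    actual = percent_free_per_volume(free, total)
    posted = tested < ASSUMED_THRESHOLD
    results[name] = (tested, actual, posted)
    print(f"{name}: tested {tested:.1f}% free, "
          f"actual {actual:.1f}% free, event posted: {posted}")
```

In this sketch, the small volume is 90 percent free, but the test sees only 1.8 percent free because its free blocks are divided by the size of the largest volume.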
If a node has a mix of very large and very small disks, this computation can result in spurious LOVLSP events on small disks, regardless of the number of free blocks.

Note that the "% used" bar graphs in the Volume Summary display of the Disk page are correct. This problem affects only the decision to generate or cancel an event.

3.2.9 NISCA Windows Do Not Update Automatically

NISCA windows do not update automatically. If you select another NISCA display tab, the data will be updated. However, automatic updating will not restart. This problem will be resolved in the next release.

3.2.10 Incorrect Identification of Group for a Node

If you are monitoring an OpenVMS node that belongs to a particular group and that node is restarted with a different group association, the Availability Manager does not track the group change completely. Although the Node pane displays the node in the correct group, the Event pane might show events for the node that mention the previous group. This problem will be resolved in the next release.

4 Display Notes

The following notes pertain to the display of data on Availability Manager pages.

4.1 Differences in Length of Image Name Displayed on Single Process Information Page

Previous versions of the OpenVMS Data Collector (RM Driver) collected only the first 71 bytes of the image name for a process. This data is displayed in the Single Process Information page of the Availability Manager and the Single Process Summary Window of DECamds. Viewing either page on a node with a previous version of the Data Collector results in the image name being truncated at 71 bytes.

The Data Collector that ships with Availability Manager Version 1.3 returns all the bytes of the image name for a process that are stored in OpenVMS. Viewing the Single Process Information page with the Availability Manager on a node with Version 1.3 of the Data Collector shows all the bytes returned by the Data Collector. DECamds shows only the first 71 bytes of the image name.
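
The four combinations of Data Collector version and viewer described above can be summarized in a short Python sketch. The function name, its parameters, and the sample image name are hypothetical and serve only to illustrate the rule; they do not reflect how either product is implemented.

```python
IMAGE_NAME_LIMIT = 71  # bytes shown when truncation applies (from the note above)


def displayed_image_name(image_name: bytes,
                         collector_is_v1_3: bool,
                         viewer: str) -> bytes:
    """Return the image name bytes a viewer would show.

    Only the Availability Manager reading from a Version 1.3 Data
    Collector shows the full name; every other combination shows
    the first 71 bytes.
    """
    if collector_is_v1_3 and viewer == "Availability Manager":
        return image_name
    return image_name[:IMAGE_NAME_LIMIT]


# Hypothetical image name that is longer than 71 bytes.
name = (b"DISK$USER:[PROJECTS.AVAILABILITY.TOOLS.BUILD.DEBUG.IMAGES]"
        b"LONG_IMAGE_NAME_EXAMPLE.EXE;1")
assert len(name) > IMAGE_NAME_LIMIT

for collector_is_v1_3 in (False, True):
    for viewer in ("Availability Manager", "DECamds"):
        shown = displayed_image_name(name, collector_is_v1_3, viewer)
        print(f"V1.3 collector: {collector_is_v1_3}, viewer: {viewer}, "
              f"bytes shown: {len(shown)}")
```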

4.2 Hardware Model Sometimes Not Displayed on Node Summary Page

For some long hardware model names, the Node Summary page hides most of the model name. On OpenVMS nodes, you can force the page to reveal the name by clicking the portion of the name that is visible and scrolling right. This problem will be resolved in the next release.

4.3 Problem Displaying Help in Some Browsers on Windows NT

The following problems have been observed when using Version 4.7 of Netscape and some versions of Internet Explorer:

o When you select Help, Windows NT presents a misleading error message, and the Help page is not shown.

o When you select Help, the Help page is eventually displayed, but Windows NT presents a misleading error message anyway.

These problems have not been seen with Netscape Version 4.5; Compaq has not tested other versions of Netscape.

4.4 Incomplete Repainting of Windows

If you obscure part of an Availability Manager window with another window, the obscured portion of the Availability Manager window might not repaint completely when you move the top window. This appears to be a Java Swing problem that is currently under investigation.