HP OpenVMS Systems Documentation

HP OpenVMS Cluster Systems

C.10.4 Logged Message Entries

Logged-message entries are made when the LAN port receives a response that contains either data that the port driver cannot interpret or an error code in the status field of the response.

C.10.5 Error-Log Entry Descriptions

This section describes error-log entries for the CI and LAN ports. Each entry shown is followed by a brief description of what the associated port driver (for example, PADRIVER, PBDRIVER, PEDRIVER) does, and the action a system manager should take. Where you are advised to contact your HP support representative and save crash dumps, it is important to capture the crash dumps as soon as possible after the error. For CI entries, note that path A and path 0 are the same path, and that path B and path 1 are the same path.

Table C-6 lists error-log messages.

**Table C-6 Port Messages for All Devices**
Message	Result	User Action
BIIC FAILURE	The port driver attempts to reinitialize the port; after 50 failed attempts, it marks the device off line.	Contact your HP support representative.
11/750 CPU MICROCODE NOT ADEQUATE FOR PORT	The port driver sets the port off line with no retries attempted. In addition, if this port is needed because the computer is booted from an HSC subsystem or is participating in a cluster, the computer bugchecks with a UCODEREV code bugcheck.	Read the appropriate section in the current OpenVMS Cluster Software SPD for information on required computer microcode revisions. Contact your HP support representative, if necessary.
PORT MICROCODE REV NOT CURRENT, BUT SUPPORTED	The port driver detected that the microcode is not at the current level, but the port driver will continue normally. This error is logged as a warning only.	Contact your HP support representative when it is convenient to have the microcode updated.
DATAGRAM FREE QUEUE INSERT FAILURE	The port driver attempts to reinitialize the port; after 50 failed attempts, it marks the device off line.	Contact your HP support representative. This error is caused by a failure to obtain access to an interlocked queue. Possible sources of the problem are CI hardware failures, or memory, SBI (11/780), CMI (11/750), or BI (8200, 8300, and 8800) contention.
DATAGRAM FREE QUEUE REMOVE FAILURE	The port driver attempts to reinitialize the port; after 50 failed attempts, it marks the device off line.	Contact your HP support representative. This error is caused by a failure to obtain access to an interlocked queue. Possible sources of the problem are CI hardware failures, or memory, SBI (11/780), CMI (11/750), or BI (8200, 8300, and 8800) contention.
FAILED TO LOCATE PORT MICROCODE IMAGE	The port driver marks the device off line and makes no retries.	Make sure the console volume contains the microcode file CI780.BIN (for the CI780, CI750, or CIBCI) or the microcode file CIBCA.BIN (for the CIBCA-AA). Then reboot the computer.
HIGH PRIORITY COMMAND QUEUE INSERT FAILURE	The port driver attempts to reinitialize the port; after 50 failed attempts, it marks the device off line.	Contact your HP support representative. This error is caused by a failure to obtain access to an interlocked queue. Possible sources of the problem are CI hardware failures, or memory, SBI (11/780), CMI (11/750), or BI (8200, 8300, and 8800) contention.
MSCP ERROR LOGGING DATAGRAM RECEIVED	On receipt of an error message from the HSC subsystem, the port driver logs the error and takes no other action. You should disable the sending of HSC informational error-log datagrams with the appropriate HSC console command because such datagrams take considerable space in the error-log data file.	Error-log datagrams are useful to read only if they are not captured on the HSC console for some reason (for example, if the HSC console ran out of paper). This logged information duplicates messages logged on the HSC console.
INAPPROPRIATE SCA CONTROL MESSAGE	The port driver closes the port-to-port virtual circuit to the remote port.	Contact your HP support representative. Save the error logs and the crash dumps from the local and remote computers.
INSUFFICIENT NON-PAGED POOL FOR INITIALIZATION	The port driver marks the device off line and makes no retries.	Reboot the computer with a larger value for NPAGEDYN or NPAGEVIR.
LOW PRIORITY CMD QUEUE INSERT FAILURE	The port driver attempts to reinitialize the port; after 50 failed attempts, it marks the device off line.	Contact your HP support representative. This error is caused by a failure to obtain access to an interlocked queue. Possible sources of the problem are CI hardware failures, or memory, SBI (11/780), CMI (11/750), or BI (8200, 8300, and 8800) contention.
MESSAGE FREE QUEUE INSERT FAILURE	The port driver attempts to reinitialize the port; after 50 failed attempts, it marks the device off line.	Contact your HP support representative. This error is caused by a failure to obtain access to an interlocked queue. Possible sources of the problem are CI hardware failures, or memory, SBI (11/780), CMI (11/750), or BI (8200, 8300, and 8800) contention.
MESSAGE FREE QUEUE REMOVE FAILURE	The port driver attempts to reinitialize the port; after 50 failed attempts, it marks the device off line.	Contact your HP support representative. This error is caused by a failure to obtain access to an interlocked queue. Possible sources of the problem are CI hardware failures, or memory, SBI (11/780), CMI (11/750), or BI (8200, 8300, and 8800) contention.
MICRO-CODE VERIFICATION ERROR	The port driver detected an error while reading the microcode that it just loaded into the port. The driver attempts to reinitialize the port; after 50 failed attempts, it marks the device off line.	Contact your HP support representative.
NO PATH-BLOCK DURING VIRTUAL CIRCUIT CLOSE	The port driver attempts to reinitialize the port; after 50 failed attempts, it marks the device off line.	Contact your HP support representative. Save the error log and a crash dump from the local computer.
NO TRANSITION FROM UNINITIALIZED TO DISABLED	The port driver attempts to reinitialize the port; after 50 failed attempts, it marks the device off line.	Contact your HP support representative.
PORT ERROR BIT(S) SET	The port driver attempts to reinitialize the port; after 50 failed attempts, it marks the device off line.	A maintenance timer expiration bit may mean that the PASTIMOUT system parameter is set too low and should be increased, especially if the local computer is running privileged user-written software. For all other bits, call your HP support representative.
PORT HAS CLOSED VIRTUAL CIRCUIT	The port driver closed the virtual circuit that the local port opened to the remote port.	Check the PPD$B_STATUS field of the error-log entry for the reason the virtual circuit was closed. This error is normal if the remote computer failed or was shut down. For PEDRIVER, ignore the PPD$B_OPC field value; it is an unknown opcode. If PEDRIVER logs a large number of these errors, there may be a problem either with the LAN or with a remote system, or nonpaged pool may be insufficient on the local system.
PORT POWER DOWN	The port driver halts port operations and then waits for power to return to the port hardware.	Restore power to the port hardware.
PORT POWER UP	The port driver reinitializes the port and restarts port operations.	No action needed.
RECEIVED CONNECT WITHOUT PATH-BLOCK	The port driver attempts to reinitialize the port; after 50 failed attempts, it marks the device off line.	Contact your HP support representative. Save the error log and a crash dump from the local computer.
REMOTE SYSTEM CONFLICTS WITH KNOWN SYSTEM	The configuration poller discovered a remote computer with SCSSYSTEMID and/or SCSNODE equal to that of another computer to which a virtual circuit is already open.	Shut down the new computer as soon as possible. Reboot it with a unique SCSSYSTEMID and SCSNODE. Do not leave the new computer up any longer than necessary. If you are running a cluster, and two computers with conflicting identity are polling when any other virtual circuit failure takes place in the cluster, then computers in the cluster may shut down with a CLUEXIT bugcheck.
RESPONSE QUEUE REMOVE FAILURE	The port driver attempts to reinitialize the port; after 50 failed attempts, it marks the device off line.	Contact your HP support representative. This error is caused by a failure to obtain access to an interlocked queue. Possible sources of the problem are CI hardware failures, or memory, SBI (11/780), CMI (11/750), or BI (8200, 8300, and 8800) contention.
SCSSYSTEMID MUST BE SET TO NON-ZERO VALUE	The port driver sets the port off line without attempting any retries.	Reboot the computer with a conversational boot and set the SCSSYSTEMID to the correct value. At the same time, check that SCSNODE has been set to the correct nonblank value.
SOFTWARE IS CLOSING VIRTUAL CIRCUIT	The port driver closes the virtual circuit to the remote port.	Check error-log entries for the cause of the virtual circuit closure. Faulty transmission or reception on both paths, for example, causes this error and may be detected from the one or two previous error-log entries noting bad paths to this remote computer.
SOFTWARE SHUTTING DOWN PORT	The port driver attempts to reinitialize the port; after 50 failed attempts, it marks the device off line.	Check other error-log entries for the possible cause of the port reinitialization failure.
UNEXPECTED INTERRUPT	The port driver attempts to reinitialize the port; after 50 failed attempts, it marks the device off line.	Contact your HP support representative.
UNRECOGNIZED SCA PACKET	The port driver closes the virtual circuit to the remote port. If the virtual circuit is already closed, the port driver inhibits datagram reception from the remote port.	Contact your HP support representative. Save the error-log file that contains this entry and the crash dumps from both the local and remote computers.
VIRTUAL CIRCUIT TIMEOUT	The port driver closes the virtual circuit that the local CI port opened to the remote port. This closure occurs if the remote computer is running CI microcode Version 7 or later, and if the remote computer has failed to respond to any messages sent by the local computer.	This error is normal if the remote computer has halted, failed, or was shut down. This error may mean that the local computer's TIMVCFAIL system parameter is set too low, especially if the remote computer is running privileged user-written software.
INSUFFICIENT NON-PAGED POOL FOR VIRTUAL CIRCUITS	The port driver closes virtual circuits because of insufficient pool.	Enter the DCL command SHOW MEMORY to determine pool requirements, and then adjust the appropriate system parameters.
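
The Result column of Table C-6 falls into a few recurring patterns, which can be convenient to capture in a lookup when triaging error logs. The following Python sketch is illustrative only: it covers a subset of the messages, and the category labels and the function name driver_result are hypothetical, not OpenVMS identifiers.

```python
# Illustrative triage lookup built from a subset of Table C-6.
# The category labels below are hypothetical, not OpenVMS names.
REINIT_THEN_OFFLINE = "reinitialize port; off line after 50 failed attempts"
OFFLINE_NO_RETRY = "port set off line, no retries"
CLOSE_VC = "virtual circuit closed"

DRIVER_RESULT = {
    "BIIC FAILURE": REINIT_THEN_OFFLINE,
    "UNEXPECTED INTERRUPT": REINIT_THEN_OFFLINE,
    "PORT ERROR BIT(S) SET": REINIT_THEN_OFFLINE,
    "FAILED TO LOCATE PORT MICROCODE IMAGE": OFFLINE_NO_RETRY,
    "INSUFFICIENT NON-PAGED POOL FOR INITIALIZATION": OFFLINE_NO_RETRY,
    "SCSSYSTEMID MUST BE SET TO NON-ZERO VALUE": OFFLINE_NO_RETRY,
    "INAPPROPRIATE SCA CONTROL MESSAGE": CLOSE_VC,
    "UNRECOGNIZED SCA PACKET": CLOSE_VC,
    "VIRTUAL CIRCUIT TIMEOUT": CLOSE_VC,
}


def driver_result(message: str) -> str:
    """Return the driver-action category for an error-log message."""
    # Normalize case and collapse runs of whitespace before the lookup.
    key = " ".join(message.upper().split())
    return DRIVER_RESULT.get(key, "not in this subset; see Table C-6")
```

For example, driver_result("biic failure") returns the reinitialize category, while a message outside the subset returns the fallback string.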

The descriptions in Table C-7 apply only to LAN devices.

**Table C-7 Port Messages for LAN Devices**
Message	Completion Status	Explanation	User Action
FATAL ERROR DETECTED BY DATALINK	First longword SS$_NORMAL (00000001), second longword (00001201)	The LAN driver stopped the local area OpenVMS Cluster protocol on the device. This completion status is returned when the SYS$LAVC_STOP_BUS routine completes successfully. The SYS$LAVC_STOP_BUS routine is called either from within the LAVC$STOP_BUS.MAR program found in SYS$EXAMPLES or from a user-written program. The local area OpenVMS Cluster protocol remains stopped on the specified device until the SYS$LAVC_START_BUS routine executes successfully. The SYS$LAVC_START_BUS routine is called from within the LAVC$START_BUS.MAR program found in SYS$EXAMPLES or from a user-written program.	If the protocol on the device was stopped inadvertently, then restart the protocol by assembling and executing the LAVC$START_BUS program found in SYS$EXAMPLES. Reference: See Appendix D for an explanation of the local area OpenVMS Cluster sample programs. Otherwise, this error message can be safely ignored.
	First longword is any value other than (00000001), second longword (00001201)	The LAN driver has shut down the device because of a fatal error and is returning all outstanding transmits with SS$_OPINCOMPL. The LAN device is restarted automatically.	Infrequent occurrences of this error are typically not a problem. If the error occurs frequently or is accompanied by loss or reestablishment of connections to remote computers, there may be a hardware problem. Check for the proper LAN adapter revision level or contact your HP support representative.
	First longword (undefined), second longword (00001200)	The LAN driver has restarted the device successfully after a fatal error. This error-log message is usually preceded by a FATAL ERROR DETECTED BY DATALINK error-log message whose first completion status longword is anything other than 00000001 and whose second completion status longword is 00001201.	No action needed.
TRANSMIT ERROR FROM DATALINK	SS$_OPINCOMPL (000002D4)	The LAN driver is in the process of restarting the data link because an error forced the driver to shut down the controller and all users (see FATAL ERROR DETECTED BY DATALINK).
	SS$_DEVREQERR (00000334)	The LAN controller tried to transmit the packet 16 times and failed because of defers and collisions. This condition indicates that LAN traffic is heavy.
	SS$_DISCONNECT (0000204C)	There was a loss of carrier during or after the transmit. This includes transmit attempts when the link is down.	The port emulator automatically recovers from any of these errors, but many such errors indicate either that the LAN controller is faulty or that the LAN is overloaded. If you suspect either of these conditions, contact your HP support representative.
INVALID CLUSTER PASSWORD RECEIVED		A computer is trying to join the cluster using the correct cluster group number for this cluster but an invalid password. The port emulator discards the message. The probable cause is that another cluster on the LAN is using the same cluster group number.	Provide all clusters on the same LAN with unique cluster group numbers.
NISCS PROTOCOL VERSION MISMATCH RECEIVED		A computer is trying to join the cluster using a version of the cluster LAN protocol that is incompatible with the one in use on this cluster.	Install a version of the operating system that uses a compatible protocol, or change the cluster group number so that the computer joins a different cluster.
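
The completion-status values in Table C-7 can be decoded mechanically. The following Python sketch shows one way to do so, assuming the two longwords have already been extracted from the error-log entry as integers. The function and variable names are hypothetical; only the hexadecimal values and their meanings come from the table.

```python
# Decoder for the completion-status values listed in Table C-7.
# Function and variable names are hypothetical; values are from the table.
SS_NORMAL = 0x00000001


def describe_fatal_datalink(first: int, second: int) -> str:
    """Interpret the two longwords of FATAL ERROR DETECTED BY DATALINK."""
    if second == 0x00001200:
        # The first longword is undefined in this case, so it is ignored.
        return "device restarted successfully after a fatal error"
    if second == 0x00001201:
        if first == SS_NORMAL:
            return "cluster protocol stopped by SYS$LAVC_STOP_BUS"
        return "device shut down after a fatal error; restarts automatically"
    return "not described in Table C-7"


TRANSMIT_STATUS = {
    0x000002D4: ("SS$_OPINCOMPL", "data link is restarting after a fatal error"),
    0x00000334: ("SS$_DEVREQERR", "16 transmit attempts failed (defers and collisions)"),
    0x0000204C: ("SS$_DISCONNECT", "loss of carrier during or after the transmit"),
}


def describe_transmit_error(status: int) -> str:
    """Interpret the status of TRANSMIT ERROR FROM DATALINK."""
    name, meaning = TRANSMIT_STATUS.get(
        status, ("unknown", "not described in Table C-7")
    )
    return f"{name}: {meaning}"
```

Checking the second longword first mirrors the table: 00001200 always means a successful restart, and the first longword matters only when the second is 00001201.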

C.11 OPA0 Error-Message Logging and Broadcasting

Port drivers detect certain error conditions and attempt to log them. The port driver attempts both OPA0 error broadcasting and standard error logging under any of the following circumstances:

The system disk has not yet been mounted.
The system disk is undergoing mount verification.
During mount verification, the system disk drive contains the wrong volume.
Mount verification for the system disk has timed out.
The local computer is participating in a cluster, and quorum has been lost.

Note the implicit assumption that the system and error-logging devices are one and the same.
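
The circumstances listed above amount to a simple predicate over the state of the system disk and the cluster. The following Python sketch restates them. It is a model for illustration only; the SystemState fields and the function name are hypothetical and do not correspond to OpenVMS data structures.

```python
from dataclasses import dataclass


@dataclass
class SystemState:
    """Hypothetical snapshot of the conditions listed in this section."""
    system_disk_mounted: bool = True
    mount_verification_in_progress: bool = False
    wrong_volume_in_drive: bool = False
    mount_verification_timed_out: bool = False
    in_cluster: bool = False
    quorum_lost: bool = False


def attempts_both_methods(state: SystemState) -> bool:
    """True if any listed circumstance holds, so the driver attempts
    OPA0 broadcasting in addition to standard error logging."""
    return (
        not state.system_disk_mounted
        or state.mount_verification_in_progress
        or state.wrong_volume_in_drive
        or state.mount_verification_timed_out
        or (state.in_cluster and state.quorum_lost)
    )
```

Each condition describes a state in which the system disk, and therefore the error-logging device, may be unreachable.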

The following table describes error-logging methods and their reliability.

Method	Reliability	Comments
Standard error logging to an error-logging device.	Under some circumstances, attempts to log errors to the error-logging device can fail. Such failures can occur because the error-logging device is not accessible when attempts are made to log the error condition.	Because of the central role that the port device plays in clusters, the loss of error-logged information in such cases makes it difficult to diagnose and fix problems.
Broadcasting selected information about the error condition to OPA0. (This is in addition to the port driver's attempt to log the error condition to the error-logging device.)	This method of reporting errors is not entirely reliable, because some error conditions may not be reported due to the way OPA0 error broadcasting is performed. This situation occurs whenever a second error condition is detected before the port driver has been able to broadcast the first error condition to OPA0. In such a case, only the first error condition is reported to OPA0, because that condition is deemed to be the more important one.	This second, redundant method of error logging captures at least some of the information about port-device error conditions that would otherwise be lost.

Note: Certain error conditions are always broadcast to OPA0, regardless of whether the error-logging device is accessible. In general, these are errors that cause the port to shut down either permanently or temporarily.
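
The reliability limitation of OPA0 broadcasting described in the table above behaves like a one-entry buffer: a second error condition detected before the first has been broadcast is not reported. The following Python sketch models that behavior. The actual port driver mechanism is not described in this section, so the class and method names are hypothetical.

```python
from typing import List, Optional


class Opa0Broadcaster:
    """Toy model of a one-entry OPA0 broadcast buffer (hypothetical)."""

    def __init__(self) -> None:
        self._pending: Optional[str] = None
        self.unreported: List[str] = []

    def report(self, condition: str) -> bool:
        """Queue a condition; return False if it will not reach OPA0."""
        if self._pending is not None:
            # An earlier condition is still waiting, so it takes priority
            # and this one is never broadcast.
            self.unreported.append(condition)
            return False
        self._pending = condition
        return True

    def broadcast(self) -> Optional[str]:
        """Emit the pending condition, if any, and free the buffer."""
        message, self._pending = self._pending, None
        return message
```

In this model, two calls to report before a call to broadcast leave the second condition in unreported, which corresponds to information that only the standard error log could have captured.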

C.11.1 OPA0 Error Messages

One OPA0 error message for each error condition is always logged. The text of each error message is similar to the text in the summary displayed by formatting the corresponding standard error-log entry using the Error Log utility. (See Section C.10.5 for a list of Error Log utility summary messages and their explanations.)

Table C-8 lists the OPA0 error messages. The table is divided into units by error type. Many of the OPA0 error messages contain some optional information, such as the remote port number, CI packet information (flags, port operation code, response status, and port number fields), or specific CI port registers. The second column specifies whether the message is always logged on OPA0 (Logged) or is logged only when the system device is inaccessible (Inaccessible).

**Table C-8 OPA0 Messages**
Error Message	Logged or Inaccessible
Software Errors During Initialization
%PEA0, Configuration data for IP cluster not found	Logged
%Pxxn, Insufficient Non-Paged Pool for Initialization	Logged
%Pxxn, Failed to Locate Port Micro-code Image	Logged
%Pxxn, SCSSYSTEMID has NOT been set to a Non-Zero Value	Logged
Hardware Errors
%Pxxn, BIIC failure---BICSR/BER/CNF xxxxxx/xxxxxx/xxxxxx	Logged
%Pxxn, Micro-code Verification Error	Logged
%Pxxn, Port Transition Failure---CNF/PMC/PSR xxxxxx/xxxxxx/xxxxxx	Logged
%Pxxn, Port Error Bit(s) Set---CNF/PMC/PSR xxxxxx/xxxxxx/xxxxxx	Logged
%Pxxn, Port Power Down	Logged
%Pxxn, Port Power Up	Logged
%Pxxn, Unexpected Interrupt---CNF/PMC/PSR xxxxxx/xxxxxx/xxxxxx	Logged
Queue Interlock Failures
%Pxxn, Message Free Queue Remove Failure	Logged
%Pxxn, Datagram Free Queue Remove Failure	Logged
%Pxxn, Response Queue Remove Failure	Logged
%Pxxn, High Priority Command Queue Insert Failure	Logged
%Pxxn, Low Priority Command Queue Insert Failure	Logged
%Pxxn, Message Free Queue Insert Failure	Logged
%Pxxn, Datagram Free Queue Insert Failure	Logged
Cable Change-of-State Notification
%Pxxn, Path #0. Has gone from GOOD to BAD---REMOTE PORT ¹ xxx	Inaccessible
%Pxxn, Path #1. Has gone from GOOD to BAD---REMOTE PORT ¹ xxx	Inaccessible
%Pxxn, Path #0. Has gone from BAD to GOOD---REMOTE PORT ¹ xxx	Inaccessible
%Pxxn, Path #1. Has gone from BAD to GOOD---REMOTE PORT ¹ xxx	Inaccessible
%Pxxn, Cables have gone from UNCROSSED to CROSSED---REMOTE PORT ¹ xxx	Inaccessible
%Pxxn, Cables have gone from CROSSED to UNCROSSED---REMOTE PORT ¹ xxx	Inaccessible
%Pxxn, Path #0. Loopback has gone from GOOD to BAD---REMOTE PORT ¹ xxx	Logged
%Pxxn, Path #1. Loopback has gone from GOOD to BAD---REMOTE PORT ¹ xxx	Logged
%Pxxn, Path #0. Loopback has gone from BAD to GOOD---REMOTE PORT ¹ xxx	Logged
%Pxxn, Path #1. Loopback has gone from BAD to GOOD---REMOTE PORT ¹ xxx	Logged
%Pxxn, Path #0. Has become working but CROSSED to Path #1.--- REMOTE PORT ¹ xxx	Inaccessible
%Pxxn, Path #1. Has become working but CROSSED to Path #0.--- REMOTE PORT ¹ xxx	Inaccessible

¹If the port driver can identify the remote SCS node name of the affected computer, the driver replaces the "REMOTE PORT xxx" text with "REMOTE SYSTEM X...", where X... is the value of the system parameter SCSNODE on the remote computer. If the remote SCS node name is not available, the port driver uses the existing message format.

Key to CI Port Registers:

CNF: configuration register
PMC: port maintenance and control register
PSR: port status register

See also the CI hardware documentation for a detailed description of the CI port registers.
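
The substitution described in the footnote to Table C-8 can be expressed as a small function. The following Python sketch assumes the remote port number and, when known, the remote SCSNODE value are available as ordinary values. The function name is hypothetical, and the width of the xxx port field is not specified in this section, so the number is written without padding.

```python
from typing import Optional


def remote_identity(port_number: int, scsnode: Optional[str] = None) -> str:
    """Build the trailing identification text of a cable-state message.

    Uses the remote SCS node name when it is known; otherwise falls
    back to the remote port number.
    """
    if scsnode:
        return f"REMOTE SYSTEM {scsnode}"
    return f"REMOTE PORT {port_number}"
```

An empty node name is treated the same as an unknown one, so the port-number form is used in both cases.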
