HP OpenVMS Cluster Systems
F.3.4 Monitoring PEDRIVER for LAN Devices
The SDA command PE LAN_DEVICE is useful for displaying PEDRIVER LAN
device data. Each LAN device shown is a local LAN device on the system
that is being used for NISCA communications.
In the following example, PE LAN_DEVICE displays the LAN device summary
of I64MOZ.
Example F-3 SDA Command PE LAN_DEVICE
SDA> PE LAN_DEVICE
PE$SDA Extension on I64MOZ (HP rx4640 (1.50GHz/6.0MB)) at 21-NOV-2008 15:43:12.53
----------------------------------------------------------------------------------
I64MOZ Device Summary 21-NOV-2008 15:43:12.53:
        Device  Line   Buffer  MgtBuf   Load   Mgt       Current            Total     Errors &
Device  Type    Speed  Size    SizeCap  Class  Priority  LAN Address        Bytes     Events    Status
------  ------  -----  ------  -------  -----  --------  -----------------  --------  --------  ------
LCL                 0    1426        0      0         0  00-00-00-00-00-00  31126556         0  Run Online Local Restart
EIA               100    1426        0   1000         0  00-30-6E-5D-97-AE   5086238         2  Run Online Restart
EIB              1000    1426        0   1000         0  00-30-6E-5D-97-AF         0    229120  Run Online Restart
|
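When the device summary has been captured to a file, the rows can be screened by a script. The following Python sketch is one hypothetical way to do that. It assumes the row layout of Example F-3 with an empty Device Type column, and it locates the other fields relative to the LAN address. The function name, the field names, and the screening rule (more errors and events than bytes) are illustrative assumptions, not part of SDA or a documented threshold.

```python
import re

# A LAN address looks like 00-30-6E-5D-97-AE (six hexadecimal pairs).
LAN_ADDR = re.compile(r"^[0-9A-F]{2}(-[0-9A-F]{2}){5}$")


def parse_device_row(line):
    """Split one PE LAN_DEVICE summary row into named fields.

    Assumes the Example F-3 layout with a blank Device Type column.
    """
    tokens = line.split()
    # Find the LAN address; the numeric fields sit on either side of it.
    addr = next(i for i, tok in enumerate(tokens) if LAN_ADDR.match(tok))
    speed, buf_size, size_cap, load_class, priority = map(int, tokens[addr - 5:addr])
    return {
        "device": tokens[0],
        "line_speed": speed,
        "buffer_size": buf_size,
        "mgt_buf_size_cap": size_cap,
        "load_class": load_class,
        "mgt_priority": priority,
        "lan_address": tokens[addr],
        "total_bytes": int(tokens[addr + 1]),
        "errors_events": int(tokens[addr + 2]),
        "status": tokens[addr + 3:],
    }


rows = [
    "LCL 0 1426 0 0 0 00-00-00-00-00-00 31126556 0 Run Online Local Restart",
    "EIA 100 1426 0 1000 0 00-30-6E-5D-97-AE 5086238 2 Run Online Restart",
    "EIB 1000 1426 0 1000 0 00-30-6E-5D-97-AF 0 229120 Run Online Restart",
]
devices = [parse_device_row(row) for row in rows]

# Illustrative screening rule only: more errors/events than bytes carried.
suspects = [d["device"] for d in devices if d["errors_events"] > d["total_bytes"]]
print(suspects)
```

With the rows from Example F-3, only EIB is flagged, because it reports errors and events but no bytes transferred.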
F.3.5 Monitoring PEDRIVER Buses for LAN Devices
The SDA command SHOW PORT/BUS=BUS_LAN-device is useful
for displaying the PEDRIVER representation of a LAN adapter. To
PEDRIVER, a bus is the logical representation of the LAN adapter. (To
list the names and addresses of buses, enter the SDA command SHOW
PORT/ADDR=PE_PDT and then press the Return key twice.) Example F-4
shows a display for the LAN adapter named EXA.
Example F-4 SDA Command SHOW PORT/BUS Display
SDA> SHOW PORT/BUS=BUS_EXA
VAXcluster data structures
--------------------------
--- BUS: 817E02C0 (EXA) Device: EX_DEMNA LAN Address: AA-00-04-00-64-4F ---
LAN Hardware Address: 08-00-2B-2C-20-B5
Status: 00000803 run,online(1),restart
------- Transmit ------ ------- Receive ------- ---- Structure Addresses ---
Msg Xmt 20290620 Msg Rcv 67321527 PORT Address 817E1140
Mcast Msgs 1318437 Mcast Msgs 39773666 VCIB Addr 817E0478
Mcast Bytes 168759936 Mcast Bytes 159660184 HELLO Message Addr 817E0508
Bytes Xmt 2821823510 Bytes Rcv 3313602089 BYE Message Addr 817E0698
Outstand I/Os 0 Buffer Size 1424 Delete BUS Rtn Adr 80C6DA46
Xmt Errors(2) 15896 Rcv Ring Size 31
Last Xmt Error 0000005C Time of Last Xmt Error(3) 21-JAN-1994 15:33:38.96
--- Receive Errors ---- ------ BUS Timer ------ ----- Datalink Events ------
TR Mcast Rcv 0 Handshake TMO 80C6F070 Last 7-DEC-1992 17:15:42.18
Rcv Bad SCSID 0 Listen TMO 80C6F074 Last Event 00001202
Rcv Short Msg 0 HELLO timer 3 Port Usable 1
Fail CH Alloc 0 HELLO Xmt err(4) 1623 Port Unusable 0
Fail VC Alloc 0 Address Change 1
Wrong PORT 0 Port Restart Fail 0
|
Field |
Description |
(1) Status:
|
The Status line should always display a status of "online" to
indicate that PEDRIVER can access its LAN adapter.
|
(2) Xmt Errors (transmission errors)
|
Indicates the number of times PEDRIVER has been unable to transmit a
packet using this LAN adapter.
|
(3) Time of Last Xmt Error
|
You can compare the time shown in this field with the Open and Cls
times shown in the VC display in Example F-2 to determine whether the
time of the LAN adapter failure is close to the time of a virtual
circuit failure.
Note: Transmission errors at the LAN adapter bus level
cause a virtual circuit breakage.
|
(4) HELLO Xmt err (HELLO transmission error)
|
Indicates how many times a message transmission failure has
"dropped" a PEDRIVER HELLO datagram message. (The Channel
Control [CC] level description in Section F.1 briefly describes the
purpose of HELLO datagram messages.) If many HELLO transmission errors
occur, PEDRIVER on other nodes probably is timing out a channel, which
could eventually result in closure of the virtual circuit.
The 1623 HELLO transmission failures shown in Example F-4
contributed to the high number of transmission errors (15896). Because
every HELLO transmission failure is also counted as a transmission
error, a high number of HELLO transmission errors cannot occur together
with a low number of transmission errors.
|
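The relationship between the counters in callouts (2) and (4) can be written as a simple consistency check. The following Python sketch uses the values from Example F-4. The function name is hypothetical, and the check assumes, as the description above implies, that every dropped HELLO datagram is also counted in Xmt Errors.

```python
def hello_error_share(xmt_errors, hello_xmt_errors):
    """Return the fraction of bus transmit errors that were dropped HELLOs."""
    # Assumption: HELLO Xmt err is a subset of Xmt Errors, so it can never
    # be the larger of the two counters.
    if hello_xmt_errors > xmt_errors:
        raise ValueError("HELLO Xmt err exceeds Xmt Errors; counters are inconsistent")
    if xmt_errors == 0:
        return 0.0
    return hello_xmt_errors / xmt_errors


# Values from Example F-4: Xmt Errors 15896, HELLO Xmt err 1623.
share = hello_error_share(15896, 1623)
print(f"{share:.1%} of transmit errors were dropped HELLO datagrams")
```

In this example roughly one transmit error in ten was a dropped HELLO datagram; the remaining errors affected other traffic on the bus.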
F.3.6 Monitoring LAN Adapters
Use the SDA command SHOW LAN/COUNTERS to display information about the
LAN adapters as maintained by the LAN device driver. The command shows
counters for all protocols, not just the PEDRIVER (SCA) related counters.
Example F-5 shows a sample display from the SHOW LAN/COUNTERS command.
Example F-5 SDA Command SHOW LAN/COUNTERS Display
$ ANALYZE/SYSTEM
SDA> SHOW LAN/COUNTERS
LAN Data Structures
-------------------
-- EXA Counters Information 22-JAN-1994 11:21:19 --
Seconds since zeroed 3953329 Station failures 0
Octets received 13962888501 Octets sent 11978817384
PDUs received 121899287 PDUs sent 76872280
Mcast octets received 7494809802 Mcast octets sent 183142023
Mcast PDUs received 58046934 Mcast PDUs sent 1658028
Unrec indiv dest PDUs 0 PDUs sent, deferred 4608431
Unrec mcast dest PDUs 0 PDUs sent, one coll 3099649
Data overruns 2 PDUs sent, mul coll 2439257
Unavail station buffs(1) 0 Excessive collisions(2) 5059
Unavail user buffers 0 Carrier check failure 0
Frame check errors 483 Short circuit failure 0
Alignment errors 10215 Open circuit failure 0
Frames too long 142 Transmits too long 0
Rcv data length error 0 Late collisions 14931
802E PDUs received 28546 Coll detect chk fail 0
802 PDUs received 0 Send data length err 0
Eth PDUs received 122691742 Frame size errors 0
LAN Data Structures
-------------------
-- EXA Internal Counters Information 22-JAN-1994 11:22:28 --
Internal counters address 80C58257 Internal counters size 24
Number of ports 0 Global page transmits 0
No work transmits 3303771 SVAPTE/BOFF transmits 0
Bad PTE transmits 0 Buffer_Adr transmits 0
Fatal error count 0 RDL errors 0
Transmit timeouts 0 Last fatal error None
Restart failures 0 Prev fatal error None
Power failures 0 Last error CSR 00000000
Hardware errors 0 Fatal error code None
Control timeouts 0 Prev fatal error None
Loopback sent 0 Loopback failures 0
System ID sent 0 System ID failures 0
ReqCounters sent 0 ReqCounters failures 0
-- EXA1 60-07 (SCA) Counters Information 22-JAN-1994 11:22:31 --
Last receive(3) 22-JAN 11:22:31 Last transmit(3) 22-JAN 11:22:31
Octets received 7616615830 Octets sent 2828248622
PDUs received 67375315 PDUs sent 20331888
Mcast octets received 0 Mcast octets sent 0
Mcast PDUs received 0 Mcast PDUs sent 0
Unavail user buffer 0 Last start attempt None
Last start done 7-DEC 17:12:29 Last start failed None
.
.
.
|
The SHOW LAN/COUNTERS display usually includes device counter
information about several LAN adapters. However, for purposes of
example, only one device is shown in Example F-5.
Field |
Description |
(1) Unavail station buffs (unavailable station buffers)
|
Records the number of times that fixed station buffers in the LAN
driver were unavailable for incoming packets. The node receiving a
message can lose packets when the node does not have enough LAN station
buffers. (LAN buffers are used by a number of consumers other than
PEDRIVER, such as DECnet, TCP/IP, and LAT.) Packet loss because of
insufficient LAN station buffers is a symptom of either LAN adapter
congestion or the system's inability to reuse the existing buffers fast
enough.
|
(2) Excessive collisions
|
Indicates the number of unsuccessful attempts to transmit messages on
the adapter. This problem is often caused by:
- A LAN loading problem resulting from heavy traffic (70% to 80%
utilization) on the specific LAN segment.
- A component called a screamer. A screamer is an adapter that does not
adhere to Ethernet or FDDI hardware protocols. It does not wait for
permission to transmit packets, thereby causing collision errors to
register in this field.
If a significant number of transmissions with multiple collisions
have occurred, then OpenVMS Cluster performance is degraded. You might
be able to improve performance either by removing some nodes from the
LAN segment or by adding another LAN segment to the cluster. The
overall goal is to reduce traffic on the existing LAN segment, thereby
making more bandwidth available to the OpenVMS Cluster system.
|
(3) Last receive and Last transmit
|
The difference in the times shown in the Last receive and Last transmit
message fields should not be large. At a minimum, the timestamps in these
fields should reflect that HELLO datagram messages are being sent
across channels every 3 seconds. Large time differences might indicate:
- A hardware failure
- That the LAN driver does not see the NISCA protocol as being active on
a specific LAN adapter
|
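To judge whether the collision counters described under callout (2) are significant, it can help to express them relative to the number of PDUs sent. The following Python sketch does this for the EXA counters in Example F-5. The dictionary keys mirror the display labels, the function name is hypothetical, and no threshold is implied; what counts as significant depends on the site.

```python
# Counter values copied from the EXA display in Example F-5.
counters = {
    "PDUs sent": 76872280,
    "PDUs sent, deferred": 4608431,
    "PDUs sent, one coll": 3099649,
    "PDUs sent, mul coll": 2439257,
    "Excessive collisions": 5059,
    "Late collisions": 14931,
}


def collision_profile(c):
    """Express each transmit-side counter as a fraction of PDUs sent."""
    sent = c["PDUs sent"]
    if sent == 0:
        return {}
    return {name: value / sent for name, value in c.items() if name != "PDUs sent"}


profile = collision_profile(counters)
for name, fraction in profile.items():
    print(f"{name:<22} {fraction:8.4%}")
```

For this adapter, about 3% of the PDUs sent experienced multiple collisions and about 4% experienced a single collision, while excessive collisions remained a very small fraction of the total.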
F.3.7 Monitoring PEDRIVER Buses for IP Interfaces
The SDA command SHOW PORT/BUS=BUS_IP_interface is useful for
displaying the PEDRIVER representation of an IP interface. To PEDRIVER,
a bus is the logical representation of the IP interface. (To list the
names and addresses of buses, enter the SDA command SHOW
PORT/ADDR=PE_PDT and then press the Return key twice.) The following
example shows a display for the IP interface named IE0.
Example F-6 SDA Command SHOW PORT/BUS=BUS_IP_interface
$ ANALYZE/SYSTEM
SDA> SHOW PORT/BUS=886C0010
VMScluster data structures
--------------------------
--- BUS: 886C0010 (IE0) Device: IP IP Address: 16.138.182.6 (1)
Status: 00004203 run,online,xmt_chaining_disabled (2)
------- Transmit ------ ------- Receive ------- ---- Structure Addresses ---
Msg Xmt 2345987277 (3) Msg Rcv 2452130165 (4) PORT Address 8850B9B8
Mcast Msgs 0 Mcast Msgs 0 VCIB Addr 886C02A0
Mcast Bytes 0 Mcast Bytes 0 HELLO Message Addr 886C02A0
Bytes Xmt 3055474713 Bytes Rcv 3545255112 BYE Message Addr 886C05CC
Outstand I/Os 0 Buffer Size 1394 Delete BUS Rtn Adr 90AA2EC8
Xmt Errors (5) 0 Rcv Ring Size 0
--- Receive Errors ---- ------ BUS Timer ------ ----- Datalink Events ------
TR Mcast Rcv 0 Handshake TMO 00000000 Last 22-SEP-2008 12:20:50.06
Rcv Bad SCSID 0 Listen TMO 00000000 Last Event 00004002
Rcv Short Msgs 0 HELLO timer 6 Port Usable 1
Fail CH Alloc 0 HELLO Xmt err 0 Port Unusable 0
Fail VC Alloc 0 Address Change 0
Wrong PORT 0 Port Restart Fail 0
|
Field |
Description |
(1) IP Address
|
Displays the IP address of the interface.
|
(2) Status
|
The Status line should always display a status of "online" to indicate
that PEDRIVER can access its IP interface.
|
(3) Msg Xmt (messages transmitted)
|
Shows the total number of packets transmitted over the virtual circuit
to the remote node. The lines that follow show the multicast messages
(Mcast Msgs) and multicast bytes (Mcast Bytes) transmitted.
|
(4) Msg Rcv (messages received)
|
Shows the total number of packets received over the virtual circuit
from the remote node. The lines that follow show the multicast messages
(Mcast Msgs) and multicast bytes (Mcast Bytes) received.
|
(5) Xmt Errors (transmission errors)
|
Indicates the number of times PEDRIVER has been unable to transmit a
packet using this IP interface.
|
F.3.8 Monitoring PEDRIVER Channels for IP Interfaces
The SDA command SHOW PORT/CHANNEL=Channel_IP_interface is useful for
displaying the PEDRIVER representation of a channel between IP
interfaces. To PEDRIVER, a channel is the logical communication path
between two IP interfaces located on different nodes. (To list the names
and addresses of the channels created, enter the SDA command SHOW SYMBOL
CH_* and then press the Return key.) The following example shows a
display for a channel that uses the local IP interface named IE0.
Example F-7 SDA Command SHOW PORT/CHANNEL Display
$ ANALYZE/SYSTEM
SDA> show port/channel=CH_OOTY_IE0_WE0
VMScluster data structures
--------------------------
-- PEDRIVER Channel (CH:886C5A40) for Virtual Circuit (VC:88161A80) OOTY --
State: 0004 open Status: 6F path,open,xchndis,rmhwavld,tight,fast
ECS Status: Tight,Fast
BUS: 886BC010 (IE0) Lcl Device: IP Lcl IP Address: 16.138.182.6 (1)
Rmt BUS Name: WE0 Rmt Device: IP Rmt IP Address: 15.146.235.10 (2)
Rmt Seq #: 0004 Open: 4-OCT-2008 00:18:58.94 Close: 4-OCT-2008 00:18:24.53
- Transmit Counters --- - Receive Counters ---- - Channel Characteristics --
Bytes Xmt 745486312 Bytes Rcv 2638847244 Protocol Version 1.6.0
Msg Xmt 63803681 Msg Rcv 126279729 Supported Services 00000000
Ctrl Msgs 569 Ctrl Msgs 565 Local CH Sequence # 0003
Ctrl Bytes 63220 Ctrl Bytes 62804 Average RTT (usec) 5780.8
Mcast Msgs 106871 Buffer Size:
Mcast Bytes 11114584 Current 1394
- Errors --------------------------------------- Remote 1394
Listen TMO 2 Short CC Msgs 0 Local 1394
TR ReXmt 605 Incompat Chan 0 Negotiated 1394
DL Xmt Errors 0 No MSCP Srvr 0 Priority 0
CC HS TMO 0 Disk Not Srvd 0 Hops 2
Bad Authorize 0 Old Rmt Seq# 0 Load Class 100
Bad ECO 0 Rmt TR Rcv Cache Size 64
Bad Multicast 0 Rmt DL Rcv Buffers 8
Losses 0
- Miscellaneous ------- - Buf Size Probing----- - Delay Probing ------------
Prv Lstn Timer 5 SP Schd Timeout 6 DP Schd Timeouts 0
Next ECS Chan 886C5A40 SP Starts 1 DP Starts 0
SP Complete 1 DP Complete 0
- Management ---------- SP HS TMO 0 DP HS TMO 1
Mgt Priority 0 HS Remaining Retries 4
Mgt Hops 0 Last Probe Size 1395
Mgt Max Buf Siz 8110
|
Field |
Description |
(1) Lcl IP Address (Local IP Address)
|
Displays the IP address of the local interface.
|
(2) Rmt IP Address (Remote IP Address)
|
Displays the IP address of the remote interface.
|
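The Open and Close fields in a channel display use the OpenVMS absolute time format, so they can be compared directly once parsed. The following Python sketch parses the two times from Example F-7 and computes the interval between them. The function name is hypothetical, and the parser assumes a two-digit hundredths field as shown in the display. In this example the Close time precedes the Open time, which suggests that the channel closed and was reopened about half a minute later.

```python
from datetime import datetime

MONTHS = {
    "JAN": 1, "FEB": 2, "MAR": 3, "APR": 4, "MAY": 5, "JUN": 6,
    "JUL": 7, "AUG": 8, "SEP": 9, "OCT": 10, "NOV": 11, "DEC": 12,
}


def parse_vms_time(text):
    """Parse an absolute time such as '4-OCT-2008 00:18:58.94'."""
    date_part, time_part = text.split()
    day, month, year = date_part.split("-")
    hms, hundredths = time_part.split(".")
    hour, minute, second = (int(field) for field in hms.split(":"))
    # Assumption: the fraction is always two digits (hundredths of a second).
    return datetime(int(year), MONTHS[month.upper()], int(day),
                    hour, minute, second, int(hundredths) * 10_000)


# Open and Close values from the channel display in Example F-7.
opened = parse_vms_time("4-OCT-2008 00:18:58.94")
closed = parse_vms_time("4-OCT-2008 00:18:24.53")
gap_seconds = (opened - closed).total_seconds()
print(f"Channel reopened {gap_seconds:.2f} seconds after it closed")
```

The same parsing can be applied to the Time of Last Xmt Error field in a bus display when you compare it with virtual circuit open and close times.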
F.4 Using SCACP to Monitor Cluster Communications
The SCA Control Program (SCACP) utility is designed to monitor and
manage cluster communications. It is derived from the Systems
Communications Architecture (SCA), which defines the communications
mechanisms that allow nodes in an OpenVMS Cluster system to cooperate.
SCA does the following:
- Governs the sharing of data between resources at the nodes.
- Binds together System Applications (SYSAPs) that run on different
OpenVMS Alpha and Integrity server systems.
To invoke SCACP, enter the following command at the DCL prompt:
$ RUN SYS$SYSTEM:SCACP
SCACP displays the following prompt, at which you can enter SCACP
commands using the standard rules of DCL syntax:
SCACP>
For more information about SCACP, see the HP OpenVMS System Management
Utilities Reference Manual.
F.5 Troubleshooting NISCA Communications
F.5.1 Areas of Trouble
Sections F.6 and F.7 describe two likely areas of
trouble on LANs: channel formation and retransmission. The
discussions of these two problems often refer to using a LAN analyzer
tool to isolate information in the NISCA protocol.
Reference: As you read about how to diagnose NISCA
problems, you may also find it helpful to refer to Section F.8, which
describes the NISCA protocol packet, and Section F.9, which describes
how to choose and use a LAN network failure analyzer.
F.6 Channel Formation
Channel-formation problems occur when two nodes cannot communicate
properly between LAN adapters.
F.6.1 How Channels Are Formed
Table F-7 provides a step-by-step description of channel formation.
Table F-7 Channel Formation
Step |
Action |
1
|
Channels are formed when a node sends a HELLO datagram from its LAN
adapter to a LAN adapter on another cluster node. If this is a new
remote LAN adapter address, or if the corresponding channel is closed,
the remote node receiving the HELLO datagram sends a CCSTART datagram
to the originating node after a delay of up to 2 seconds.
|
2
|
Upon receiving a CCSTART datagram, the originating node verifies the
cluster password and, if the password is correct, the node responds
with a VERF datagram and waits for up to 5 seconds for the remote node
to send a VACK datagram. (VERF, VACK, CCSTART, and HELLO datagrams are
described in Section F.8.5.)
|
3
|
Upon receiving a VERF datagram, the remote node verifies the cluster
password; if the password is correct, the node responds with a VACK
datagram and marks the channel as open. (See Figure F-3.)
|
4
|
WHEN the local node... |
THEN... |
Does not receive the VACK datagram within 5 seconds
|
The channel state goes back to closed and the handshake timeout counter
is incremented.
|
Receives the VACK datagram within 5 seconds and the cluster password is
correct
|
The channel is opened.
|
|
5
|
Once a channel has been formed, it is maintained (kept open) by the
regular multicast of HELLO datagram messages. Each node multicasts a
HELLO datagram message at least once every 3.0 seconds over each LAN
adapter. Either of the nodes sharing a channel closes the channel with
a listen timeout if it does not receive a HELLO datagram or a sequence
message from the other node within 8 to 9 seconds. If you receive a
"Port closed virtual circuit" message, it indicates a channel
was formed but there is a problem receiving traffic on time. When this
happens, look for HELLO datagram messages getting lost.
|
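Steps 1 through 4 of Table F-7 can be summarized as a small state model. The following Python sketch is a simplified illustration from the originating node's point of view and is not PEDRIVER code. The class and function names are hypothetical, password verification is reduced to a Boolean, and the behavior on a password mismatch (the exchange simply stops) is an assumption, because the table describes only the successful and timeout cases.

```python
from dataclasses import dataclass

VACK_WAIT_SECONDS = 5.0  # step 2: how long the originating node waits for VACK


@dataclass
class Channel:
    state: str = "closed"
    handshake_timeouts: int = 0


def run_handshake(channel, passwords_match, vack_delay):
    """Follow steps 1 to 4 of Table F-7 from the originating node's side.

    passwords_match: hypothetical stand-in for cluster password verification.
    vack_delay: seconds until the VACK arrives after VERF is sent, or None
                if it never arrives.
    Returns the datagrams exchanged, in order.
    """
    # Step 1: the HELLO is answered with a CCSTART (after up to 2 seconds).
    trace = ["HELLO", "CCSTART"]
    if not passwords_match:
        # Assumption: on a password mismatch no VERF is sent and the
        # channel stays closed.
        return trace
    # Step 2: password verified, so send VERF and wait for VACK.
    trace.append("VERF")
    if vack_delay is not None and vack_delay <= VACK_WAIT_SECONDS:
        # Steps 3 and 4: VACK arrived in time, so the channel is opened.
        trace.append("VACK")
        channel.state = "open"
    else:
        # Step 4: no VACK within 5 seconds; count a handshake timeout.
        channel.state = "closed"
        channel.handshake_timeouts += 1
    return trace


good = Channel()
good_trace = run_handshake(good, True, 0.3)
slow = Channel()
slow_trace = run_handshake(slow, True, None)
print(good_trace, good)
print(slow_trace, slow)
```

The first call models a normal handshake that ends with the channel open. The second models a lost VACK, which leaves the channel closed and increments the handshake timeout counter.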
Figure F-3 shows a message exchange during a successful
channel-formation handshake.
Figure F-3 Channel-Formation Handshake
|