HP OpenVMS Systems

ask the wizard
System crash? (bugcheck from HW? or SW?)

The Question is:

When we reboot our system each month, we are getting a strange
occurrence.  The log is reading the following:
"cannot write system dump; initialization of controller failed."
Then we get....
"****Fatal BUG CHECK, version = V6.2  HALT, Halt instruction restart
Crash CPU:00   Primary CPU:00"
Then the system does a register dump and then a memory dump.  We are
completely stumped.  Our Compaq tech is also stumped.  We are wondering why
this is happening "all-of-a-sudden"?  Please help.  By the way, the memory
dump message is as follows:
"****Starting memory dump, writing dump to unit number 100
Header and error log buffers dumped...
SPT & GPT dumped...
System space dumped...
Global pages dumped...
MMMGR dumped...
IPCACP dumped...
OPCOM dumped...
_LTA5261:  dumped...
JOB_CONTROL dumped...
LATACP dumped...
Thumper dumped...
CLEOIO dumped...
RPC$SWL dumped...
CONFIGURE dumped...
***Memory dump complete, dump written to PAGEFILE.SYS, to unit number 100"
At this point the system is ready to halt.
Please let us know.  We are desperate.  By the way, I love your section of
the site.  It has helped me out a lot with our VAX!!!  Thank you..

The Answer is :

  This problem could potentially be caused by a hardware problem, or by
  an OpenVMS or site-specific software problem.
  Details of the particular VAX, the particular dumpfile configuration,
  the storage I/O configuration (device type(s), interconnect(s), RAID,
  shadowing use and shadowing type, HSx controller usage, MSCP-served
  devices used, etc), and whether DOSD (dump off the system disk) is in
  use, would all be very useful information in resolving this.  (In
  other words, rather more detail on the hardware and software
  configuration is needed...)
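  As a rough sketch of how some of these details might be collected
  (the selection of commands here is illustrative and not a complete
  checklist, and the output will vary by configuration):
    $ SHOW SYSTEM
    $ SHOW DEVICE/FULL D
    $ SHOW MEMORY/FILES
    $ DIRECTORY/SIZE=ALL SYS$SYSTEM:SYSDUMP.DMP,PAGEFILE.SYS
    $ RUN SYS$SYSTEM:SYSGEN
    SYSGEN> SHOW DUMPBUG
    SYSGEN> SHOW SAVEDUMP
    SYSGEN> SHOW DUMPSTYLE
    SYSGEN> EXIT
  If no SYSDUMP.DMP exists, the DIRECTORY command will report the
  file as not found, which is consistent with the dump being written
  to PAGEFILE.SYS.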
  The OpenVMS Wizard would suggest disabling the halt-restart setting
  at the system console, setting the system to halt or to reboot, but
  not to attempt to restart.  The exact mechanism depends on the
  particular VAX, but it is often selected via the following console
  command:
    SET HALT code
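  As an illustration only: on some MicroVAX 3100 and VAX 4000 series
  consoles, the halt action accepts keywords such as DEFAULT, RESTART,
  REBOOT, HALT and RESTART_REBOOT.  Whether these keywords apply to
  this particular system is an assumption; please check the console
  documentation for the specific VAX model.
    >>> SHOW HALT
    >>> SET HALT REBOOT
  or, to have the system remain halted at the console prompt:
    >>> SET HALT HALT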
  The OpenVMS Wizard would also recommend creating and configuring a
  system dump file while debugging this sort of problem, rather than
  depending on the system pagefile.
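  A minimal sketch of one way to create a dedicated dump file follows.
  The size shown is a placeholder and not a recommendation: on a VAX,
  a full physical dump needs roughly one block per page of physical
  memory, plus a few additional blocks.
    $ RUN SYS$SYSTEM:SYSGEN
    SYSGEN> CREATE SYS$SYSTEM:SYSDUMP.DMP /SIZE=132000
    SYSGEN> EXIT
  The new dump file is used after the next reboot.  Alternatively, a
  DUMPFILE = value entry can be added to SYS$SYSTEM:MODPARAMS.DAT,
  with AUTOGEN then used to create or resize the file.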
  The OpenVMS Wizard would look for recent changes to kernel-mode code,
  and at the processes that are active at the time of the problem.
  (This can include MMGR, IPCACP, OPCOM, _LTA5261:, etc.)
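  Once a dump has been written, the System Dump Analyzer (SDA) can be
  used to examine the bugcheck and the processes that were present.
  The following is a sketch that assumes the dump was written to a
  dedicated SYS$SYSTEM:SYSDUMP.DMP file:
    $ ANALYZE/CRASH_DUMP SYS$SYSTEM:SYSDUMP.DMP
    SDA> SHOW CRASH
    SDA> SHOW STACK
    SDA> SHOW SUMMARY
    SDA> EXIT
  SHOW CRASH reports the bugcheck type and the register contents at
  the time of the crash, and SHOW SUMMARY lists the processes that
  were present.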
  The OpenVMS Wizard would also recommend acquiring and applying the
  available mandatory ECO kits, particularly for core device drivers,
  system memory management, and LAT.

answer written or last revised on ( 25-FEB-2000 )
