The online home of the company bringing you quality HP hardware and software support

Using the Standalone Analysis Tool (SAT)

All too often we get a support call or an email about a system interruption with very scant details: a string of Bxxx DEAD hex codes, or a system abort number with subsystem and status codes. The worst case is a system hang with no status codes at all. The severity of a hang may range from a running system that no one can access to a total eclipse, where even a CTRL-B is ignored. The usual consensus is "I won't take the time for a DUMP until it happens again." That stance is acceptable if a reasonable amount of time passes before the next event, but it can come back to bite you if the problem starts to recur with a vengeance. At that point, even a small amount of data from the first event would have been very valuable. So the question is, what are the alternatives when:
The HP3000 platform has a powerful combination to address these issues: the remote console facility and the Standalone Analysis Tool (SAT). This article focuses on using the remote console and SAT as first-line diagnostic tools. For a more comprehensive treatment of unexpected system interruptions, see the article titled "Handling System Aborts and System Failures" at http://www.beechglen.com/mpe/technotes/sysaborts.html.

The remote console facility has two major features. First, as its name implies, it allows console control of the system remotely. Second, when used with a terminal emulator, it allows large amounts of data to be collected in a short amount of time. Please contact us if you need help configuring remote console access on your system.

The SAT offline utility lets us look at the operating system and the current process status to determine the nature of the interruption. Unlike DUMP, it does not involve a tape mount, so it is very fast. And if we are prepared in advance with a working knowledge of the remote console facility, we don't have to write anything down by hand.

So let's look at a system abort using SAT. We want a trace for each CPU. The following example is a three-CPU system, so we need to get traces for all three processors and then exit. The commands to be executed are in red:
After a system abort, the console shows in inverse video:
With Status Codes of:
At this point, press CTRL-B. (On 9x9 systems, ensure that the key is in the SERVICE position; on 9x8 systems, verify that the toggle switch on the back of the system is in the SERVICE position.)
Wait for the system to reset. Interrupt the AUTOBOOT process if necessary by pressing any key within 10 seconds when prompted.
Gathering information from SAT in this manner takes only a few minutes and is surely worth the effort. By comparison, a memory dump can take an hour or more, depending largely on the amount of memory in the system, the number and type of disks in the system volume set that contain virtual memory, and the speed of the tape drive to which memory is dumped.

As you can see from the sample output above, there is far too much information here to write down by hand. Connecting to the remote console (or using telnet to reach the GSP on an A-class or N-class system) via a terminal emulator is clearly the way to go. You can capture the entire screen contents and email them to us, both to record the event and for analysis. Or we can log on to the remote console ourselves and collect the data directly.
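If your terminal emulator saves the session to a text file, packaging that capture for email can be scripted. The following is only a minimal sketch in Python, not part of SAT or of the remote console facility. The function name `build_capture_email`, the system name, and the addresses are hypothetical and chosen purely for illustration.

```python
from datetime import datetime, timezone
from email.message import EmailMessage


def build_capture_email(capture_text, system_name, sender, recipient, when=None):
    """Wrap a saved console capture in an email message (hypothetical helper).

    capture_text is the text the terminal emulator logged during the SAT
    session; all other arguments are illustrative assumptions.
    """
    when = when or datetime.now(timezone.utc)
    stamp = when.strftime("%Y%m%d-%H%M%S")

    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"SAT console capture: {system_name} {stamp}"
    msg.set_content(
        f"Console capture from {system_name}, taken {when.isoformat()}.\n"
        "The full session log is attached as plain text.\n"
    )
    # Attach the capture unmodified so that no status code or trace line
    # is lost or retyped by hand.
    msg.add_attachment(
        capture_text,
        subtype="plain",
        filename=f"{system_name}-sat-{stamp}.txt",
    )
    return msg
```

The returned message can then be handed to whatever mail path your site uses, for example `smtplib.SMTP(...).send_message(msg)` against your own mail server. Timestamping the subject and the attachment name makes it easier to tell a first event from a recurrence later.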
Copyright © 2006 Beechglen Development Inc.
Last modified: 01/20/06