Problems with AS4100 Rackmount?
 
> Hi,
> 
> Our configuration is two AlphaServer 4100 5/300 systems running DU3.2G.
> Both are installed in a rack, and each is configured with 2 KZPSAs,
> 1 KZPSC, 1 DE500, 1 video card, 1 DNSES, and 384 MB of memory. The
> firmware was upgraded with the 3.9 Firmware Upgrade CDROM plus the
> 3.8-6 SRM console upgrade. The O.S. disk (root, /usr, swap) on both
> nodes is a mirror made up of two RZ28D-VW disks on the KZPSC RAID
> controller.
> 
> They've been working fine for the past 4 months.
> 
> Three weeks ago, we needed to install an additional card in one of the
> AS4100s (the upper one). The installation procedure was extremely
> careful, but suddenly the other server (which was running DU3.2G with
> 0 users at the time) stopped and showed the blue screen and the P00>>
> prompt. Partially recovered from the shock, we booted the server again.
> The event log says that everything is OK, except that there was no
> shutdown event before the last startup event.
> 
> The problem remained a mystery (we thought it had been due to human
> error or something like that) until one week ago, when we needed to
> add a card to the second AS4100 (the lower one).
> 
> In the middle of the process (very, very, very meticulous), exactly the
> same thing happened as three weeks earlier, except that the first thing
> we noticed was that the error LED began flashing on one of the O.S.
> mirror members (upper node).
> 
> With the node still up (and the other one still open), we tried to use
> the swxcrmgr software for DU, but it generated a core file. We tried to
> run the program 4 or 5 more times, and then the machine went down to the
> blue screen and the P00>> prompt. The event log says the same as three
> weeks before: everything is fine. We replaced the "bad disk", rebuilt
> the mirror, and so far everything is OK.
> 
> There is no reason for a node to go down when one and only one of the
> members of the mirror goes down, I think. What happened? No idea.
> 
> Now we have two AS4100s that crashed strangely without an apparent
> cause. The PCM boards on both machines don't report problems.
> 
> These crashes never gave us a single clue. Has anyone had a similar
> experience? Is there an FCO? Perhaps a tip for doing maintenance on
> rackmounted AS4100s? Is it recommended to turn off both Alphas for
> maintenance?
> 
> Thanks for reading this far. I'll summarize the responses, if any.
> 
> 
> Regards,
> UNIX Admin
Received on Thu Jun 26 1997 - 22:33:56 NZST
 