I've been looking for, but have been unable to find, a command or utility that works like psrinfo but for memory modules. I have Solaris 8 and Solaris 10 boxes. On Solaris 8, prtdiag does not report status for the CPUs and memory modules the way it does on Solaris 10. Solaris 8 may show a faulty part in the FRU Operational Status section, but I don't know. It may just be luck that the Solaris 8 boxes have not had problems, whereas the Solaris 10 boxes have had both CPU and memory failures.
Host_Solaris8:> /usr/platform/`uname -i`/sbin/prtdiag | head -10
System Configuration: Sun Microsystems sun4u Netra 240
System clock frequency: 160 MHZ
Memory size: 4GB
==================================== CPUs ====================================
E$ CPU CPU Temperature Fan
CPU Freq Size Impl. Mask Die Ambient Speed Unit
--- -------- ---------- ------ ---- -------- -------- ----- ----
MB/P0 1280 MHz 1MB US-IIIi 2.4 - -
MB/P1 1280 MHz 1MB US-IIIi 2.4 - -
Host_Solaris10:> prtdiag | head -10
System Configuration: Sun Microsystems sun4u Netra 240
System clock frequency: 167 MHZ
Memory size: 4GB
==================================== CPUs ====================================
               E$          CPU                    CPU
CPU  Freq      Size        Implementation         Mask   Status   Location
---  --------  ----------  ---------------------  -----  ------   --------
0    1503 MHz  1MB         SUNW,UltraSPARC-IIIi   3.4    faulted  MB/P0
1    1503 MHz  1MB         SUNW,UltraSPARC-IIIi   3.4    on-line  MB/P1
When I first encountered a failed CPU, I found psrinfo, which gives consistent output on both Solaris 8 and Solaris 10.
Host:> psrinfo
0 faulted since 12/16/2008 12:45:49
1 on-line since 12/04/2008 02:14:35
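For the CPU side, this is roughly how I filter that output. It is only a sketch: the function name faulted_cpus is my own, and it assumes psrinfo lines always look like the two above (processor ID in the first field, state in the second).

```shell
# Print the IDs of processors that psrinfo reports as "faulted".
# Reads psrinfo-style lines on stdin: "<id> <state> since <date> <time>".
# faulted_cpus is a made-up helper name, not a Solaris command.
faulted_cpus() {
    awk '$2 == "faulted" { print $1 }'
}

# Intended use on a real host (not run here):
#   psrinfo | faulted_cpus
```

With the psrinfo output shown above, this prints 0.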
Since then, I've encountered failed memory modules on a Solaris 10 box. But, as with the CPUs, prtdiag on Solaris 8 has no status field for the memory.
Host_Solaris8:> /usr/platform/`uname -i`/sbin/prtdiag | tail -18 | head -6
Memory Module Groups:
--------------------------------------------------
ControllerID   GroupID  Labels
--------------------------------------------------
1              0        MB/P1/B0/D0,MB/P1/B0/D1
1              1        MB/P1/B1/D0,MB/P1/B1/D1
Host_Solaris10:> prtdiag | tail -8
Memory Module Groups:
--------------------------------------------------
ControllerID   GroupID  Labels         Status
--------------------------------------------------
0              0        MB/P0/B0/D0
0              0        MB/P0/B0/D1
1              0        MB/P1/B0/D0    failed
1              0        MB/P1/B0/D1    failed
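On the Solaris 10 boxes I can at least pull the failed labels out of that section mechanically. A rough sketch follows; failed_dimms is a name I made up, and it assumes the layout shown above: a "Memory Module Groups" header, then rows whose last field is "failed" and whose label is the field before it. On the Solaris 8 output there is no Status column, so it prints nothing there, which is exactly the gap I'm asking about.

```shell
# Print the labels of memory modules marked "failed" in prtdiag output.
# Reads prtdiag text on stdin. failed_dimms is a made-up helper name.
# Assumes the Solaris 10 layout shown above: the status is the last field
# and the module label is the field just before it.
failed_dimms() {
    awk '
        /^Memory Module Groups/ { in_mem = 1; next }   # start of the section
        in_mem && $NF == "failed" { print $(NF-1) }    # label of a failed row
    '
}

# Intended use on a real host (not run here):
#   prtdiag | failed_dimms
```

Empty output only means no row carried a "failed" status. It does not prove the memory is healthy on a release that never prints the column.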
If anyone knows of a consistent way, across Solaris 8 and Solaris 10, to identify failed memory modules, I would love to hear it.