Trouble pinpointing a hardware problem on a Sun Fire V440

Hi

I am currently facing a strange hardware problem. The system reboots with the following error:

Fatal Error Reset
CPU 0000.0000.0000.0003 AFSR 0100.0000.0000.0000 SCE
AFAR 0000.07c6.0000.1000
SC Alert: Host System has Reset

It has happened 4 or 5 times, with the same error every time. I also put the server under high load and found that processes switch properly between all the CPUs. It looks like a hardware problem to me. I then ran the system hardware diagnostics as well, but found no errors: POST passed, and its output is:

@(#)OBP 4.13.0 2004/01/19 18:28 Sun Fire V440,Netra 440
Clearing TLBs
Power-On Reset
Executing Power On SelfTest
0>
0>@(#) Sun Fire[TM] V440,Netra[TM] 440 POST 4.13.0 2004/01/16 12:35
/dat/fw/common-source/firmware_re/post/post-build-4.13.0/Fiesta/chalupa/integrated (firmware_re)
0>Copyright © 2004 Sun Microsystems, Inc. All rights reserved
SUN PROPRIETARY/CONFIDENTIAL.
Use is subject to license terms.
0>Hard Powerup RST thru SW
0>OBP->POST Call with %o0=00000000.01014000.
0>Diag level set to MAX.
0>Verbosity level set to 0.
0>MFG scrpt mode set NORM
0>I/O port set to TTYA.
0>Start Selftest.....
0>CPUs present in system: 0 1 2 3
0>Test CPU(s).....
0>Init SB
0>Initialize I2C Controller
0>L2 Cache Tags Test
0>Init CPU
0>DMMU
0>DMMU TLB DATA RAM Access
0>DMMU TLB TAGS Access
0>IMMU Registers Access
0>IMMU TLB DATA RAM Access
0>IMMU TLB TAGS Access
0>Init mmu regs
0>Setup L2 Cache
0>L2 Cache Control = 00000000.00f04400
0> Size = 00000000.00100000...
0>Scrub and Setup L2 Cache
0>Setup and Enable DMMU
0>Setup DMMU Miss Handler
0>Test Mailbox
0>Scrub Mailbox
0>CPU Tick and Tick Compare Registers Test
0>CPU Stick and Stick Compare Registers Test
0>Set Timing
0> UltraSPARC[TM] IIIi, Version 2.4
1>L2 Cache Tags Test
2>L2 Cache Tags Test
3>L2 Cache Tags Test
1>Init CPU
2>Init CPU
3>Init CPU
1> UltraSPARC[TM] IIIi, Version 2.4
2> UltraSPARC[TM] IIIi, Version 2.4
3> UltraSPARC[TM] IIIi, Version 2.4
1>DMMU
2>DMMU
3>DMMU
1>DMMU TLB DATA RAM Access
2>DMMU TLB DATA RAM Access
3>DMMU TLB DATA RAM Access
1>DMMU TLB TAGS Access
2>DMMU TLB TAGS Access
3>DMMU TLB TAGS Access
1>IMMU Registers Access
2>IMMU Registers Access
3>IMMU Registers Access
1>IMMU TLB DATA RAM Access
2>IMMU TLB DATA RAM Access
3>IMMU TLB DATA RAM Access
1>IMMU TLB TAGS Access
2>IMMU TLB TAGS Access
3>IMMU TLB TAGS Access
1>Init mmu regs
2>Init mmu regs
3>Init mmu regs
1>Setup L2 Cache
1>L2 Cache Control = 00000000.00f04400
1> Size = 00000000.00100000...
2>Setup L2 Cache
2>L2 Cache Control = 00000000.00f04400
2> Size = 00000000.00100000...
3>Setup L2 Cache
3>L2 Cache Control = 00000000.00f04400
3> Size = 00000000.00100000...
1>Scrub and Setup L2 Cache
2>Scrub and Setup L2 Cache
3>Scrub and Setup L2 Cache
1>Setup and Enable DMMU
2>Setup and Enable DMMU
3>Setup and Enable DMMU
1>Setup DMMU Miss Handler
2>Setup DMMU Miss Handler
3>Setup DMMU Miss Handler
1>Test Mailbox
2>Test Mailbox
3>Test Mailbox
1>Scrub Mailbox
2>Scrub Mailbox
3>Scrub Mailbox
1>CPU Tick and Tick Compare Registers Test
2>CPU Tick and Tick Compare Registers Test
3>CPU Tick and Tick Compare Registers Test
1>CPU Stick and Stick Compare Registers Test
2>CPU Stick and Stick Compare Registers Test
3>CPU Stick and Stick Compare Registers Test
1>Setup Int Handlers
2>Setup Int Handlers
0>Interrupt Crosscall.....
3>Setup Int Handlers
0>Setup Int Handlers
0>Send Int CPU 1
0>Send Int CPU 2
0>Send Int CPU 3
1>Send Int to Master CPU
2>Send Int to Master CPU
3>Send Int to Master CPU
0>MB: Part-Dash-Rev#: 5016344-09-50 Serial#: 052808
0>CPU0: Part-Dash-Rev#: 5016370-04-51 Serial#: 063736
0>CPU1: Part-Dash-Rev#: 5016370-04-51 Serial#: 036855
0>CPU2: Part-Dash-Rev#: 5016370-04-51 Serial#: 026118
0>CPU3: Part-Dash-Rev#: 5016370-04-51 Serial#: 026276
0>CPU0 DIMM B0/D0 J0601:
0>Part#: M3 12L2828ET0-CA2 Serial#: 03097afe Date Code: 0425 Rev#: 3045
0>CPU0 DIMM B0/D1 J0602:
0>Part#: M3 12L2828ET0-CA2 Serial#: 030c7ad1 Date Code: 0425 Rev#: 3045
0>CPU0 DIMM B1/D0 J0701:
0>Part#: M3 12L2828ET0-CA2 Serial#: 030d7b02 Date Code: 0425 Rev#: 3045
0>CPU0 DIMM B1/D1 J0702:
0>Part#: M3 12L2828ET0-CA2 Serial#: 030f7ad4 Date Code: 0425 Rev#: 3045
0>CPU1 DIMM B0/D0 J0601:
0>Part#: 72D128521GR7B Serial#: 021f4814 Date Code: 0427 Rev#: 020e
0>CPU1 DIMM B0/D1 J0602:
0>Part#: 72D128521GR7B Serial#: 040e4c24 Date Code: 0424 Rev#: 020e
0>CPU1 DIMM B1/D0 J0701:
0>Part#: M3 12L2828ET0-CA2 Serial#: 0305a7be Date Code: 0451 Rev#: 3045
0>CPU1 DIMM B1/D1 J0702:
0>Part#: M3 12L2828ET0-CA2 Serial#: 030b8849 Date Code: 0506 Rev#: 3045
0>CPU2 DIMM B0/D0 J0601:
0>Part#: 72D128521GR7B Serial#: 02036716 Date Code: 0424 Rev#: 020e
0>CPU2 DIMM B0/D1 J0602:
0>Part#: 72D128521GR7B Serial#: 021f4816 Date Code: 0427 Rev#: 020e
0>CPU2 DIMM B1/D0 J0701:
0>Part#: 36VDDT12872G-26AC0 Serial#: 1b53f301 Date Code: 040c Rev#: 0000
0>CPU2 DIMM B1/D1 J0702:
0>Part#: 36VDDT12872G-26AC0 Serial#: 1b53f304 Date Code: 040c Rev#: 0000
0>CPU3 DIMM B0/D0 J0601:
0>Part#: 36VDDT12872G-26AC0 Serial#: 1b53f2fb Date Code: 040c Rev#: 0000
0>CPU3 DIMM B0/D1 J0602:
0>Part#: 36VDDT12872G-26AC0 Serial#: 1b53f2f6 Date Code: 040c Rev#: 0000
0>CPU3 DIMM B1/D0 J0701:
0>Part#: M3 12L2828ET0-CA2 Serial#: 0310a66a Date Code: 0451 Rev#: 3045
0>CPU3 DIMM B1/D1 J0702:
0>Part#: M3 12L2828ET0-CA2 Serial#: 030e89be Date Code: 0506 Rev#: 3045
0>Set CPU/System Speed
0>........
0>Send MC Timing CPU 1
0>Send MC Timing CPU 2
0>Send MC Timing CPU 3
0>Init Memory.....
0>Probe Dimms
1>Probe Dimms
2>Probe Dimms
3>Probe Dimms
1>Init Mem Controller Regs
2>Init Mem Controller Regs
3>Init Mem Controller Regs
0>Init Mem Controller Regs
1>Set JBUS config reg
2>Set JBUS config reg
3>Set JBUS config reg
0>Set JBUS config reg
0>IO-Bridge unit 0 init test
0>IO-Bridge unit 1 init test
0>Do PLL reset
0>Setting timing to 7:1 10:1, system frequency 183 MHz, CPU frequency 1281 MHz

SC Alert: Host System has Reset
0>Soft Power-on RST thru SW
0>PLL Reset.....
0>Init SB
0>Initialize I2C Controller
0>Init CPU
0>Init mmu regs
0>Setup L2 Cache
0>L2 Cache Control = 00000000.00f04400
0> Size = 00000000.00100000...
0>Setup and Enable DMMU
0>Setup DMMU Miss Handler
0>Scrub Mailbox
0>Timing is 7:1 10:1, sys 183 MHz, CPU 1281 MHz, mem 128 MHz.
0> UltraSPARC[TM] IIIi, Version 2.4
1>Init CPU
2>Init CPU
3>Init CPU
1> UltraSPARC[TM] IIIi, Version 2.4
2> UltraSPARC[TM] IIIi, Version 2.4
3> UltraSPARC[TM] IIIi, Version 2.4
1>Init mmu regs
2>Init mmu regs
3>Init mmu regs
1>Setup L2 Cache
1>L2 Cache Control = 00000000.00f04400
1> Size = 00000000.00100000...
2>Setup L2 Cache
2>L2 Cache Control = 00000000.00f04400
2> Size = 00000000.00100000...
3>Setup L2 Cache
3>L2 Cache Control = 00000000.00f04400
3> Size = 00000000.00100000...
1>Setup and Enable DMMU
2>Setup and Enable DMMU
3>Setup and Enable DMMU
1>Setup DMMU Miss Handler
2>Setup DMMU Miss Handler
3>Setup DMMU Miss Handler
1>Scrub Mailbox
2>Scrub Mailbox
3>Scrub Mailbox
1>Timing is 7:1 10:1, sys 183 MHz, CPU 1281 MHz, mem 128 MHz.
2>Timing is 7:1 10:1, sys 183 MHz, CPU 1281 MHz, mem 128 MHz.
3>Timing is 7:1 10:1, sys 183 MHz, CPU 1281 MHz, mem 128 MHz.
0>Init Memory.....
0>Probe Dimms
1>Probe Dimms
2>Probe Dimms
3>Probe Dimms
1>Init Mem Controller Sequence
2>Init Mem Controller Sequence
3>Init Mem Controller Sequence
0>Init Mem Controller Sequence
0>IO-Bridge unit 0 init test
0>IO-Bridge unit 1 init test
0>Test Memory.....
0>Select Bank Config
0>Probe and Setup Memory
0>INFO: 1024MB Bank 0, Dimm Type X4
0>INFO: 1024MB Bank 1, Dimm Type X4
0>INFO: 1024MB Bank 2, Dimm Type X4
0>INFO: 1024MB Bank 3, Dimm Type X4
0>
0>Data Bitwalk on Master
0> Test Bank 0.
0> Test Bank 1.
0> Test Bank 2.
0> Test Bank 3.
0>Address Bitwalk on Master
0>Addr walk mem test on CPU 0 Bank 0: 00000000.00000000 to 00000000.40000000.
0>Addr walk mem test on CPU 0 Bank 1: 00000001.00000000 to 00000001.40000000.
0>Addr walk mem test on CPU 0 Bank 2: 00000002.00000000 to 00000002.40000000.
0>Addr walk mem test on CPU 0 Bank 3: 00000003.00000000 to 00000003.40000000.
0>Set Mailbox
0>Final mc1 is f0000026.3e781c4e.
0>Setup Final DMMU Entries
0>Post Image Region Scrub
0>Run POST from Memory
1>Waiting for master CPU=0, timeout in 134 seconds...
2>Waiting for master CPU=0, timeout in 134 seconds...
3>Waiting for master CPU=0, timeout in 134 seconds...
0>Verifying checksum on copied image.
0>The Memory's CHECKSUM value is aa23.
0>The Memory's Content Size value is 80061.
0>Success... Checksum on Memory Validated.
1>Select Bank Config
2>Select Bank Config
3>Select Bank Config
1>Probe and Setup Memory
1>INFO: 1024MB Bank 0, Dimm Type X4
1>INFO: 1024MB Bank 1, Dimm Type X4
1>INFO: 1024MB Bank 2, Dimm Type X4
1>INFO: 1024MB Bank 3, Dimm Type X4
1>
2>Probe and Setup Memory
2>INFO: 1024MB Bank 0, Dimm Type X4
2>INFO: 1024MB Bank 1, Dimm Type X4
2>INFO: 1024MB Bank 2, Dimm Type X4
2>INFO: 1024MB Bank 3, Dimm Type X4
2>
3>Probe and Setup Memory
3>INFO: 1024MB Bank 0, Dimm Type X4
3>INFO: 1024MB Bank 1, Dimm Type X4
3>INFO: 1024MB Bank 2, Dimm Type X4
3>INFO: 1024MB Bank 3, Dimm Type X4
3>
1>Set Mailbox
2>Set Mailbox
3>Set Mailbox
1>Final mc1 is f0000026.3e781c4e.
2>Final mc1 is f0000026.3e781c4e.
3>Final mc1 is f0000026.3e781c4e.
0>Data Bitwalk on Slave 1
0> Test Bank 0.
0> Test Bank 1.
0> Test Bank 2.
0> Test Bank 3.
0>Data Bitwalk on Slave 2
0> Test Bank 0.
0> Test Bank 1.
0> Test Bank 2.
0> Test Bank 3.
0>Data Bitwalk on Slave 3
0> Test Bank 0.
0> Test Bank 1.
0> Test Bank 2.
0> Test Bank 3.
0>Address Bitwalk on Slave 1
0>Addr walk mem test on CPU 1 Bank 0: 00000010.00000000 to 00000010.40000000.
0>Addr walk mem test on CPU 1 Bank 1: 00000011.00000000 to 00000011.40000000.
0>Addr walk mem test on CPU 1 Bank 2: 00000012.00000000 to 00000012.40000000.
0>Addr walk mem test on CPU 1 Bank 3: 00000013.00000000 to 00000013.40000000.
0>Address Bitwalk on Slave 2
0>Addr walk mem test on CPU 2 Bank 0: 00000020.00000000 to 00000020.40000000.
0>Addr walk mem test on CPU 2 Bank 1: 00000021.00000000 to 00000021.40000000.
0>Addr walk mem test on CPU 2 Bank 2: 00000022.00000000 to 00000022.40000000.
0>Addr walk mem test on CPU 2 Bank 3: 00000023.00000000 to 00000023.40000000.
0>Address Bitwalk on Slave 3
0>Addr walk mem test on CPU 3 Bank 0: 00000030.00000000 to 00000030.40000000.
0>Addr walk mem test on CPU 3 Bank 1: 00000031.00000000 to 00000031.40000000.
0>Addr walk mem test on CPU 3 Bank 2: 00000032.00000000 to 00000032.40000000.
0>Addr walk mem test on CPU 3 Bank 3: 00000033.00000000 to 00000033.40000000.
1>Setup Final DMMU Entries
2>Setup Final DMMU Entries
3>Setup Final DMMU Entries
1>Map Slave POST to master memory
2>Map Slave POST to master memory
3>Map Slave POST to master memory
1>I-Cache RAM Test
2>I-Cache RAM Test
3>I-Cache RAM Test
0>Test CPU Caches.....
1>I-Cache Tag RAM
2>I-Cache Tag RAM
3>I-Cache Tag RAM
0>I-Cache RAM Test
1>I-Cache Valid/Predict TAGS Test
2>I-Cache Valid/Predict TAGS Test
3>I-Cache Valid/Predict TAGS Test
0>I-Cache Tag RAM
1>I-Cache Snoop Tag Field
2>I-Cache Snoop Tag Field
3>I-Cache Snoop Tag Field
0>I-Cache Valid/Predict TAGS Test
1>I-Cache Branch Predict Array Test
2>I-Cache Branch Predict Array Test
3>I-Cache Branch Predict Array Test
0>I-Cache Snoop Tag Field
1>Branch Prediction Initialization
2>Branch Prediction Initialization
3>Branch Prediction Initialization
0>I-Cache Branch Predict Array Test
1>D-Cache RAM
2>D-Cache RAM
3>D-Cache RAM
0>Branch Prediction Initialization
1>D-Cache Tags
2>D-Cache Tags
3>D-Cache Tags
0>D-Cache RAM
1>D-Cache Micro Tags
2>D-Cache Micro Tags
3>D-Cache Micro Tags
0>D-Cache Tags
1>D-Cache SnoopTags Test
2>D-Cache SnoopTags Test
3>D-Cache SnoopTags Test
0>D-Cache Micro Tags
1>W-Cache RAM
2>W-Cache RAM
3>W-Cache RAM
0>D-Cache SnoopTags Test
1>W-Cache Tags
2>W-Cache Tags
3>W-Cache Tags
0>W-Cache RAM
1>W-Cache Valid bit Test
2>W-Cache Valid bit Test
3>W-Cache Valid bit Test
0>W-Cache Tags
1>W-Cache Bank valid bit Test
2>W-Cache Bank valid bit Test
3>W-Cache Bank valid bit Test
0>W-Cache Valid bit Test
1>W-Cache SnoopTAGS Test
2>W-Cache SnoopTAGS Test
3>W-Cache SnoopTAGS Test
0>W-Cache Bank valid bit Test
1>P-Cache RAM
2>P-Cache RAM
3>P-Cache RAM
0>W-Cache SnoopTAGS Test
1>P-Cache Tags
2>P-Cache Tags
3>P-Cache Tags
0>P-Cache RAM
1>P-Cache SnoopTags Test
2>P-Cache SnoopTags Test
3>P-Cache SnoopTags Test
0>P-Cache Tags
1>P-Cache Status Data Test
2>P-Cache Status Data Test
3>P-Cache Status Data Test
0>P-Cache SnoopTags Test
1>8k DMMU TLB 0 Data
2>8k DMMU TLB 0 Data
3>8k DMMU TLB 0 Data
0>P-Cache Status Data Test
1>8k DMMU TLB 1 Data
2>8k DMMU TLB 1 Data
3>8k DMMU TLB 1 Data
0>8k DMMU TLB 0 Data
1>8k DMMU TLB 0 Tags
2>8k DMMU TLB 0 Tags
3>8k DMMU TLB 0 Tags
0>8k DMMU TLB 1 Data
1>8k DMMU TLB 1 Tags
2>8k DMMU TLB 1 Tags
3>8k DMMU TLB 1 Tags
0>8k DMMU TLB 0 Tags
1>8k IMMU TLB Data
2>8k IMMU TLB Data
3>8k IMMU TLB Data
0>8k DMMU TLB 1 Tags
1>8k IMMU TLB Tags
2>8k IMMU TLB Tags
3>8k IMMU TLB Tags
0>8k IMMU TLB Data
0>8k IMMU TLB Tags
1>FPU Registers and Data Path
2>FPU Registers and Data Path
3>FPU Registers and Data Path
0>FPU Registers and Data Path
1>FPU Move Registers
2>FPU Move Registers
3>FPU Move Registers
0>FPU Move Registers
1>FSR Read/Write
2>FSR Read/Write
3>FSR Read/Write
0>FSR Read/Write
1>FPU Block Register Test
2>FPU Block Register Test
3>FPU Block Register Test
0>FPU Block Register Test
1>FPU Branch Instructions
2>FPU Branch Instructions
3>FPU Branch Instructions
0>FPU Branch Instructions
1>FPU Functional Test
2>FPU Functional Test
3>FPU Functional Test
0>FPU Functional Test
1>Scrub Memory
2>Scrub Memory
3>Scrub Memory
0>Scrub Memory
1>Flush Caches
2>Flush Caches
3>Flush Caches
0>Flush Caches
1>L2-Cache Functional
2>L2-Cache Functional
3>L2-Cache Functional
0>Functional CPU Tests.....
1>L2-Cache Stress
2>L2-Cache Stress
3>L2-Cache Stress
0>L2-Cache Functional
1>IMMU Functional
2>IMMU Functional
3>IMMU Functional
0>L2-Cache Stress
1>DMMU Functional
2>DMMU Functional
3>DMMU Functional
0>IMMU Functional
1>I-Cache Functional
2>I-Cache Functional
3>I-Cache Functional
0>DMMU Functional
1>I-Cache Parity Functional
2>I-Cache Parity Functional
3>I-Cache Parity Functional
0>I-Cache Functional
0>I-Cache Parity Functional
1>I-Cache Parity Tag
2>I-Cache Parity Tag
3>I-Cache Parity Tag
0>I-Cache Parity Tag
1>I-Cache Snoop Parity Tag
2>I-Cache Snoop Parity Tag
3>I-Cache Snoop Parity Tag
0>I-Cache Snoop Parity Tag
1>D-Cache Functional
2>D-Cache Functional
3>D-Cache Functional
1>D-Cache Parity Functional
0>D-Cache Functional
2>D-Cache Parity Functional
3>D-Cache Parity Functional
1>D-Cache Parity Tag Test
0>D-Cache Parity Functional
2>D-Cache Parity Tag Test
3>D-Cache Parity Tag Test
1>W-Cache Functional
0>D-Cache Parity Tag Test
2>W-Cache Functional
3>W-Cache Functional
1>Graphics Functional
0>W-Cache Functional
1>CPU Superscalar Dispatch
2>Graphics Functional
3>Graphics Functional
2>CPU Superscalar Dispatch
3>CPU Superscalar Dispatch
0>Graphics Functional
1>SPARC Atomic Instruction Test
2>SPARC Atomic Instruction Test
3>SPARC Atomic Instruction Test
0>CPU Superscalar Dispatch
1>Non SPARC Atomic Instruction Test
2>Non SPARC Atomic Instruction Test
3>Non SPARC Atomic Instruction Test
0>SPARC Atomic Instruction Test
1>SOFTINT Register and Interrupt Test
2>SOFTINT Register and Interrupt Test
3>SOFTINT Register and Interrupt Test
0>Non SPARC Atomic Instruction Test
1>Branch Memory Test
2>Branch Memory Test
3>Branch Memory Test
0>SOFTINT Register and Interrupt Test
1>Fast ECC test
2>Fast ECC test
3>Fast ECC test
0>Branch Memory Test
1>System ECC test
2>System ECC test
3>System ECC test
0>Fast ECC test
0>System ECC test
0>XBus SRAM
0>IO-Bridge SouthBridge Remap Devs
0>IO-Bridge Tests.....
0>JBUS quick check
0> to IO-bridge_0
0> to IO-bridge_1
0>IO-Bridge unit 0 sram test
0>IO-Bridge unit 0 reg test
0>IO-Bridge unit 0 mem test
0>IO-Bridge unit 0 PCI id test
0>IO-Bridge unit 0 interrupt test
0>IO-Bridge unit 1 sram test
0>IO-Bridge unit 1 reg test
0>IO-Bridge unit 1 mem test
0>IO-Bridge unit 1 PCI id test
0>IO-Bridge unit 1 interrupt test
0>IO-Bridge unit 0 init test
1>IO-Bridge unit 0 sram test
1>IO-Bridge unit 0 reg test
1>IO-Bridge unit 0 mem test
1>IO-Bridge unit 0 PCI id test
1>IO-Bridge unit 0 interrupt test
1>IO-Bridge unit 1 init test
1>IO-Bridge unit 1 sram test
1>IO-Bridge unit 1 reg test
1>IO-Bridge unit 1 mem test
1>IO-Bridge unit 1 PCI id test
1>IO-Bridge unit 1 interrupt test
1>IO-Bridge unit 0 init test
2>IO-Bridge unit 0 sram test
2>IO-Bridge unit 0 reg test
2>IO-Bridge unit 0 mem test
2>IO-Bridge unit 0 PCI id test
2>IO-Bridge unit 0 interrupt test
2>IO-Bridge unit 1 init test
2>IO-Bridge unit 1 sram test
2>IO-Bridge unit 1 reg test
2>IO-Bridge unit 1 mem test
2>IO-Bridge unit 1 PCI id test
2>IO-Bridge unit 1 interrupt test
2>IO-Bridge unit 0 init test
3>IO-Bridge unit 0 sram test
3>IO-Bridge unit 0 reg test
3>IO-Bridge unit 0 mem test
3>IO-Bridge unit 0 PCI id test
3>IO-Bridge unit 0 interrupt test
3>IO-Bridge unit 1 init test
3>IO-Bridge unit 1 sram test
3>IO-Bridge unit 1 reg test
3>IO-Bridge unit 1 mem test
3>IO-Bridge unit 1 PCI id test
3>IO-Bridge unit 1 interrupt test
3>Print Mem Config
1>Caches : Icache is ON, Dcache is ON, Wcache is ON, Pcache is ON.
1>Memory interleave set to 0
1> Bank 0 1024MB : 00000010.00000000 -> 00000010.40000000.
1> Bank 1 1024MB : 00000011.00000000 -> 00000011.40000000.
1> Bank 2 1024MB : 00000012.00000000 -> 00000012.40000000.
1> Bank 3 1024MB : 00000013.00000000 -> 00000013.40000000.
2>Print Mem Config
2>Caches : Icache is ON, Dcache is ON, Wcache is ON, Pcache is ON.
2>Memory interleave set to 0
2> Bank 0 1024MB : 00000020.00000000 -> 00000020.40000000.
2> Bank 1 1024MB : 00000021.00000000 -> 00000021.40000000.
2> Bank 2 1024MB : 00000022.00000000 -> 00000022.40000000.
2> Bank 3 1024MB : 00000023.00000000 -> 00000023.40000000.
3>Print Mem Config
3>Caches : Icache is ON, Dcache is ON, Wcache is ON, Pcache is ON.
3>Memory interleave set to 0
3> Bank 0 1024MB : 00000030.00000000 -> 00000030.40000000.
3> Bank 1 1024MB : 00000031.00000000 -> 00000031.40000000.
3> Bank 2 1024MB : 00000032.00000000 -> 00000032.40000000.
3> Bank 3 1024MB : 00000033.00000000 -> 00000033.40000000.
0>Print Mem Config
0>Caches : Icache is ON, Dcache is ON, Wcache is ON, Pcache is ON.
0>Memory interleave set to 0
0> Bank 0 1024MB : 00000000.00000000 -> 00000000.40000000.
0> Bank 1 1024MB : 00000001.00000000 -> 00000001.40000000.
0> Bank 2 1024MB : 00000002.00000000 -> 00000002.40000000.
0> Bank 3 1024MB : 00000003.00000000 -> 00000003.40000000.
1>Block Memory
2>Block Memory
3>Block Memory
0>Block Memory
1>Test 1073741824 bytes on bank 0....
2>Test 1073741824 bytes on bank 0....
3>Test 1073741824 bytes on bank 0....
0>Test 1067450368 bytes on bank 0....
0>0% Done...
0>2% Done...
0>3% Done...
0>4% Done...
0>6% Done...
0>7% Done...
0>9% Done...
0>10% Done...
0>11% Done...
0>13% Done...
0>14% Done...
0>16% Done...
0>17% Done...
0>18% Done...
0>20% Done...
0>21% Done...
0>22% Done...
0>24% Done...
0>25% Done...
0>27% Done...
0>28% Done...
0>29% Done...
0>31% Done...
0>32% Done...
0>34% Done...
0>35% Done...
0>36% Done...
0>38% Done...
0>39% Done...
0>41% Done...
0>42% Done...
0>43% Done...
0>45% Done...
0>46% Done...
0>48% Done...
0>49% Done...
0>50% Done...
0>52% Done...
0>53% Done...
0>55% Done...
0>56% Done...
0>57% Done...
0>59% Done...
0>60% Done...
0>62% Done...
0>63% Done...
0>64% Done...
0>66% Done...
0>67% Done...
0>69% Done...
0>70% Done...
0>71% Done...
0>73% Done...
0>74% Done...
0>76% Done...
0>77% Done...
0>78% Done...
0>80% Done...
0>81% Done...
0>83% Done...
0>84% Done...
0>85% Done...
0>87% Done...
0>88% Done...
0>90% Done...
0>91% Done...
1>Test 1073741824 bytes on bank 1....
2>Test 1073741824 bytes on bank 1....
3>Test 1073741824 bytes on bank 1....
0>92% Done...
0>94% Done...
0>95% Done...
0>97% Done...
0>98% Done...
0>99% Done...
0>Test 1073741824 bytes on bank 1....
0>0% Done...
0>2% Done...
0>3% Done...
0>4% Done...
0>6% Done...
0>7% Done...
0>9% Done...
0>10% Done...
0>11% Done...
0>13% Done...
0>14% Done...
0>15% Done...
0>17% Done...
0>18% Done...
0>20% Done...
0>21% Done...
0>22% Done...
0>24% Done...
0>25% Done...
0>27% Done...
0>28% Done...
0>29% Done...
0>31% Done...
0>32% Done...
0>34% Done...
0>35% Done...
0>36% Done...
0>38% Done...
0>39% Done...
0>40% Done...
0>42% Done...
0>43% Done...
0>45% Done...
0>46% Done...
0>47% Done...
0>49% Done...
0>50% Done...
0>52% Done...
0>53% Done...
0>54% Done...
0>56% Done...
0>57% Done...
0>59% Done...
0>60% Done...
0>61% Done...
0>63% Done...
0>64% Done...
0>65% Done...
0>67% Done...
0>68% Done...
0>70% Done...
0>71% Done...
0>72% Done...
0>74% Done...
0>75% Done...
0>77% Done...
0>78% Done...
0>79% Done...
0>81% Done...
1>Test 1073741824 bytes on bank 2....
2>Test 1073741824 bytes on bank 2....
3>Test 1073741824 bytes on bank 2....
0>82% Done...
0>84% Done...
0>85% Done...
0>86% Done...
0>88% Done...
0>89% Done...
0>90% Done...
0>92% Done...
0>93% Done...
0>95% Done...
0>96% Done...
0>97% Done...
0>99% Done...
0>Test 1073741824 bytes on bank 2....
0>0% Done...
0>2% Done...
0>3% Done...
0>4% Done...
0>6% Done...
0>7% Done...
0>9% Done...
0>10% Done...
0>11% Done...
0>13% Done...
0>14% Done...
0>15% Done...
0>17% Done...
0>18% Done...
0>20% Done...
0>21% Done...
0>22% Done...
0>24% Done...
0>25% Done...
0>27% Done...
0>28% Done...
0>29% Done...
0>31% Done...
0>32% Done...
0>34% Done...
0>35% Done...
0>36% Done...
0>38% Done...
0>39% Done...
0>40% Done...
0>42% Done...
0>43% Done...
0>45% Done...
0>46% Done...
0>47% Done...
0>49% Done...
0>50% Done...
0>52% Done...
0>53% Done...
0>54% Done...
0>56% Done...
0>57% Done...
0>59% Done...
0>60% Done...
0>61% Done...
0>63% Done...
0>64% Done...
0>65% Done...
0>67% Done...
0>68% Done...
0>70% Done...
0>71% Done...
0>72% Done...
1>Test 1073741824 bytes on bank 3....
2>Test 1073741824 bytes on bank 3....
3>Test 1073741824 bytes on bank 3....
0>74% Done...
0>75% Done...
0>77% Done...
0>78% Done...
0>79% Done...
0>81% Done...
0>82% Done...
0>84% Done...
0>85% Done...
0>86% Done...
0>88% Done...
0>89% Done...
0>90% Done...
0>92% Done...
0>93% Done...
0>95% Done...
0>96% Done...
0>97% Done...
0>99% Done...
0>Test 1073741824 bytes on bank 3....
0>0% Done...
0>2% Done...
0>3% Done...
0>4% Done...
0>6% Done...
0>7% Done...
0>9% Done...
0>10% Done...
0>11% Done...
0>13% Done...
0>14% Done...
0>15% Done...
0>17% Done...
0>18% Done...
0>20% Done...
0>21% Done...
0>22% Done...
0>24% Done...
0>25% Done...
0>27% Done...
0>28% Done...
0>29% Done...
0>31% Done...
0>32% Done...
0>34% Done...
0>35% Done...
0>36% Done...
0>38% Done...
0>39% Done...
0>40% Done...
0>42% Done...
0>43% Done...
0>45% Done...
0>46% Done...
0>47% Done...
0>49% Done...
0>50% Done...
0>52% Done...
0>53% Done...
0>54% Done...
0>56% Done...
0>57% Done...
0>59% Done...
0>60% Done...
0>61% Done...
0>63% Done...
0>64% Done...
0>65% Done...
0>67% Done...
0>68% Done...
0>70% Done...
0>71% Done...
0>72% Done...
0>74% Done...
0>75% Done...
0>77% Done...
0>78% Done...
0>79% Done...
0>81% Done...
0>82% Done...
0>84% Done...
0>85% Done...
0>86% Done...
0>88% Done...
0>89% Done...
0>90% Done...
0>92% Done...
0>93% Done...
0>95% Done...
0>96% Done...
0>97% Done...
0>99% Done...
0>INFO:
0> POST Passed all devices.
0>
0>POST: Return to OBP.

SC Alert: Host System has Reset

@(#)OBP 4.13.0 2004/01/19 18:28 Sun Fire V440,Netra 440
Clearing TLBs
POST Results: Cpu 0000.0000.0000.0003
%o0 = 0000.0000.0000.0000 %o1 = ffff.ffff.f00a.3f61 %o2 = ffff.ffff.ffff.ffff
POST Results: Cpu 0000.0000.0000.0002
%o0 = 0000.0000.0000.0000 %o1 = ffff.ffff.f00a.3f61 %o2 = ffff.ffff.ffff.ffff
POST Results: Cpu 0000.0000.0000.0001
%o0 = 0000.0000.0000.0000 %o1 = ffff.ffff.f00a.3f61 %o2 = ffff.ffff.ffff.ffff
POST Results: Cpu 0000.0000.0000.0000
%o0 = 0000.0000.0000.0000 %o1 = ffff.ffff.f00a.3f61 %o2 = ffff.ffff.ffff.ffff
Membase: 0000.0000.0000.0000
MemSize: 0000.0000.0004.0000
Init CPU arrays Done
Init E$ tags Done
Setup TLB (small-footprint mode) Done
MMUs ON
Scrubbing Tomatillo tags... 0 1
Find dropin, Copying Done, Size 0000.0000.0000.65f0
PC = 0000.07ff.f000.5400
PC = 0000.0000.0000.54f8
Find dropin, Copying Done, Size 0000.0000.0001.0e70
ttya initialized
CPU 0 Speed: 1281 Mhz, ratio 7:1 , ECCR: f00c00
CPU 1 Speed: 1281 Mhz, ratio 7:1 , ECCR: f00c00
CPU 2 Speed: 1281 Mhz, ratio 7:1 , ECCR: f00c00
CPU 3 Speed: 1281 Mhz, ratio 7:1 , ECCR: f00c00

CPU 0 Memory Configuration: Valid
CPU 1 Memory Configuration: Valid
CPU 2 Memory Configuration: Valid
CPU 3 Memory Configuration: Valid
CPU 0 Bank 0 1024 MB Bank 1 1024 MB Bank 2 1024 MB Bank 3 1024 MB
CPU 1 Bank 0 1024 MB Bank 1 1024 MB Bank 2 1024 MB Bank 3 1024 MB
CPU 2 Bank 0 1024 MB Bank 1 1024 MB Bank 2 1024 MB Bank 3 1024 MB
CPU 3 Bank 0 1024 MB Bank 1 1024 MB Bank 2 1024 MB Bank 3 1024 MB
Master CPU 3 Membase: 3300000000 Memsize: 40000000

@(#)OBP 4.13.0 2004/01/19 18:28 Sun Fire V440,Netra 440
Clearing TLBs
Loading Configuration
Membase: 0000.0033.0000.0000
MemSize: 0000.0000.4000.0000
Init CPU arrays Done
Init E$ tags Done
Setup TLB Done
MMUs ON
Scrubbing Tomatillo tags... 0 1
Block Scrubbing Done
Find dropin, Copying Done, Size 0000.0000.0000.65f0
PC = 0000.07ff.f000.5400
PC = 0000.0000.0000.54f8
Find dropin, (copied), Decompressing Done, Size 0000.0000.0006.60c0
ttya initialized
System Reset: CPU Reset (SPOR)
JBUS-PCI bridge
JBUS-PCI bridge
Probing jbus at 0,0 SUNW,UltraSPARC-IIIi (1281 MHz @ 7:1, 1 MB)
memory-controller
Probing jbus at 1,0 SUNW,UltraSPARC-IIIi (1281 MHz @ 7:1, 1 MB)
memory-controller
Probing jbus at 2,0 SUNW,UltraSPARC-IIIi (1281 MHz @ 7:1, 1 MB)
memory-controller
Probing jbus at 3,0 SUNW,UltraSPARC-IIIi (1281 MHz @ 7:1, 1 MB)
memory-controller
Probing jbus at 1c,0 pci ppm
Probing jbus at 1d,0 pci
Probing jbus at 1e,0 pci ppm
Probing jbus at 1f,0 pci i2c nvram idprom
Loading Support Packages: kbd-translator obp-tftp SUNW,i2c-ram-device
SUNW,fru-device
Loading onboard drivers:
Probing /pci@1e,600000 Device 7 isa flashprom rtc i2c i2c-bridge
i2c-bridge temperature gpio gpio gpio gpio hardware-monitor
temperature temperature temperature temperature-sensor
motherboard-fru-prom power-supply-fru-prom rmc-fru-prom
scsi-fru-prom power-supply-fru-prom dimm-spd dimm-spd dimm-spd
dimm-spd cpu-fru-prom dimm-spd dimm-spd dimm-spd dimm-spd
cpu-fru-prom dimm-spd dimm-spd dimm-spd dimm-spd cpu-fru-prom
dimm-spd dimm-spd dimm-spd dimm-spd cpu-fru-prom clock-generator
power serial serial serial rmc-comm
Initializing temperature shutdown thresholds for CPUs
CPU 0 Bank 0 base 0 size 1024 MB
CPU 0 Bank 1 base 100000000 size 1024 MB
CPU 0 Bank 2 base 200000000 size 1024 MB
CPU 0 Bank 3 base 300000000 size 1024 MB
CPU 1 Bank 0 base 1000000000 size 1024 MB
CPU 1 Bank 1 base 1100000000 size 1024 MB
CPU 1 Bank 2 base 1200000000 size 1024 MB
CPU 1 Bank 3 base 1300000000 size 1024 MB
CPU 2 Bank 0 base 2000000000 size 1024 MB
CPU 2 Bank 1 base 2100000000 size 1024 MB
CPU 2 Bank 2 base 2200000000 size 1024 MB
CPU 2 Bank 3 base 2300000000 size 1024 MB
CPU 3 Bank 0 base 3000000000 size 1024 MB
CPU 3 Bank 1 base 3100000000 size 1024 MB
CPU 3 Bank 2 base 3200000000 size 1024 MB
CPU 3 Bank 3 base 3300000000 size 1024 MB
Probing /pci@1e,600000 Device 2 Nothing there
Probing /pci@1e,600000 Device 3 Nothing there
Probing /pci@1e,600000 Device 4 Nothing there
Probing /pci@1e,600000 Device 6 pmu gpio
Probing /pci@1e,600000 Device a usb
Probing /pci@1e,600000 Device b usb
Probing /pci@1e,600000 Device d ide disk cdrom
Probing /pci@1f,700000 Device 1 network
Probing /pci@1f,700000 Device 2 scsi disk tape scsi disk tape
Probing /pci@1c,600000 Device 1 Nothing there
Probing /pci@1c,600000 Device 2 network
Probing /pci@1d,700000 Device 1 Nothing there
Probing /pci@1d,700000 Device 2 Nothing there
screen not found.
keyboard not found.
Keyboard not present. Using ttya for input and output.
System Reset: CPU Reset (SPOR)
JBUS-PCI bridge
JBUS-PCI bridge
Probing jbus at 0,0 SUNW,UltraSPARC-IIIi (1281 MHz @ 7:1, 1 MB)
memory-controller
Probing jbus at 1,0 SUNW,UltraSPARC-IIIi (1281 MHz @ 7:1, 1 MB)
memory-controller
Probing jbus at 2,0 SUNW,UltraSPARC-IIIi (1281 MHz @ 7:1, 1 MB)
memory-controller
Probing jbus at 3,0 SUNW,UltraSPARC-IIIi (1281 MHz @ 7:1, 1 MB)
memory-controller
Probing jbus at 1c,0 pci ppm
Probing jbus at 1d,0 pci
Probing jbus at 1e,0 pci ppm
Probing jbus at 1f,0 pci i2c nvram idprom
Loading Support Packages: kbd-translator obp-tftp SUNW,i2c-ram-device
SUNW,fru-device
Loading onboard drivers:
Probing /pci@1e,600000 Device 7 isa flashprom rtc i2c i2c-bridge
i2c-bridge temperature gpio gpio gpio gpio hardware-monitor
temperature temperature temperature temperature-sensor
motherboard-fru-prom power-supply-fru-prom rmc-fru-prom
scsi-fru-prom power-supply-fru-prom dimm-spd dimm-spd dimm-spd
dimm-spd cpu-fru-prom dimm-spd dimm-spd dimm-spd dimm-spd
cpu-fru-prom dimm-spd dimm-spd dimm-spd dimm-spd cpu-fru-prom
dimm-spd dimm-spd dimm-spd dimm-spd cpu-fru-prom clock-generator
power serial serial serial rmc-comm
Initializing temperature shutdown thresholds for CPUs
CPU 0 Bank 0 base 0 size 1024 MB
CPU 0 Bank 1 base 100000000 size 1024 MB
CPU 0 Bank 2 base 200000000 size 1024 MB
CPU 0 Bank 3 base 300000000 size 1024 MB
CPU 1 Bank 0 base 1000000000 size 1024 MB
CPU 1 Bank 1 base 1100000000 size 1024 MB
CPU 1 Bank 2 base 1200000000 size 1024 MB
CPU 1 Bank 3 base 1300000000 size 1024 MB
CPU 2 Bank 0 base 2000000000 size 1024 MB
CPU 2 Bank 1 base 2100000000 size 1024 MB
CPU 2 Bank 2 base 2200000000 size 1024 MB
CPU 2 Bank 3 base 2300000000 size 1024 MB
CPU 3 Bank 0 base 3000000000 size 1024 MB
CPU 3 Bank 1 base 3100000000 size 1024 MB
CPU 3 Bank 2 base 3200000000 size 1024 MB
CPU 3 Bank 3 base 3300000000 size 1024 MB
Probing /pci@1e,600000 Device 2 Nothing there
Probing /pci@1e,600000 Device 3 Nothing there
Probing /pci@1e,600000 Device 4 Nothing there
Probing /pci@1e,600000 Device 6 pmu gpio
Probing /pci@1e,600000 Device a usb
Probing /pci@1e,600000 Device b usb
Probing /pci@1e,600000 Device d ide disk cdrom
Probing /pci@1f,700000 Device 1 network
Probing /pci@1f,700000 Device 2 scsi disk tape scsi disk tape
Probing /pci@1c,600000 Device 1 Nothing there
Probing /pci@1c,600000 Device 2 network
Probing /pci@1d,700000 Device 1 Nothing there
Probing /pci@1d,700000 Device 2 Nothing there

Sun Fire V440, No Keyboard
Copyright 1998-2004 Sun Microsystems, Inc. All rights reserved.
OpenBoot 4.13.0, 16384 MB memory installed, Serial #61229739.
Ethernet address 0:3:ba:a6:4a:ab, Host ID: 83a64aab.

Running diagnostic script obdiag/normal

Testing /pci@1f,700000/network@1
Testing /pci@1e,600000/ide@d
Testing /pci@1e,600000/isa@7/flashprom@2,0
Testing /pci@1e,600000/isa@7/serial@0,2e8
Testing /pci@1e,600000/isa@7/serial@0,3f8
Testing /pci@1e,600000/isa@7/rtc@0,70
Testing /pci@1e,600000/isa@7/i2c@0,320:tests={gpio@0.42,gpio@0.44,gpio@0.46,gpio@0.48}
Testing /pci@1e,600000/isa@7/i2c@0,320:tests={hardware-monitor@0.5c}
Testing /pci@1e,600000/isa@7/i2c@0,320:tests={temperature-sensor@0.9c}
Testing /pci@1c,600000/network@2
Testing /pci@1f,700000/scsi@2,1
Testing /pci@1f,700000/scsi@2

Initializing 1MB of memory at addr 333ff14000 -

Initializing 1MB of memory at addr 333fee0000 -

Initializing 13MB of memory at addr 333f000000 --

Initializing 1008MB of memory at addr 3300000000 -

Initializing 1024MB of memory at addr 3200000000 /-

Initializing 1024MB of memory at addr 3100000000 /-

Initializing 1024MB of memory at addr 3000000000 /-

Initializing 1024MB of memory at addr 2300000000 /-

Initializing 1024MB of memory at addr 2200000000 /-

Initializing 1024MB of memory at addr 2100000000 /-

Initializing 1024MB of memory at addr 2000000000 /-

Initializing 1024MB of memory at addr 1300000000 /-

Initializing 1024MB of memory at addr 1200000000 /-

Initializing 1024MB of memory at addr 1100000000 /-

Initializing 1024MB of memory at addr 1000000000 /-

Initializing 1024MB of memory at addr 300000000 /-

Initializing 1024MB of memory at addr 200000000 /-

Initializing 1024MB of memory at addr 100000000 /-

Initializing 1024MB of memory at addr 0 /-

{3} ok
{3} ok
{3} ok show-post-results
Power On Selftest Passed

I have not been able to pinpoint any hardware problem so far. Can anyone guide me as to where the problem might be?

I also would like to share my point of view is that I feel that it's a problem with CPU no. 3. I concluded on the basis that if we search "CPU 0000.0000.0000.0003" keyword on the text which I wrote till now, we will find it twice, one while getting error before system get reset and second we will find it in the diagonstic test output about which I paste it here. Please do let me know whether I am wrong or right.
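
To make that comparison less error-prone than searching by eye, here is a minimal sketch that pulls the CPU field out of a reset line and prints it as a decimal number. The function name `cpu_from_reset_line` is made up for illustration, and the field layout is assumed from the single sample line in this thread. A matching number on its own does not establish that the CPU is faulty.

```shell
# Hypothetical helper: extract the CPU id from a "Fatal Error Reset" console
# line such as "CPU 0000.0000.0000.0003 AFSR 0100.0000.0000.0000 SCE".
# The layout (the word CPU, then a dotted hex value) is an assumption based
# on the one sample posted above.
cpu_from_reset_line() {
    # Keep only the dotted hex value after "CPU ", then drop the dots.
    hex=$(printf '%s\n' "$1" | sed -n 's/^CPU \([0-9a-fA-F.]*\) .*/\1/p' | tr -d '.')
    # No match means this was not a line of the expected shape.
    [ -n "$hex" ] || return 1
    # Print the hex value as a decimal number.
    printf '%d\n' "0x$hex"
}
```

Called with the line from the first post, `cpu_from_reset_line 'CPU 0000.0000.0000.0003 AFSR 0100.0000.0000.0000 SCE'` prints 3.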

Regards

Have you tried running set-defaults on the OBP PROM?

First of all, thanks for replying. I have done that now. If you don't mind, could you tell me the reason for doing this? I am fairly sure that it's a hardware problem, as the server sometimes also resets while booting from the CD-ROM.

Before you ran set-defaults, did you copy out the printenv output?
Are there any amber LEDs on the system, or any faults shown in the prtdiag -v output?
grep -i warn /var/adm/messages
Are there any files dumped in /var/crash/`hostname`?
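
The last two checks above can be wrapped in one small function so they are easy to repeat after each reset. This is only a sketch: the name `first_pass_checks` is made up, the default paths are the ones mentioned in this thread, and it assumes a POSIX shell such as ksh or bash (the legacy /bin/sh on older Solaris releases may not accept the `$(...)` syntax).

```shell
# Hypothetical helper: list warnings in the messages file and any crash dumps.
# $1: messages file (defaults to /var/adm/messages)
# $2: crash directory (defaults to /var/crash/<hostname>)
first_pass_checks() {
    messages=${1:-/var/adm/messages}
    crash_dir=${2:-/var/crash/$(hostname)}

    echo "== warnings in $messages =="
    if [ -r "$messages" ]; then
        # Case-insensitive match, same as the grep suggested in the thread.
        grep -i warn "$messages" || echo "(none)"
    else
        echo "(cannot read $messages)"
    fi

    echo "== files in $crash_dir =="
    # ls -A prints nothing for an empty directory.
    if [ -d "$crash_dir" ] && [ -n "$(ls -A "$crash_dir" 2>/dev/null)" ]; then
        ls -l "$crash_dir"
    else
        echo "(no crash dumps found)"
    fi
}
```

Running `first_pass_checks` with no arguments uses the default paths. The printenv and prtdiag -v checks are left out on purpose, since their output needs to be read by a person.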

When I run into a hardware problem that I've not experienced before, I generally run SunVTS, then Explorer, then check SunSolve and Google. If the host is still under a service contract, I call Sun.

Install VTS and see what it says. Be prepared to let it run for a few hours; it will impact system performance, so don't serve anything during testing. Don't be surprised if it doesn't find a problem: it can run for a couple of days before it hits anything.

If you have the ability to monitor the power coming into the host (something that shows spikes, +/-, in power), do so: power fluctuation is a likely cause of this sort of error.

Then, on a separate host:

  • start a terminal session.
  • type: script /somewhere/date.problem_hostname.capture
  • telnet into the console on the problem host

Ideally, you'll write a small shell script to run prtdiag -v and dmesg every 3 minutes or so. If you have utilities that you like, include them in the script. The next time the server crashes, let it complete its power cycle and then see if any new and interesting errors arise. Compare your prtdiag outputs over the course of the hours prior to the crash. See if there are drastic changes in temperature, etc.
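
A minimal sketch of such a script is below. The function names `snapshot` and `monitor` and the output directory in the usage example are made up for illustration, and it assumes a POSIX shell such as ksh or bash. Each snapshot goes to its own timestamped file so that two of them can be compared with diff after a crash.

```shell
# Hypothetical helper: run each command given after the output directory and
# save the combined output to one timestamped file. Prints the file name.
snapshot() {
    out_dir=$1; shift
    mkdir -p "$out_dir" || return 1
    file="$out_dir/snap.$(date '+%Y%m%d-%H%M%S')"
    for cmd in "$@"; do
        # Label each section so the outputs are easy to tell apart.
        echo "### $cmd"
        sh -c "$cmd" 2>&1
    done > "$file"
    echo "$file"
}

# Hypothetical helper: take a snapshot every <interval> seconds until killed.
# $1: output directory, $2: interval in seconds, rest: commands to run.
monitor() {
    out_dir=$1; interval=$2; shift 2
    while :; do
        snapshot "$out_dir" "$@" >/dev/null
        sleep "$interval"
    done
}
```

For the 3-minute interval suggested above, something like `monitor /var/tmp/hwwatch 180 'prtdiag -v' dmesg` would do; use the full path to prtdiag if it is not on your PATH. Afterwards, `diff` on two of the snap files shows what changed between them.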

Good luck!

First of all, thanks for replying. Incredible: I did everything you told me to, but didn't find anything suspicious. Avronius: I am planning to install SunVTS on my system. I hope it will help me find the real problem in my hardware.

thanks

Does your prtdiag -v show any power supply status? If yes, then my following advice does not apply.
If not, then I suspect an intermittent PSU failure. It will not necessarily show up in the messages file; I have "seen" it act as a "silent killer", usually with no clues at all. From experience, I guess it could be OK after a PSU replacement, but if I turn out to be wrong, then I'm sorry. :o

Thanks for the reply, Incredible. I am pasting the output of 'prtdiag -v' below, so please guide me.

System Configuration: Sun Microsystems sun4u Sun Fire V440
System clock frequency: 183 MHZ
Memory size: 16GB

==================================== CPUs ====================================
               E$          CPU                    CPU
CPU  Freq      Size        Implementation         Mask   Status   Location
---  --------  ----------  ---------------------  -----  -------  --------
  0  1281 MHz  1MB         SUNW,UltraSPARC-IIIi   2.4    on-line  -
  1  1281 MHz  1MB         SUNW,UltraSPARC-IIIi   2.4    on-line  -
  2  1281 MHz  1MB         SUNW,UltraSPARC-IIIi   2.4    on-line  -
  3  1281 MHz  1MB         SUNW,UltraSPARC-IIIi   2.4    on-line  -

================================= IO Devices =================================
Bus Freq Slot + Name +
Type MHz Status Path Model
------ ---- ---------- ---------------------------- --------------------
pci 66 MB pci108e,abba (network) SUNW,pci-ce
okay /pci@1c,600000/network@2

pci 33 MB isa/su (serial)
okay /pci@1e,600000/isa@7/serial@0,3f8

pci 33 MB isa/su (serial)
okay /pci@1e,600000/isa@7/serial@0,2e8

pci 33 MB isa/rmc-comm-rmc_comm (seria+
okay /pci@1e,600000/isa@7/rmc-comm@0,3e8

pci 33 MB pci10b9,5229 (ide)
okay /pci@1e,600000/ide@d

pci 66 MB pci108e,abba (network) SUNW,pci-ce
okay /pci@1f,700000/network@1

pci 66 MB scsi-pci1000,30 (scsi-2) LSI,1030
okay /pci@1f,700000/scsi@2

pci 66 MB scsi-pci1000,30 (scsi-2) LSI,1030
okay /pci@1f,700000/scsi@2,1

============================ Memory Configuration ============================
Segment Table:
-----------------------------------------------------------------------
Base Address Size Interleave Factor Contains
-----------------------------------------------------------------------
0x0 4GB 16 BankIDs 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
0x1000000000 4GB 16 BankIDs 16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31
0x2000000000 4GB 16 BankIDs 32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47
0x3000000000 4GB 16 BankIDs 48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63

Bank Table:
-----------------------------------------------------------
Physical Location
ID ControllerID GroupID Size Interleave Way
-----------------------------------------------------------
0 0 0 256MB 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
1 0 0 256MB
2 0 1 256MB
3 0 1 256MB
4 0 0 256MB
5 0 0 256MB
6 0 1 256MB
7 0 1 256MB
8 0 1 256MB
9 0 1 256MB
10 0 0 256MB
11 0 0 256MB
12 0 1 256MB
13 0 1 256MB
14 0 0 256MB
15 0 0 256MB
16 1 0 256MB 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
17 1 0 256MB
18 1 1 256MB
19 1 1 256MB
20 1 0 256MB
21 1 0 256MB
22 1 1 256MB
23 1 1 256MB
24 1 1 256MB
25 1 1 256MB
26 1 0 256MB
27 1 0 256MB
28 1 1 256MB
29 1 1 256MB
30 1 0 256MB
31 1 0 256MB
32 2 0 256MB 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
33 2 0 256MB
34 2 1 256MB
35 2 1 256MB
36 2 0 256MB
37 2 0 256MB
38 2 1 256MB
39 2 1 256MB
40 2 1 256MB
41 2 1 256MB
42 2 0 256MB
43 2 0 256MB
44 2 1 256MB
45 2 1 256MB
46 2 0 256MB
47 2 0 256MB
48 3 0 256MB 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
49 3 0 256MB
50 3 1 256MB
51 3 1 256MB
52 3 0 256MB
53 3 0 256MB
54 3 1 256MB
55 3 1 256MB
56 3 1 256MB
57 3 1 256MB
58 3 0 256MB
59 3 0 256MB
60 3 1 256MB
61 3 1 256MB
62 3 0 256MB
63 3 0 256MB

Memory Module Groups:
--------------------------------------------------
ControllerID GroupID Labels Status
--------------------------------------------------
0 0 C0/P0/B0/D0
0 0 C0/P0/B0/D1
0 1 C0/P0/B1/D0
0 1 C0/P0/B1/D1
1 0 C1/P0/B0/D0
1 0 C1/P0/B0/D1
1 1 C1/P0/B1/D0
1 1 C1/P0/B1/D1
2 0 C2/P0/B0/D0
2 0 C2/P0/B0/D1
2 1 C2/P0/B1/D0
2 1 C2/P0/B1/D1
3 0 C3/P0/B0/D0
3 0 C3/P0/B0/D1
3 1 C3/P0/B1/D0
3 1 C3/P0/B1/D1

============================ Environmental Status ============================
Fan Status:
-------------------------------------------
Location Sensor Status
-------------------------------------------
FT0/F0 TACH okay
FT1/F0 TACH okay
FT1/F1 TACH okay
PS0 FF_PDCT_FAN okay
PS1 FF_PDCT_FAN okay

Temperature sensors:
-----------------------------------------
Location Sensor Status
-----------------------------------------
C0/P0 T_CORE okay
C1/P0 T_CORE okay
C2/P0 T_CORE okay
C3/P0 T_CORE okay
C0 T_AMB okay
C1 T_AMB okay
C2 T_AMB okay
C3 T_AMB okay
SCSIBP T_AMB okay
MB T_AMB okay
------------------------------------
Current sensors:
----------------------------------------
Location Sensor Status
----------------------------------------
MB FF_SCSIA okay
MB FF_SCSIB okay
MB FF_POK okay
C0/P0 FF_POK okay
C1/P0 FF_POK okay
C2/P0 FF_POK okay
C3/P0 FF_POK okay
------------------------------------
Voltage sensors:
-----------------------------------
Location Sensor Status
-----------------------------------
MB V_+1V5 okay
MB V_VCCTM okay
MB V_NET0_1V2D okay
MB V_NET1_1V2D okay
MB V_NET0_1V2A okay
MB V_NET1_1V2A okay
MB V_+3V3 okay
MB V_+3V3STBY okay
MB/BAT V_BAT okay
MB V_SCSI_CORE okay
MB V_+5V okay
MB V_+12V okay
MB V_-12V okay
PS0 P_PWR okay
PS0 FF_POK okay
PS1 P_PWR okay
PS1 FF_POK okay
-----------------------------------------
Keyswitch:
-----------------------------------------
Location Keyswitch State
-----------------------------------------
SYS SYSCTRL NORMAL
--------------------------------------------------
Led State:
--------------------------------------------------------------
Location Led State Color
--------------------------------------------------------------
SYS ACT on green
SYS SERVICE off amber
SYS LOCATE off white
PS0 POK on green
PS0 STBY on green
PS0 SERVICE off amber
PS0 OK2RM off blue
PS1 POK on green
PS1 STBY on green
PS1 SERVICE off amber
PS1 OK2RM off blue
HDD0 SERVICE off amber
HDD0 OK2RM off blue
HDD1 SERVICE off amber
HDD1 OK2RM off blue
HDD2 SERVICE off amber
HDD2 OK2RM off blue
HDD3 SERVICE off amber
HDD3 OK2RM off blue

=========================== FRU Operational Status ===========================
---------------------------------
Fru Operational Status:
---------------------------------
Location Status
---------------------------------
SC okay
HDD0 present
HDD1 present
HDD2 present
PS0 okay
PS1 okay

================================ HW Revisions ================================
ASIC Revisions:
-------------------------------------------------------------------
Path Device Status Revision
-------------------------------------------------------------------
/pci@1c,600000 pci108e,a801 okay 4
/pci@1d,700000 pci108e,a801 okay 4
/pci@1e,600000 pci108e,a801 okay 4
/pci@1f,700000 pci108e,a801 okay 4

System PROM revisions:
----------------------
OBP 4.13.0 2004/01/19 18:28 Sun Fire V440,Netra 440
OBDIAG 4.13.0 2004/01/19 18:30

Regards

Your prtdiag looks OK. Try updating your firmware from 4.13.0 to 4.22.x (121685 OBP).
Download the firmware from the SunSolve website and follow the instructions in the README file. Let's first monitor whether this improves the situation. :o