e2900 hardware issue?

Hi,

I am fairly new to Solaris. I have an E2900 server which gives the following output on the console:

Fri Sep 23 08:20:57 noname.example.com lom: [ID 658854 local0.error] /N0/SB0 : Failed AR interconnect test. Status = 00060040
Fri Sep 23 08:21:00 noname.example.com lom: [ID 221660 local0.error] AR Interconnect test: System board SB0/ar0 connection to system board /N0/IB6 failed
Fri Sep 23 08:21:03 noname.example.com lom: [ID 443716 local0.error] SB0/ar0 Bit in error L1_L1_CMDIN [6]
Fri Sep 23 08:21:03 noname.example.com lom: [ID 221660 local0.error] AR Interconnect test: System board SB0/ar0 connection to system board /N0/IB6 failed
Fri Sep 23 08:21:03 noname.example.com lom: [ID 634901 local0.error] SB0/ar0 Bit in error L1_L1_CMDOUT [6]
Fri Sep 23 08:21:04 noname.example.com lom: [ID 955567 local0.error] DX Interconnect test: System board SB0/dx0 Dx-AR pause line connection to system board(s) /N0/RP0 failed
Fri Sep 23 08:21:04 noname.example.com lom: [ID 728037 local0.error] SB0/dx0 Bit in error Global_Oring_Out_B [6]
Fri Sep 23 08:21:04 noname.example.com lom: [ID 944370 local0.error] DX Interconnect test: System board SB0/dx0 Dx-AR pause line connection to system board(s) /N0/IB6 failed
Fri Sep 23 08:21:04 noname.example.com lom: [ID 203749 local0.error] SB0/dx0 Bit in error Global_Oring_Out_B [2]
Fri Sep 23 08:21:04 noname.example.com lom: [ID 815479 local0.error] /N0/IB6 : Failed AR interconnect test. Status = 3000007f
Fri Sep 23 08:21:04 noname.example.com lom: [ID 106121 local0.error] AR Interconnect test: System board IB6/ar0 address repeater connections to system board RP0/ar0 failed
Fri Sep 23 08:21:04 noname.example.com lom: [ID 429614 local0.error] IB6/ar0 Bit in error L2_ADDR [39]
Fri Sep 23 08:21:04 noname.example.com lom: [ID 167470 local0.error] IB6/ar0 Bit in error L2_ADDR [37]
Fri Sep 23 08:21:04 noname.example.com lom: [ID 936379 local0.error] IB6/ar0 Bit in error L2_ADDR [36]
Fri Sep 23 08:21:05 noname.example.com lom: [ID 805307 local0.error] IB6/ar0 Bit in error L2_ADDR [35]
Fri Sep 23 08:21:05 noname.example.com lom: [ID 674235 local0.error] IB6/ar0 Bit in error L2_ADDR [34]
Fri Sep 23 08:21:05 noname.example.com lom: [ID 106121 local0.error] AR Interconnect test: System board IB6/ar0 address repeater connections to system board RP0/ar0 failed

Any ideas what may be wrong? I suspect it's the I/O assembly board IB6, but another opinion would be very welcome.

Thanks in advance

Steve

It should/could be the IB or IB_SSC. You should open a call with Oracle to investigate the problem...

Problem is, I do not have Oracle support! Oh well, thanks for your help.

Hi steve_b72, if you have made configuration changes or firmware updates, perhaps you have a configuration problem, but I'm afraid that you have a machine with a hardware error.

The errors shown all relate to either SB0 or IB6. What is your system configuration?
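
If you want to check that at a glance on a longer console capture, here is a rough sketch (not something anyone in this thread ran). It assumes the log format seen in this thread, where each "Bit in error" record names its reporting component right after the `local0.error]` tag. The function name `count_reporters` is made up for illustration.

```python
import re
from collections import Counter

# Assumption: a "Bit in error" record looks like
#   "... local0.error] SB0/ar0 Bit in error ..."
# and the "SB0/ar0 Bit" part always sits on the first physical line,
# so 80-column wrapping of the console output does not matter here.
REPORTER = re.compile(r"local0\.error\] (\w+)/(\w+) Bit")

def count_reporters(console_lines):
    """Count 'Bit in error' records per reporting board (SB0, IB6, ...)."""
    counts = Counter()
    for line in console_lines:
        match = REPORTER.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# A few lines copied from the console output in this thread.
sample = [
    "Fri Sep 23 08:21:03 noname.example.com lom: [ID 443716 local0.error] SB0/ar0 Bit",
    "in error L1_L1_CMDIN [6]",
    "Fri Sep 23 08:21:04 noname.example.com lom: [ID 728037 local0.error] SB0/dx0 Bit",
    "in error Global_Oring_Out_B [6]",
    "Fri Sep 23 08:21:04 noname.example.com lom: [ID 429614 local0.error] IB6/ar0 Bit",
    "in error L2_ADDR [39]",
]

for board, hits in count_reporters(sample).most_common():
    print(board, hits)
```

Feeding it the whole capture shows which boards report bit errors and how often. It only tells you where the errors are reported, not which side of the connection is actually at fault.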

The system's built-in diagnostics depend heavily on the firmware version: the newer, the better. So what FW version are you running? What do the following commands display (run on the LOM / system controller)?
showfault
showcomponent -v sb0
showcomponent -v ib6
showfru
showplatform
showplatform -v
showboards -p boards -v
showboards -p io -v

Do your boards all have the same firmware on them?
showboards -p version -v

You can try the following:
Power off the system.
Carefully re-seat all the components. USE ANTI-STATIC PROTECTION.
Power on again.

Hi,

Thanks everyone for your help. On removing IB_6 I found some seriously bent pins. I need a new IB_6 card and the base unit replaced. Ouch, that's gonna cost! Thanks again for your help.

Steve

If this is just a test server or a non-production box, I know it's not exactly best practice, but can you get something and straighten the pins out to make it work? Nothing to lose by trying.

Only guessing this is non-production as you don't have support for it.

Yep, this is just a test server. I have tried straightening the pins... let's see, I'll post the results.
Whenever I close my eyes all I see are pins, pins and more pins... sad, I know.

---------- Post updated at 07:44 AM ---------- Previous update was at 07:40 AM ----------

Hi,

It seems to get further, but I still have issues:

Mon Sep 26 05:29:30 noname.example.com lom: [ID 106121 local0.error] AR Interconnect test: System board IB6/ar0 address repeater connections to system board RP0/ar0 failed
Mon Sep 26 05:29:30 noname.example.com lom: [ID 565544 local0.error] IB6/ar0 Bit in error L2_INCOMING [1]
Mon Sep 26 05:29:30 noname.example.com lom: [ID 469926 local0.error] IB6/ar0 Bit in error L2_PREREQ [1]
Mon Sep 26 05:29:30 noname.example.com lom: [ID 363931 local0.error] IB6/ar0 Bit in error L2_MASK [9]
Mon Sep 26 05:29:30 noname.example.com lom: [ID 232859 local0.error] IB6/ar0 Bit in error L2_MASK [8]
Mon Sep 26 05:29:31 noname.example.com lom: [ID 101787 local0.error] IB6/ar0 Bit in error L2_MASK [7]
Mon Sep 26 05:29:31 noname.example.com lom: [ID 870696 local0.error] IB6/ar0 Bit in error L2_MASK [6]
Mon Sep 26 05:29:31 noname.example.com lom: [ID 739624 local0.error] IB6/ar0 Bit in error L2_MASK [5]
Mon Sep 26 05:29:31 noname.example.com lom: [ID 608552 local0.error] IB6/ar0 Bit in error L2_MASK [4]
Mon Sep 26 05:29:31 noname.example.com lom: [ID 477480 local0.error] IB6/ar0 Bit in error L2_MASK [3]
Mon Sep 26 05:29:31 noname.example.com lom: [ID 346408 local0.error] IB6/ar0 Bit in error L2_MASK [2]
Mon Sep 26 05:29:31 noname.example.com lom: [ID 215336 local0.error] IB6/ar0 Bit in error L2_MASK [1]
Mon Sep 26 05:29:31 noname.example.com lom: [ID 984245 local0.error] IB6/ar0 Bit in error L2_MASK [0]
Mon Sep 26 05:29:31 noname.example.com lom: [ID 106121 local0.error] AR Interconnect test: System board IB6/ar0 address repeater connections to system board RP0/ar0 failed
Mon Sep 26 05:29:31 noname.example.com lom: [ID 855719 local0.error] IB6/ar0 Bit in error L2_CMD [1]
Mon Sep 26 05:29:31 noname.example.com lom: [ID 724647 local0.error] IB6/ar0 Bit in error L2_CMD [0]
Mon Sep 26 05:29:31 noname.example.com lom: [ID 908178 local0.error] IB6/ar0 Bit in error L2_ATRANSID [8]
Mon Sep 26 05:29:31 noname.example.com lom: [ID 777106 local0.error] IB6/ar0 Bit in error L2_ATRANSID [7]
Mon Sep 26 05:29:31 noname.example.com lom: [ID 646034 local0.error] IB6/ar0 Bit in error L2_ATRANSID [6]
Mon Sep 26 05:29:31 noname.example.com lom: [ID 514962 local0.error] IB6/ar0 Bit in error L2_ATRANSID [5]
Mon Sep 26 05:29:31 noname.example.com lom: [ID 383890 local0.error] IB6/ar0 Bit in error L2_ATRANSID [4]
Mon Sep 26 05:29:31 noname.example.com lom: [ID 252818 local0.error] IB6/ar0 Bit in error L2_ATRANSID [3]
Mon Sep 26 05:29:31 noname.example.com lom: [ID 121746 local0.error] IB6/ar0 Bit in error L2_ATRANSID [2]
Mon Sep 26 05:29:31 noname.example.com lom: [ID 890655 local0.error] IB6/ar0 Bit in error L2_ATRANSID [1]
Mon Sep 26 05:29:31 noname.example.com lom: [ID 759583 local0.error] IB6/ar0 Bit in error L2_ATRANSID [0]
Mon Sep 26 05:29:31 noname.example.com lom: [ID 980104 local0.error] DX Interconnect test: System board IB6/dx1 Dx-AR pause line connection to system board(s) /N0/RP0 failed
Mon Sep 26 05:29:31 noname.example.com lom: [ID 826185 local0.error] IB6/dx1 Bit in error Global_Oring_Out_B [6]
Mon Sep 26 05:29:31 noname.example.com lom: [ID 537567 local0.error] PCI I/O Board at /N0/IB6 has been removed from domain A due to a failure in interconnection test. Service action required.
Mon Sep 26 05:29:40 noname.example.com lom: [ID 991241 local0.error] No usable Io board in domain.
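
To compare runs like this one against the earlier output, it may help to collapse the "Bit in error" records into one line per signal. The following is only a sketch under assumptions: it assumes every record starts with a weekday and timestamp as in the output above, and that continuation lines are plain 80-column wraps. The names `unwrap` and `failing_bits` are made up for illustration.

```python
import re
from collections import defaultdict

# Assumption: each record begins with "Mon Sep 26 05:29:31 " style text.
RECORD_START = re.compile(
    r"^(Mon|Tue|Wed|Thu|Fri|Sat|Sun) \w{3} [ \d]\d \d\d:\d\d:\d\d ")
# \s* after "Bit" tolerates wraps where the leading space was lost.
BIT_ERROR = re.compile(r"(\w+/\w+) Bit\s*in error (\S+)\s+\[(\d+)\]")

def unwrap(console_lines):
    """Join wrapped continuation lines back onto their record."""
    records = []
    for line in console_lines:
        line = line.rstrip("\n")
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            records[-1] += line
    return records

def failing_bits(console_lines):
    """Map (component, signal) to the sorted list of failing bit numbers."""
    grouped = defaultdict(list)
    for record in unwrap(console_lines):
        match = BIT_ERROR.search(record)
        if match:
            component, signal, bit = match.groups()
            grouped[(component, signal)].append(int(bit))
    return {key: sorted(bits) for key, bits in grouped.items()}

# A few wrapped lines copied from the console output above.
sample = [
    "Mon Sep 26 05:29:31 noname.example.com lom: [ID 855719 local0.error] IB6/ar0 Bit",
    " in error L2_CMD [1]  ",
    "Mon Sep 26 05:29:31 noname.example.com lom: [ID 724647 local0.error] IB6/ar0 Bit",
    " in error L2_CMD [0]  ",
    "Mon Sep 26 05:29:31 noname.example.com lom: [ID 826185 local0.error] IB6/dx1 Bit",
    " in error Global_Oring_Out_B [6]  ",
]

for (component, signal), bits in failing_bits(sample).items():
    print(component, signal, bits)
```

Running it on the captures from before and after straightening the pins would show which signal groups changed. The summary says nothing about which physical pin a given bit corresponds to.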