Fibre Channel link not ready on Netra 240

Hi,
One of my Netra 240 servers hung and I had to reboot it. I powered it off and tried to boot it again, but without success. It is not connected to a SAN and has only local disks. It will not boot in failsafe mode either.
There are two 72 GB disks, mirrored with SVM. The server complains about a Fibre Channel failure, but what could have failed that affects both disks? This doesn't look like a disk issue. Being an old server, it is no longer under Oracle support. I can buy a replacement part if I know which part has failed.

{1} ok probe-scsi-all
/pci@1c,600000/scsi@2,1

/pci@1c,600000/scsi@2
Target 0
  Unit 0   Disk     SEAGATE ST373453LSUN72G 0449
Target 1
  Unit 0   Disk     SEAGATE ST373453LSUN72G 0449

/pci@1e,600000/SUNW,qlc@3
Fibre Channel link not ready - Loss of Sync
Fibre Channel link down

/pci@1e,600000/SUNW,qlc@2
Fibre Channel link not ready - Loss of Sync
Fibre Channel link down

{1} ok
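
For anyone comparing captures like the one above, here is a minimal Python sketch that groups `probe-scsi-all` output by controller path, so it is easy to see which controllers report disks and which only report status messages. The function name `parse_probe` and the assumption that prompt lines start with `{` (as in the `{1} ok` prompt here) are illustrative only; other OBP versions may format the output differently.

```python
import re

# A unit line, e.g. "Unit 0   Disk     SEAGATE ST373453LSUN72G 0449".
UNIT = re.compile(r"^Unit\s+(\d+)\s+(\S+)\s+(.*)$")
# A target line, e.g. "Target 0".
TARGET = re.compile(r"^Target\s+(\w+)$")
# A device-path header line, e.g. "/pci@1c,600000/scsi@2".
PATH = re.compile(r"^/\S+$")


def parse_probe(text):
    """Map each controller path to its units and any status messages."""
    report = {}
    path = None
    target = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and prompt lines (assumed to start with "{").
        if not line or line.startswith("{"):
            continue
        if PATH.match(line):
            path = line
            target = None
            report[path] = {"units": [], "messages": []}
            continue
        if path is None:
            continue
        m = TARGET.match(line)
        if m:
            target = m.group(1)
            continue
        m = UNIT.match(line)
        if m:
            report[path]["units"].append(
                (target, m.group(1), m.group(2), m.group(3))
            )
            continue
        # Anything else under a path is kept as a status message.
        report[path]["messages"].append(line)
    return report


SAMPLE_PROBE = """
{1} ok probe-scsi-all
/pci@1c,600000/scsi@2,1

/pci@1c,600000/scsi@2
Target 0
  Unit 0   Disk     SEAGATE ST373453LSUN72G 0449
Target 1
  Unit 0   Disk     SEAGATE ST373453LSUN72G 0449

/pci@1e,600000/SUNW,qlc@3
Fibre Channel link not ready - Loss of Sync
Fibre Channel link down

{1} ok
"""

if __name__ == "__main__":
    for dev, info in parse_probe(SAMPLE_PROBE).items():
        print(dev, len(info["units"]), "unit(s)", info["messages"])
```

On the capture above, both Seagate disks show up under `/pci@1c,600000/scsi@2`, while the two `SUNW,qlc` ports only report link messages.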

I set diag-level to max and power-cycled the server, but the output doesn't give me much of a clue. Here is the output of obdiag:

obdiag> test-all
Hit the spacebar to interrupt testing
Testing /pci@1e,600000/SUNW,qlc@2 Fibre Channel link not ready - Loss of Sync
Starting memory test - press any key to terminate test
HBA memory size - 131072 bytes
        Testing memory, pattern 1 /
        Testing memory, pattern 2 \
        Testing memory, pattern 3 /
        Testing memory, pattern 4 \
        Testing memory, pattern 5 /
        Testing memory, pattern 6 \
        Testing memory, pattern 7 /
        Testing memory, pattern 8 \
        Online Self-test Passed
        1-bit Internal Loopback  Test Passed
        10-bit Internal Loopback Test Passed
        External Loopback        Test Passed
Error: /pci@1e,600000/SUNW,qlc@2 selftest resulted in net stack depth change of -1
Selftest at /pci@1e,600000/SUNW,qlc@2 ................................. failed
Testing /pci@1e,600000/SUNW,qlc@3 Fibre Channel link not ready - Loss of Sync
Starting memory test - press any key to terminate test
HBA memory size - 131072 bytes
        Testing memory, pattern 1 /
        Testing memory, pattern 2 \
        Testing memory, pattern 3 /
        Testing memory, pattern 4 \
        Testing memory, pattern 5 /
        Testing memory, pattern 6 \
        Testing memory, pattern 7 /
        Testing memory, pattern 8 \
        Online Self-test Passed
        1-bit Internal Loopback  Test Passed
        10-bit Internal Loopback Test Passed
        External Loopback        Test Passed
Error: /pci@1e,600000/SUNW,qlc@3 selftest resulted in net stack depth change of -1
Selftest at /pci@1e,600000/SUNW,qlc@3 ................................. failed
Testing /pci@1e,600000/isa@7/flashprom@2,0 ............................ passed
Testing /pci@1e,600000/isa@7/i2c@0,320 ................................ passed
Testing /pci@1e,600000/ide@d .......................................... passed
Testing /pci@1d,700000/pci@1/pci@0/network@0 Register tests: passed
Internal loopback test: passed
/pci@1d,700000/pci@1/pci@0/network@0: Timed out waiting for Auto-Negotation to complete
/pci@1d,700000/pci@1/pci@0/network@0: Cannot establish link via Auto-Negotation
Please check cable and/or connection
/pci@1d,700000/pci@1/pci@0/network@0: link down
.......................... passed
Testing /pci@1d,700000/pci@1/pci@0/network@1 Register tests: passed
Internal loopback test: passed
/pci@1d,700000/pci@1/pci@0/network@1: Timed out waiting for Auto-Negotation to complete
/pci@1d,700000/pci@1/pci@0/network@1: Cannot establish link via Auto-Negotation
Please check cable and/or connection
/pci@1d,700000/pci@1/pci@0/network@1: link down
.......................... passed
Testing /pci@1f,700000/network@2 ...................................... passed
Testing /pci@1d,700000/network@2 ...................................... passed
Testing /pci@1d,700000/pci@1/pci@4/network@2 Register tests: passed
Internal loopback test: passed
/pci@1d,700000/pci@1/pci@4/network@2: Timed out waiting for Auto-Negotation to complete
/pci@1d,700000/pci@1/pci@4/network@2: Cannot establish link via Auto-Negotation
Please check cable and/or connection
/pci@1d,700000/pci@1/pci@4/network@2: link down
.......................... passed
Testing /pci@1f,700000/network@2,1 .................................... passed
Testing /pci@1d,700000/network@2,1 .................................... passed
Testing /pci@1d,700000/pci@1/pci@4/network@3 Register tests: passed
Internal loopback test: passed
/pci@1d,700000/pci@1/pci@4/network@3: Timed out waiting for Auto-Negotation to complete
/pci@1d,700000/pci@1/pci@4/network@3: Cannot establish link via Auto-Negotation
Please check cable and/or connection
/pci@1d,700000/pci@1/pci@4/network@3: link down
.......................... passed
Testing /pci@1e,600000/isa@7/rmc-comm@0,3e8 ........................... passed
Testing /pci@1e,600000/isa@7/rtc@0,70 ................................. passed
Testing /pci@1c,600000/scsi@2 ......................................... passed
Testing /pci@1c,600000/scsi@2,1 ....................................... passed
Testing /pci@1e,600000/isa@7/serial@0,2e8 ............................. passed
Testing /pci@1e,600000/isa@7/serial@0,3f8 ............................. passed
Pass:1 (of 1) Errors:2 (of 2) Tests Failed:2 Elapsed Time: 0:0:4:45


Hit any key to return to the main menu
 _____________________________________________________________________________
|                                 o b d i a g                                 |
|_________________________ _________________________ _________________________|
|                         |                         |                         |
|  1 SUNW,qlc@2           |  2 SUNW,qlc@3           |  3 flashprom@2,0        |
|  4 i2c@0,320            |  5 ide@d                |  6 network@0            |
|  7 network@1            |  8 network@2            |  9 network@2            |
| 10 network@2            | 11 network@2,1          | 12 network@2,1          |
| 13 network@3            | 14 rmc-comm@0,3e8       | 15 rtc@0,70             |
| 16 scsi@2               | 17 scsi@2,1             | 18 serial@0,2e8         |
| 19 serial@0,3f8         |                         |                         |
|_________________________|_________________________|_________________________|
|       Commands: test test-all except help what setenv set-default exit      |
|_____________________________________________________________________________|
|                   diag-passes=1 diag-level=max test-args=                   |
|_____________________________________________________________________________|

obdiag>

Thanks in advance

Well, yes, it is complaining about a Fibre Channel failure. You have a qlc adapter (QLogic OEM board) in there. If you haven't already done so, I would be inclined to remove that adapter from its slot and just plug it in again. It might just be poor contacts with the backplane. Also, unplug/replug the FC cable(s) at both ends.

Does the qlc BIOS announce itself on boot? If so, can you key the required sequence (CTRL-Q if I remember correctly) and enter the BIOS setup?

Just now I opened the box, reseated the FC cables and rebooted the server, and it still complains about the same error.
While booting, it doesn't show any prompt for entering the BIOS setup. Below is the POST and boot output. Can you see anything I missed?

If I assume that the card is bad, do I need to replace the motherboard, since this card is on-board?

 ******  POST Running ******
                                                                                                                                                                                            Done
0>PLL Reset....\
 ******  POST Running ******
                                                                                                                                                                                            Done
0>Init Memory....Done
0>Test Memory....Done
0>Test CPU Caches....Done
0>Functional CPU Tests....Done
0>IO-Bridge Tests....Done
0>INFO:
0>      POST Passed all devices.
0>
0>POST: Return to OBP.

SC Alert: Host System has Reset

SC Alert: CRITICAL ALARM is set

@(#)OBP 4.16.2 2004/10/04 18:22 Sun Fire V210/V240,Netra 240
Clearing TLBs
POST Results: Cpu 0000.0000.0000.0001
  %o0 = 0000.0000.0000.0000 %o1 = ffff.ffff.f00a.4c9e %o2 = ffff.ffff.ffff.ffff
POST Results: Cpu 0000.0000.0000.0000
  %o0 = 0000.0000.0000.0000 %o1 = ffff.ffff.f00a.4c9e %o2 = ffff.ffff.ffff.ffff
Membase: 0000.0000.0000.0000
MemSize: 0000.0000.0004.0000
Init CPU arrays Done
Init E$ tags Done
Setup TLB (small-footprint mode) Done
MMUs ON
Scrubbing Tomatillo tags... 0 1
Find dropin, Copying Done, Size 0000.0000.0000.6cb0
PC = 0000.07ff.f000.5ba8
PC = 0000.0000.0000.5c58
Find dropin, Copying Done, Size 0000.0000.0001.15c0
Diagnostic console initialized
Configuring system memory & CPU(s)
Programming IMAX
GPIO config is 5

CPU 0 Speed: 1503 Mhz, ratio  9:1 , ECCR: f00c00
CPU 1 Speed: 1503 Mhz, ratio  9:1 , ECCR: f00c00
CPU 0 Memory Configuration: Valid
CPU 1 Memory Configuration: Valid
CPU 0 Bank 0 1024 MB Bank 1 <empty> Bank 2 1024 MB Bank 3 <empty>
CPU 1 Bank 0 1024 MB Bank 1 <empty> Bank 2 1024 MB Bank 3 <empty>
Master CPU 1 Membase: 1200000000 Memsize: 40000000


@(#)OBP 4.16.2 2004/10/04 18:22 Sun Fire V210/V240,Netra 240
Clearing TLBs
Loading Configuration

Membase: 0000.0012.0000.0000
MemSize: 0000.0000.4000.0000
Init CPU arrays Done
Init E$ tags Done
Setup TLB Done
MMUs ON
Scrubbing Tomatillo tags... 0 1
Block Scrubbing Done
Find dropin, Copying Done, Size 0000.0000.0000.6cb0
PC = 0000.07ff.f000.5ba8
PC = 0000.0000.0000.5c58
Find dropin, (copied), Decompressing Done, Size 0000.0000.0006.6870
Diagnostic console initialized
System Reset: CPU Reset (SPOR)
Probing system devices
jbus at 0,0 SUNW,UltraSPARC-IIIi (1503 MHz @ 9:1, 1 MB) memory-controller
jbus at 1,0 SUNW,UltraSPARC-IIIi (1503 MHz @ 9:1, 1 MB) memory-controller
jbus at 1f,0 pci
jbus at 1e,0 pci
jbus at 1c,0 pci
jbus at 1d,0 pci
Loading Support Packages: kbd-translator obp-tftp SUNW,i2c-ram-device SUNW,fru-device SUNW,asr
Loading onboard drivers:
/pci@1e,600000: Device 7 isa
/pci@1e,600000/isa@7: flashprom rtc i2c power serial serial serial rmc-comm
/pci@1e,600000/isa@7/i2c@0,320: i2c-bridge i2c-bridge motherboard-fru-prom chassis-fru-prom alarm-fru-prom power-supply-fru-prom power-supply-fru-prom dimm-spd dimm-spd dimm-spd dimm-spd dimm-spd dimm-spd dimm-spd dimm-spd rscrtc nvram idprom gpio gpio gpio gpio gpio gpio
Probing memory
CPU 0 Bank 0 base          0 size 1024 MB
CPU 0 Bank 2 base  200000000 size 1024 MB
CPU 1 Bank 0 base 1000000000 size 1024 MB
CPU 1 Bank 2 base 1200000000 size 1024 MB
SUNW,Netra-240 Probing I/O buses
/pci@1d,700000: Device 2 network network
/pci@1f,700000: Device 2 network network
/pci@1e,600000: Device 6 pmu i2c gpio
/pci@1e,600000/pmu@6/i2c@0,0:
/pci@1e,600000: Device a usb
/pci@1e,600000: Device d ide disk cdrom
/pci@1e,600000: Device 2 SUNW,qlc fp disk
/pci@1e,600000: Device 3 SUNW,qlc fp disk
/pci@1c,600000: Device 2 scsi disk tape scsi disk tape
/pci@1c,600000: Device 1 Nothing there
/pci@1d,700000: Device 1 pci
/pci@1d,700000/pci@1: Device 0 pci
/pci@1d,700000/pci@1/pci@0: Device 0 network
/pci@1d,700000/pci@1/pci@0: Device 1 network
/pci@1d,700000/pci@1/pci@0: Device 2 Nothing there
/pci@1d,700000/pci@1/pci@0: Device 3 Nothing there
/pci@1d,700000/pci@1/pci@0: Device 4 Nothing there
/pci@1d,700000/pci@1/pci@0: Device 5 Nothing there
/pci@1d,700000/pci@1/pci@0: Device 6 Nothing there
/pci@1d,700000/pci@1/pci@0: Device 7 Nothing there
/pci@1d,700000/pci@1/pci@0: Device 8 Nothing there
/pci@1d,700000/pci@1/pci@0: Device 9 Nothing there
/pci@1d,700000/pci@1/pci@0: Device a Nothing there
/pci@1d,700000/pci@1/pci@0: Device b Nothing there
/pci@1d,700000/pci@1/pci@0: Device c Nothing there
/pci@1d,700000/pci@1/pci@0: Device d Nothing there
/pci@1d,700000/pci@1/pci@0: Device e Nothing there
/pci@1d,700000/pci@1/pci@0: Device f Nothing there
/pci@1d,700000/pci@1: Device 1 Nothing there
/pci@1d,700000/pci@1: Device 2 Nothing there
/pci@1d,700000/pci@1: Device 3 Nothing there
/pci@1d,700000/pci@1: Device 4 pci
/pci@1d,700000/pci@1/pci@4: Device 0 Nothing there
/pci@1d,700000/pci@1/pci@4: Device 1 Nothing there
/pci@1d,700000/pci@1/pci@4: Device 2 network
/pci@1d,700000/pci@1/pci@4: Device 3 network
/pci@1d,700000/pci@1/pci@4: Device 4 Nothing there
/pci@1d,700000/pci@1/pci@4: Device 5 Nothing there
/pci@1d,700000/pci@1/pci@4: Device 6 Nothing there
/pci@1d,700000/pci@1/pci@4: Device 7 Nothing there
/pci@1d,700000/pci@1/pci@4: Device 8 Nothing there
/pci@1d,700000/pci@1/pci@4: Device 9 Nothing there
/pci@1d,700000/pci@1/pci@4: Device a Nothing there
/pci@1d,700000/pci@1/pci@4: Device b Nothing there
/pci@1d,700000/pci@1/pci@4: Device c Nothing there
/pci@1d,700000/pci@1/pci@4: Device d Nothing there
/pci@1d,700000/pci@1/pci@4: Device e Nothing there
/pci@1d,700000/pci@1/pci@4: Device f Nothing there
/pci@1d,700000/pci@1: Device 5 Nothing there
/pci@1d,700000/pci@1: Device 6 Nothing there
/pci@1d,700000/pci@1: Device 7 Nothing there
/pci@1d,700000/pci@1: Device 8 Nothing there
/pci@1d,700000/pci@1: Device 9 Nothing there
/pci@1d,700000/pci@1: Device a Nothing there
/pci@1d,700000/pci@1: Device b Nothing there
/pci@1d,700000/pci@1: Device c Nothing there
/pci@1d,700000/pci@1: Device d Nothing there
/pci@1d,700000/pci@1: Device e Nothing there
/pci@1d,700000/pci@1: Device f Nothing there

Netra 240, No Keyboard
Copyright 1998-2004 Sun Microsystems, Inc.  All rights reserved.
OpenBoot 4.16.2, 4096 MB memory installed, Serial #65526705.
Ethernet address 0:3:ba:e7:db:b1, Host ID: 83e7dbb1.




Running diagnostic script obdiag/normal

Testing /pci@1e,600000/ide@d
Testing /pci@1e,600000/isa@7/rtc@0,70
Testing /pci@1c,600000/scsi@2
Testing /pci@1c,600000/scsi@2,1
Testing /pci@1e,600000/isa@7/serial@0,2e8
Testing /pci@1e,600000/isa@7/serial@0,3f8



{1} ok

Well, I can't see much wrong with that POST output apart from the CRITICAL ALARM, which you need to reset:

sc> setalarm critical off

Does test-all still give the same error(s) as in your post #1?

Yes, it still gives the same errors.

{1} ok sc>
sc> setalarm critical off
sc> console -f
Warning: User <auto> currently has write permission to this console and forcibly removing them will terminate any current write actions and all work will be lost.  Would you like to continue? [y/n]y
Enter #. to return to ALOM.

{1} ok obdiag
Searching for selftest methods: network network flashprom rtc i2c serial serial rmc-comm ide SUNW,qlc SUNW,qlc scsi scsi network network network network network network
 _____________________________________________________________________________
|                                 o b d i a g                                 |
|_________________________ _________________________ _________________________|
|                         |                         |                         |
|  1 SUNW,qlc@2           |  2 SUNW,qlc@3           |  3 flashprom@2,0        |
|  4 i2c@0,320            |  5 ide@d                |  6 network@0            |
|  7 network@1            |  8 network@2            |  9 network@2            |
| 10 network@2            | 11 network@2,1          | 12 network@2,1          |
| 13 network@3            | 14 rmc-comm@0,3e8       | 15 rtc@0,70             |
| 16 scsi@2               | 17 scsi@2,1             | 18 serial@0,2e8         |
| 19 serial@0,3f8         |                         |                         |
|_________________________|_________________________|_________________________|
|       Commands: test test-all except help what setenv set-default exit      |
|_____________________________________________________________________________|
|                   diag-passes=1 diag-level=max test-args=                   |
|_____________________________________________________________________________|

obdiag> test-all
Hit the spacebar to interrupt testing
Testing /pci@1e,600000/SUNW,qlc@2 Fibre Channel link not ready - Loss of Sync
Starting memory test - press any key to terminate test
HBA memory size - 131072 bytes
        Testing memory, pattern 1 /
        Testing memory, pattern 2 \
        Testing memory, pattern 3 /
        Testing memory, pattern 4 \
        Testing memory, pattern 5 /
        Testing memory, pattern 6 \
        Testing memory, pattern 7 /
        Testing memory, pattern 8 \
        Online Self-test Passed
        1-bit Internal Loopback  Test Passed
        10-bit Internal Loopback Test Passed
        External Loopback        Test Passed
Error: /pci@1e,600000/SUNW,qlc@2 selftest resulted in net stack depth change of -1
Selftest at /pci@1e,600000/SUNW,qlc@2 ................................. failed
Testing /pci@1e,600000/SUNW,qlc@3 Fibre Channel link not ready - Loss of Sync
Starting memory test - press any key to terminate test
HBA memory size - 131072 bytes
        Testing memory, pattern 1 /
        Testing memory, pattern 2 \
        Testing memory, pattern 3 /
        Testing memory, pattern 4 \
        Testing memory, pattern 5 /
        Testing memory, pattern 6 \
        Testing memory, pattern 7 /
        Testing memory, pattern 8 \
        Online Self-test Passed
        1-bit Internal Loopback  Test Passed
        10-bit Internal Loopback Test Passed
        External Loopback        Test Passed
Error: /pci@1e,600000/SUNW,qlc@3 selftest resulted in net stack depth change of -1
Selftest at /pci@1e,600000/SUNW,qlc@3 ................................. failed
Testing /pci@1e,600000/isa@7/flashprom@2,0 ............................ passed
Testing /pci@1e,600000/isa@7/i2c@0,320 ................................ passed
Testing /pci@1e,600000/ide@d .......................................... passed
Testing /pci@1d,700000/pci@1/pci@0/network@0 Register tests: passed
Internal loopback test: passed
/pci@1d,700000/pci@1/pci@0/network@0: Timed out waiting for Auto-Negotation to complete
/pci@1d,700000/pci@1/pci@0/network@0: Cannot establish link via Auto-Negotation
Please check cable and/or connection
/pci@1d,700000/pci@1/pci@0/network@0: link down
.......................... passed
Testing /pci@1d,700000/pci@1/pci@0/network@1 Register tests: passed
Internal loopback test: passed
/pci@1d,700000/pci@1/pci@0/network@1: Timed out waiting for Auto-Negotation to complete
/pci@1d,700000/pci@1/pci@0/network@1: Cannot establish link via Auto-Negotation
Please check cable and/or connection
/pci@1d,700000/pci@1/pci@0/network@1: link down
.......................... passed
Testing /pci@1f,700000/network@2 ...................................... passed
Testing /pci@1d,700000/network@2 ...................................... passed
Testing /pci@1d,700000/pci@1/pci@4/network@2 Register tests: passed
Internal loopback test: passed
/pci@1d,700000/pci@1/pci@4/network@2: Timed out waiting for Auto-Negotation to complete
/pci@1d,700000/pci@1/pci@4/network@2: Cannot establish link via Auto-Negotation
Please check cable and/or connection
/pci@1d,700000/pci@1/pci@4/network@2: link down
.......................... passed
Testing /pci@1f,700000/network@2,1 .................................... passed
Testing /pci@1d,700000/network@2,1 .................................... passed
Testing /pci@1d,700000/pci@1/pci@4/network@3 Register tests: passed
Internal loopback test: passed
/pci@1d,700000/pci@1/pci@4/network@3: Timed out waiting for Auto-Negotation to complete
/pci@1d,700000/pci@1/pci@4/network@3: Cannot establish link via Auto-Negotation
Please check cable and/or connection
/pci@1d,700000/pci@1/pci@4/network@3: link down
.......................... passed
Testing /pci@1e,600000/isa@7/rmc-comm@0,3e8

Well, it would be a bit of a tragedy to replace the motherboard if this one is fixable. The diags recognise that there's a QLogic chipset on the PCI bus but report a failure. Take a look at the enablecomponent command in case the device has been accidentally blacklisted. Also, the QLogic BIOS announcement at boot time could be disabled, so try (several times) pressing CTRL-Q as the box boots to see if you can enter the QLogic BIOS setup. That would at least prove that it's talking.

You have a QLogic chipset that is being recognised but not passing diags.

Since you say this is an on-board adapter, it could mean a motherboard change, but that's a bit drastic if it's caused by just some setting.
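
To see at a glance which devices obdiag marks as failed, and what error it attaches to them, a capture of the test-all output can be summarised with a short script. The following is only a sketch in Python: the function name `summarize_obdiag` is made up for illustration, and the line patterns are inferred from the output posted in this thread, so other OBP versions may need different patterns.

```python
import re

# One-line verdicts, e.g.
# "Testing /pci@1e,600000/ide@d ........ passed" or
# "Selftest at /pci@1e,600000/SUNW,qlc@2 ........ failed".
VERDICT = re.compile(
    r"^(?:Testing|Selftest at)\s+(/\S+)\s+\.{3,}\s+(passed|failed)$"
)
# Start of a multi-line test, e.g.
# "Testing /pci@.../network@0 Register tests: passed".
START = re.compile(r"^Testing\s+(/\S+)\s")
# A bare dotted verdict that closes a multi-line test.
BARE = re.compile(r"^\.{3,}\s+(passed|failed)$")
# An error line naming the device it belongs to.
ERROR = re.compile(r"^Error:\s+(/\S+)\s+(.+)$")


def summarize_obdiag(text):
    """Return (results, errors): verdict and error text per device path."""
    results = {}
    errors = {}
    current = None  # device of the multi-line test in progress, if any
    for raw in text.splitlines():
        line = raw.strip()
        m = VERDICT.match(line)
        if m:
            results[m.group(1)] = m.group(2)
            current = None
            continue
        m = ERROR.match(line)
        if m:
            errors[m.group(1)] = m.group(2)
            continue
        m = START.match(line)
        if m:
            current = m.group(1)
            continue
        m = BARE.match(line)
        if m and current is not None:
            results[current] = m.group(1)
            current = None
    return results, errors


SAMPLE_OBDIAG = """
Testing /pci@1e,600000/SUNW,qlc@2 Fibre Channel link not ready - Loss of Sync
        Testing memory, pattern 1 /
        External Loopback        Test Passed
Error: /pci@1e,600000/SUNW,qlc@2 selftest resulted in net stack depth change of -1
Selftest at /pci@1e,600000/SUNW,qlc@2 ................................. failed
Testing /pci@1e,600000/ide@d .......................................... passed
Testing /pci@1d,700000/pci@1/pci@0/network@0 Register tests: passed
Internal loopback test: passed
/pci@1d,700000/pci@1/pci@0/network@0: link down
.......................... passed
Testing /pci@1e,600000/isa@7/rmc-comm@0,3e8
"""

if __name__ == "__main__":
    results, errors = summarize_obdiag(SAMPLE_OBDIAG)
    for dev, verdict in results.items():
        print(verdict, dev, errors.get(dev, ""))
```

Run against the output in this thread, only the two `SUNW,qlc` devices come out as failed, each with the "net stack depth change of -1" error, even though the memory and loopback sub-tests listed above them all report "Passed". A test that was cut off before its verdict, like the final `rmc-comm` line in the second run, is simply left out of the summary.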