commands hanging when querying hardware

Hi all

Got another strange one. If I try to query the hardware, the command hangs, which implies I've got a hardware issue. For example, if I execute :-

iostat -en
sysdef - ( stops at the devices part )
format
cfgadm -al

Any command that scans the devices hangs.
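
To narrow down which commands hang without tying up the terminal each time, something like the following could help. It is only a rough sketch: `run_with_timeout` is a made-up helper name, the 124 return code is an arbitrary choice, and a process stuck in an uninterruptible device wait may ignore even kill -9, so treat the result as a hint rather than proof.

```shell
#!/bin/sh
# run_with_timeout SECONDS COMMAND [ARGS...]   (hypothetical helper)
# Runs COMMAND in the background with its output discarded and polls it
# once a second. Returns the command's own exit status if it finishes in
# time, or 124 if it was still running after SECONDS and had to be killed.
run_with_timeout() {
    limit=$1
    shift
    "$@" >/dev/null 2>&1 &
    cmd_pid=$!
    waited=0
    while kill -0 "$cmd_pid" 2>/dev/null; do
        if [ "$waited" -ge "$limit" ]; then
            # Deliberately no wait here: a process blocked in the kernel
            # may never exit, and waiting on it would hang this script too.
            kill -9 "$cmd_pid" 2>/dev/null
            return 124
        fi
        sleep 1
        waited=`expr $waited + 1`
    done
    wait "$cmd_pid"
}

# Example, to be run on the affected host (30 seconds is an arbitrary limit):
#   for c in "iostat -en" "cfgadm -al" "prtconf"; do
#       run_with_timeout 30 $c
#       echo "$c -> $?"
#   done
```

A status of 124 would mean the command was still running when the limit expired, anything else is the command's own exit status.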

The server has two StorEdge 3310 (SE3310) arrays attached in a dual-bus configuration. A disk in the arrays had failed, so I thought that was causing the issue. The disk was replaced, but these commands still don't work, which points to a device issue.

I did an init 0, reset-all, then probe-scsi-all to see if the devices were visible, and they all were. I then did a poweroff / poweron, probe-scsi-all again, and boot -s, but something is still causing these issues on the device bus.

Probe-scsi-all :-

{1} ok probe-scsi-all
This command may hang the system if a Stop-A or halt command
has been executed. Please type reset-all to reset the system
before executing this command.
Do you wish to continue? (y/n) y
/pci@1f,700000/scsi@2,1

/pci@1f,700000/scsi@2
Volume 0
Unit 0 Disk LSILOGIC1030 IM IM1000 143374706 Blocks, 70007 MB
Target 2
Unit 0 Disk SEAGATE ST373307LSUN72G 0707 143374738 Blocks, 70007 MB
Target 3
Unit 0 Disk SEAGATE ST373307LSUN72G 0707 143374738 Blocks, 70007 MB

/pci@1d,700000/pci@2/scsi@5

/pci@1d,700000/pci@2/scsi@4
Target 0
Unit 0 Device type d SUN StorEdge 3310 0325
Unit 1 Disk SUN StorEdge 3310 0325 <------------ disks
Unit 2 Disk SUN StorEdge 3310 0325
Unit 3 Disk SUN StorEdge 3310 0325
Unit 4 Disk SUN StorEdge 3310 0325
Target 1
Unit 0 Device type d SUN StorEdge 3310 0325 - SBK
??????????????????????? <------------ no disks mentioned although this array is a mirror of the above
???????????????????????
???????????????????? ??

/pci@1c,600000/pci@1/scsi@5

/pci@1c,600000/pci@1/scsi@4
Target 0
Unit 0 Device type d SUN StorEdge 3310 0325
Target 1
Unit 0 Device type d SUN StorEdge 3310 0325

Look at the line with SBK next to it. I reckon I should see additional lines underneath referring to the disks in that array, but I don't. So I'm assuming I've got issues with that array. What do you guys reckon ?

Any ideas what I can look for, or use, to find the offending device ?

SBK

You should not type "y" when it warns that the command may hang. Answer "n", do a reset-all, and then run it again :-
{1} ok probe-scsi-all
This command may hang the system if a Stop-A or halt command
has been executed. Please type reset-all to reset the system
before executing this command.
Do you wish to continue? (y/n) n
{1} ok
{1} ok reset-all
{1} ok probe-scsi-all
This command may hang the system if a Stop-A or halt command
has been executed. Please type reset-all to reset the system
before executing this command.
Do you wish to continue? (y/n) y

You need to ask a Sun engineer which device path it hung on. You will probably have to change the system board or reset the NVRAM.

sbk1972 - this might be something I've seen before. If you run, say, prtconf or really anything that queries the hardware, the system drops or crashes. Can you get us some details about the system, like `uname -a`? It might be a patch-level issue.

pupp might be right. As for my answer, I suspected some power outage issue was causing the "hang", so I provided that ...

Morning gents, hope you all had a good weekend.

Thanks for your replies.

Incredible. I ran the probe-scsi-all command after setting auto-boot? to false and doing a reset-all, etc. I missed a few of the commands out when I cut / pasted. I know probe-scsi-all can hang when you haven't done the reset-all, so I always do that first.

pupp - you could be on to something regarding the patchlevel :- XXXXXXXXXX 5.9 Generic_122300-29 sun4u sparc SUNW,Sun-Fire-V440

If I ran cfgadm -al / sysdef -i / prtconf it would hang, suggesting that a device isn't responding. In fact, the monitoring script I cron off every hour was also waiting, so if you did a ps -ef | more, you would see tons of iostat / metastat / etc. processes, as none of these commands were completing and were all sleeping.
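
For anyone wanting to see how badly the hung commands have piled up, a rough tally by command name might look like this. It is only a sketch: `count_commands` is a made-up helper that expects one command name per line on stdin, and feeding it from `ps -e -o comm=` assumes your ps accepts that option.

```shell
#!/bin/sh
# count_commands   (hypothetical helper)
# Reads one command name or path per line on stdin, strips any leading
# directory part, and prints "COUNT NAME" lines with the largest count first.
count_commands() {
    sed 's|.*/||' |
        awk '{ c[$1]++ } END { for (k in c) print c[k], k }' |
        sort -rn
}

# Example, to be run on the affected host:
#   ps -e -o comm= | count_commands | head
```

A large count against iostat or metastat would match the backlog of sleeping cron jobs described above.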

I did a reboot, which didn't work, so I did the ok prompt checks. Not sure, but will go visit the server today and find out.

Looking in the /var/adm/messages file I see :-

Apr 27 07:42:14 XXXXXXXXXXX SUNWscsdMonitor[541]: [ID 610721 daemon.error] [SUNWscsd 0x02060B01:0x00000000 Warning] <rctrl5000> Unable to open I/O transport layer (dev = eth 172.XX.XX.XX,58632,65537,8F3644134C9999452B7E29FEF8452B7E29FEF8452B7E29FEF8452B7E29FEF8452B7E29FEF8452B7E29FEF8452B7E29FEF8452B7E29FEF8452B7E29FEF8452B7E). {Unique ID#: 08288e}

SBK