SCSI Enumeration failed

ENVIRONMENT
OS = Oracle Solaris 11.4 (SPARC)
Machine Model = SPARC T4-1

ISSUE

WARNING: scsi_enumeration_failed: mpt_sas3 probe@w50000393b832c7aa,0 enumeration failed during probe
WARNING: scsi_tgtmap_scsi_deactivate: mpt_sas0 iport@1 w50000393b832c7aa SCSI enumeration failed, no more retries until config change occurs

We tried replacing the drive with two others, but the status is the same.
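Both warnings name the failing target by its SAS WWN (the token starting with `w`), so it can help to pull that value out and compare it with what the firmware and `format` report for the installed drives. Below is a minimal sketch, assuming the warnings have been saved to a file or are piped in from the console log; the function name `wwns_from_warnings` is made up for illustration.

```shell
# wwns_from_warnings: read kernel warning lines on stdin and print each
# distinct 16-digit SAS WWN that follows "@w" or " w" (hypothetical helper).
wwns_from_warnings() {
    sed -n 's/.*[ @]w\([0-9a-f]\{16\}\).*/\1/p' | sort -u
}

# Example: feed it the saved console messages.
# wwns_from_warnings < /var/adm/messages
```

For the two warnings above this should yield `50000393b832c7aa`, which can then be checked against the SASAddress values shown by `probe-scsi-all`.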

I think that you ought to explain the history here. Did the system ever work, or is it new? If it used to work, what happened?

The disk controller in a T4-1 is SAS. Are you sure that your replacement disks are SAS? (and not SATA).

It might be a faulty ribbon cable.

Try connecting just one disk first and see if the error disappears, and then add one disk at a time.

Do the disks have Sun labels written?


1- The SPARC machine was running fine and was in production. Below is a snapshot of the machine.

2- The replacement disk was an old disk from a Sun SPARC machine, taken from our stock (we have no other SPARC hardware on which to check the old disk). Below is a snapshot of the old disk.

3- Below is a snapshot of the disk that was in the production server and working fine.

4- The disk bay of the SPARC machine is connected to the motherboard with two ribbon cables (each cable serves half of the disk bay). We have tested the new and old disks on both cables and the status is the same. Below is a snapshot showing both ribbon cables connecting the disk bay to the motherboard.

5- The command `show faulty` does not return any error or message.

A problem with the mpt_sas controller?
Perhaps a missing forceload for the mpt driver in /etc/system or /etc/system.d/ files?

Could you please help me force the SAS controller to be read? I cannot get past the ok prompt.

{0} ok probe-scsi-all
/pci@400/pci@2/pci@0/pci@f/pci@0/usb@0,2/hub@2/hub@3/storage@2
Unit 0 Removable Read Only device AMI Virtual CDROM 1.00

/pci@400/pci@2/pci@0/pci@4/scsi@0

FCode Version 1.00.61, MPT Version 2.00, Firmware Version 9.05.00.00

Target 9
Unit 0 Removable Read Only device TEAC DV-W28SS-W 1.0A
SATA device PhyNum 6

/pci@400/pci@1/pci@0/pci@4/scsi@0

FCode Version 1.00.61, MPT Version 2.00, Firmware Version 9.05.00.00

Target 9
Unit 0 Disk HP DG0146BALVN HPD2
SASDeviceName 5000c5000acf77ff SASAddress 5000c5000acf77fd PhyNum 1

{0} ok

{0} ok select /pci@400/pci@1/pci@0/pci@4/scsi@0
{0} ok show-volumes
No volumes to show
{0} ok unselect-dev

{0} ok select /pci@400/pci@2/pci@0/pci@4/scsi@0
{0} ok show-volumes
No volumes to show
{0} ok unselect-dev

{0} ok show-disks
a) /pci@400/pci@2/pci@0/pci@f/pci@0/usb@0,2/hub@2/hub@3/storage@2/disk
b) /pci@400/pci@2/pci@0/pci@4/scsi@0/disk
c) /pci@400/pci@1/pci@0/pci@4/scsi@0/disk
d) /iscsi-hba/disk
q) NO SELECTION
Enter Selection, q to quit: b
/pci@400/pci@2/pci@0/pci@4/scsi@0/disk has been selected.
Type ^Y ( Control-Y ) to insert it in the command line.
e.g. ok nvalias mydev ^Y
for creating devalias mydev for /pci@400/pci@2/pci@0/pci@4/scsi@0/disk
{0} ok /pci@400/pci@2/pci@0/pci@4/scsi@0/disk
/pci@400/pci@2/pci@0/pci@4/scsi@0/disk ?
{0} ok boot disk
Boot device: /pci@400/pci@1/pci@0/pci@4/scsi@0/disk@p0 File and args:
ERROR: boot-read fail

Can't open boot device

{0} ok show-disks
a) /pci@400/pci@2/pci@0/pci@f/pci@0/usb@0,2/hub@2/hub@3/storage@2/disk
b) /pci@400/pci@2/pci@0/pci@4/scsi@0/disk
c) /pci@400/pci@1/pci@0/pci@4/scsi@0/disk
d) /iscsi-hba/disk
q) NO SELECTION
Enter Selection, q to quit: c
/pci@400/pci@1/pci@0/pci@4/scsi@0/disk has been selected.
Type ^Y ( Control-Y ) to insert it in the command line.
e.g. ok nvalias mydev ^Y
for creating devalias mydev for /pci@400/pci@1/pci@0/pci@4/scsi@0/disk
{0} ok /pci@400/pci@1/pci@0/pci@4/scsi@0/disk
/pci@400/pci@1/pci@0/pci@4/scsi@0/disk ?
{0} ok format
format ?
{0} ok boot disk
Boot device: /pci@400/pci@1/pci@0/pci@4/scsi@0/disk@p0 File and args:
ERROR: boot-read fail

Can't open boot device

Do you know if the disks were configured as a hardware RAID such as RAID0 or RAID1E?

No, I don't remember. I had only one disk in the machine. If I don't care about the data on the disk, is there a way to reinstall the OS? I tried to install the new OS but could not.

With only one disk in the system you could not have any RAID array.

What happened? Did it just say there was no hard disk to install to?

{0} ok show-children

Does this also say no disks?

Looks like it detects the HP DG0146BALVN
as
/pci@400/pci@1/pci@0/pci@4/scsi@0/disk
but cannot boot from
/pci@400/pci@1/pci@0/pci@4/scsi@0/disk@p0
that must be partition 0
Could the problem be a missing Sun SPARC label?
That's a common problem with non-Sun/Oracle disks: they ship with an x86 PC label ("MBR").
You would need to boot SPARC Solaris from a CDROM and then run "format", which will offer to write a Sun SPARC label to the disk.
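Once a root shell is available, one way to see which kind of label sector 0 carries is to look at its last four bytes. This is only a sketch under stated assumptions: a Sun SMI (VTOC) label stores the magic bytes `DA BE` at offsets 508-509, whereas a PC MBR ends with `55 AA` at offsets 510-511 (EFI/GPT disks also carry a protective MBR there). The function name `label_kind` and the device path in the usage comment are illustrative, and the read may simply fail on a drive that reports medium errors.

```shell
# label_kind: classify the first 512-byte sector of a disk or image file.
# Assumes the SMI (VTOC) layout: magic bytes DA BE at offsets 508-509.
# A PC MBR instead ends with the signature 55 AA at offsets 510-511.
label_kind() {
    # Dump bytes 508..511 of sector 0 as hex, without offsets or spaces.
    tail4=$(dd if="$1" bs=512 count=1 2>/dev/null |
            od -An -tx1 -j508 -N4 | tr -d ' \n')
    case "$tail4" in
        dabe*)    echo "sun" ;;
        ????55aa) echo "mbr" ;;
        *)        echo "unknown" ;;
    esac
}

# Hypothetical usage from the install shell (raw whole-disk slice):
# label_kind /dev/rdsk/c1t5000C5000ACF77FDd0s2
```

A result of `mbr` or `unknown` would be consistent with `format` showing `<drive type unknown>`, though it does not by itself rule out a failing drive.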


{0} ok show-children
show-children ?
{0} ok

Ref:

RAID controller commands: [four screenshots]

From the Solaris install screen, can you escape to a root shell?
If yes, at the # prompt you can run the "format".

Is it possible to escape to a root shell from the Solaris install screen?

There is usually an option that allows you to escape to a root shell from the install routine, yes.

However, an alternative way is to boot from the installation media into single user:

ok> boot cdrom -s

which will avoid the install routine and definitely drop you to a root shell (#) prompt from which you can run commands. Try that to run 'format'.

Talking generically for a minute, the T4-1 has a hardware RAID controller on-board and you are trying to get a new/different hard drive working. So the checking order needs to be something like this:

  1. Enter the RAID controller BIOS and check that the new disk is 'seen' and, although it is the only drive in the system, is set to 'pass through' by that RAID controller BIOS, i.e. is not manipulated in any way by that RAID controller.
  2. Next, check that the motherboard/OS can see the drive (using 'format') and, if so, that it shows the expected geometry (certainly the expected drive capacity) and has a Sun disk label. If not, write a Sun disk label to it. Once it has been 'seen' by the OS in single user mode, then you should be able to install Solaris onto it by booting the DVD normally thereby entering the install routine.
User selected: English
Configuring devices.
WARNING: No retained memory, deferred dump not available
Hostname: solaris
Welcome to the Oracle Solaris installation menu

        1  Install Oracle Solaris
        2  Install Additional Drivers
        3  Shell
        4  Terminal type (currently xterm)
        5  Reboot

Please enter a number [1]: 3
To return to the main menu, exit the shell
root@solaris:/root# format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
       0. c0t5000C50007595A7Bd0 <drive type unknown>
          /scsi_vhci/disk@g5000c50007595a7b
          /dev/chassis/SYS/HDD1/disk
       1. c1t5000C5000ACF77FDd0 <drive type unknown>
          /pci@400/pci@1/pci@0/pci@4/scsi@0/iport@8/disk@w5000c5000acf77fd,0
          /dev/chassis/SYS/HDD3/disk
Specify disk (enter its number):
SUNW-MSG-ID: DISK-8000-ES, TYPE: Fault, VER: 1, SEVERITY: Critical
EVENT-TIME: Thu Mar  2 12:44:47 UTC 2023
PLATFORM: ORCL,SPARC-T4-1, CSN: AK00118201, HOSTNAME: solaris
SOURCE: eft, REV: 1.16
EVENT-ID: cd94e41b-91a2-4d73-a7a4-e435648369c2
DESC: A non-recoverable drive failure was detected by the device while performing a command.
AUTO-RESPONSE: The device may be offlined or degraded. A hot-spare disk may have been activated.
IMPACT: The device has failed. The service may have been lost or degraded.
REC-ACTION: Please refer to the associated reference document at http://support.oracle.com/msg/DISK-8000-ES for the latest service procedures and policies regarding this diagnosis.

SUNW-MSG-ID: DISK-8000-ES, TYPE: Fault, VER: 1, SEVERITY: Critical
EVENT-TIME: Thu Mar  2 12:45:41 UTC 2023
PLATFORM: ORCL,SPARC-T4-1, CSN: AK00118201, HOSTNAME: solaris
SOURCE: eft, REV: 1.16
EVENT-ID: 2010eb02-4893-4cbc-89cf-b0f84d4d1ec8
DESC: A non-recoverable drive failure was detected by the device while performing a command.
AUTO-RESPONSE: The device may be offlined or degraded. A hot-spare disk may have been activated.
IMPACT: The device has failed. The service may have been lost or degraded.
REC-ACTION: Please refer to the associated reference document at http://support.oracle.com/msg/DISK-8000-ES for the latest service procedures and policies regarding this diagnosis.

SUNW-MSG-ID: DISK-8000-4Q, TYPE: Fault, VER: 1, SEVERITY: Critical
EVENT-TIME: Thu Mar  2 12:48:19 UTC 2023
PLATFORM: ORCL,SPARC-T4-1, CSN: AK00118201, HOSTNAME: solaris
SOURCE: eft, REV: 1.16
EVENT-ID: 7c26aa5e-b5b8-48b9-b734-f30a7ea6c1ce
DESC: A medium error was detected by the device that was non-recoverable.
AUTO-RESPONSE: The device may be offlined or degraded. A hot-spare disk may have been activated.
IMPACT: It is likely that continued operation will result in data corruption, which may eventually cause the loss of service or the service degradation.
REC-ACTION: Please refer to the associated reference document at http://support.oracle.com/msg/DISK-8000-4Q for the latest service procedures and policies regarding this diagnosis.

The same thing (shown below) happens with both drives after the format command: it fails to set the type of either disk.


 root@solaris:/root# format
 Searching for disks...done


 AVAILABLE DISK SELECTIONS:
        0. c0t5000C50007595A7Bd0 <drive type unknown>
           /scsi_vhci/disk@g5000c50007595a7b
           /dev/chassis/SYS/HDD1/disk
        1. c1t5000C5000ACF77FDd0 <drive type unknown>
           /pci@400/pci@1/pci@0/pci@4/scsi@0/iport@8/disk@w5000c5000acf77fd,0
           /dev/chassis/SYS/HDD3/disk
 Specify disk (enter its number): 1
 <drive type unknown>


 AVAILABLE DRIVE TYPES:
         0. Auto configure
         1. other
 Specify disk type (enter its number): 0
 Auto configure failed


 FORMAT MENU:
         disk       - select a disk
         type       - select (define) a disk type
         partition  - select (define) a partition table
         current    - describe the current disk
         format     - format and analyze the disk
         repair     - repair a defective sector
         label      - write label to the disk
         analyze    - surface analysis
         defect     - defect list management
         backup     - search for backup labels
         verify     - read and display labels
         inquiry    - show disk ID
         volname    - set 8-character volume name
         !<cmd>     - execute <cmd>, then return
         quit
 format>

@z_haseeb,
I edited your last post with the proper markdown code tags.
For the future... use ``` for multi-line BLOCKS of data /code, and NOT for EACH LINE in the block.

If you have a SINGLE line you'd like to "quote/highlight" use a single ` surrounding the set of chars on each line, e.g. `here's the quoted data/code segment` which gets formatted as: here's the quoted data/code segment.
HTH

Even better: wrap inline text in ``` then it may even contain a ` (backtick).

I have never seen this "medium error" in format.
You can try setting an environment variable, as the following procedure suggests:

  1. Set and export the NOINUSE_CHECK variable:
    root@ # NOINUSE_CHECK=1
    root@ # export NOINUSE_CHECK
  2. Run the format utility to restore the drive's "type".
    root@ # format
    Searching for disks...done
    ...
root@solaris:/root# tcsh
# setenv NOINUSE_CHECK 1

exit
root@solaris:/root# export NOINUSE_CHECK

I don't know why I am not able to set the disk type. Also, the format command offers me only two types. See below for reference.

root@solaris:/root# format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
       0. c0t5000C50085D805EBd0 <drive type unknown>
          /scsi_vhci/disk@g5000c50085d805eb
          /dev/chassis/SYS/HDD1/disk
Specify disk (enter its number): 0
<drive type unknown>


AVAILABLE DRIVE TYPES:
        0. Auto configure
        1. other
Specify disk type (enter its number): 0
Auto configure failed


FORMAT MENU:
        disk       - select a disk
        type       - select (define) a disk type
        partition  - select (define) a partition table
        current    - describe the current disk
        format     - format and analyze the disk
        repair     - repair a defective sector
        label      - write label to the disk
        analyze    - surface analysis
        defect     - defect list management
        backup     - search for backup labels
        verify     - read and display labels
        inquiry    - show disk ID
        volname    - set 8-character volume name
        !<cmd>     - execute <cmd>, then return
        quit
format> type


AVAILABLE DRIVE TYPES:
        0. Auto configure
        1. other
Specify disk type (enter its number): 0
Auto configure failed
format>

Try "other" and select a suitable label (the first one is usually the best I think).
Then type "label" to write it.