Solaris boot issues

Hello,
One of our old Solaris boxes is not able to boot; we are getting these messages:

Boot device: 
/pci@400/pci@2/pci@0/pci@e/scsi@0/disk@w5000cca0435991e5,0:a  File and args:
WARNING: cannot open system file: /etc/system
SunOS Release 5.10 Version Generic_147440-10 64-bit
Copyright (c) 1983, 2011, Oracle and/or its affiliates. All rights reserved.

panic[cpu0]/thread=180e000: read_binding_file: /etc/name_to_major file not found

000000000180b640 genunix:read_binding_file+2d8 (1945024, 199c968, 1248ee4, 7ffffc00, 7530, 12a7400)
  %l0-3: fffffcfffd6a6008 fffffcfffd6a6000 ffffffffffffffff 0000000000000000
  %l4-7: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
000000000180b800 genunix:mod_setup+1c (18ab800, 18ab800, 0, 3c00, 1248c00, 199c800)
  %l0-3: 0000000000000000 00000000018a9400 000000000000752b 00000000018a9400
  %l4-7: 0000000000007530 0000000000000005 00000000018ef400 000000000183f800
000000000180b8b0 unix:startup_modules+24 (1914c00, 1c00000, 1878800, 1847400, 80000, 1043400)
  %l0-3: 0000000001918000 000000000190d000 000000000199c400 0000000000000000
  %l4-7: 000000000183ac00 000000000183c400 000000000184ac00 00000000001b0000
000000000180b960 unix:startup+28 (2, 1, 1, 1, 1070800, 1043000)
  %l0-3: 0000000010031305 03b9aca000000000 000000003b8f5850 000000000000001c
  %l4-7: 00000000000003e8 000000003b9e9a8f 000000003b8f5850 00000000010708e4
000000000180ba10 genunix:main+24 (0, 180c000, 18a6840, 10b9c00, 1846058, 1999000)
  %l0-3: 0000000000000001 0000000070002000 0000000070002000 0000000000000000
  %l4-7: 00000000018ef000 0000000000000000 000000000180c000 0000000000000060

skipping system dump - no dump device configured

Can someone help, please?

I have never seen this before.
It boots the kernel but does not see the required files in /etc.
Does it use SVM (Solaris Volume Manager, metastat command etc.)?
If yes, it could be an inconsistency in SVM.
It could also be an inconsistency in the Solaris boot archive. In that case, try from the OBP (OpenBoot PROM, ok prompt):
boot -F failsafe
After a number of failed boots the OBP memory can become corrupted; in that case run
reset-all
to clear it.
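
For reference, a rough Solaris 10 failsafe repair sequence looks like this (just a sketch; the slice name is a placeholder for whatever your real root slice is):

{0} ok boot -F failsafe
# mount /dev/dsk/<root-slice> /a     (mount the suspected root slice)
# bootadm update-archive -R /a       (rebuild the boot archive on the mounted root)
# cd /; umount /a
# init 6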

Hello MadeInGermany,
Thank you for your feedback, but I already tried boot -F failsafe.
I'm getting this output:

{0} ok boot -F failsafe
Boot device: /pci@400/pci@2/pci@0/pci@e/scsi@0/disk@w5000cca0435991e5,0:a  File and args: -F failsafe
SunOS Release 5.10 Version Generic_147440-01 64-bit
Copyright (c) 1983, 2011, Oracle and/or its affiliates. All rights reserved.
Configuring devices.
Searching for installed OS instances...
No installed OS instance found.

Starting shell.
#

It seems it's not seeing the OS.
I'm searching for a procedure for repairing this OS disk.
I tried to mount the root filesystem while in failsafe, but there are only four files/directories in /a/etc (/a is the mount point):

# format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
       0. c0t5000CCA0435B9394d0 <HITACHI-H109060SESUN600G-A31A cyl 64986 alt 2 hd 27 sec 668>  solaris
          /scsi_vhci/disk@g5000cca0435b9394
       1. c0t5000CCA0435991E4d0 <HITACHI-H109060SESUN600G-A31A cyl 64986 alt 2 hd 27 sec 668>  solaris
          /scsi_vhci/disk@g5000cca0435991e4
       2. c0t5000CCA0436743FCd0 <HITACHI-H109060SESUN600G-A31A cyl 64986 alt 2 hd 27 sec 668>  solaris
          /scsi_vhci/disk@g5000cca0436743fc
       3. c0t5000CCA043674454d0 <HITACHI-H109060SESUN600G-A31A cyl 64986 alt 2 hd 27 sec 668>  solaris
          /scsi_vhci/disk@g5000cca043674454
       4. c2t0d0 <DGC-RAID10-0531 cyl 63998 alt 2 hd 256 sec 64>
          /pci@400/pci@2/pci@0/pci@0/SUNW,qlc@0/fp@0,0/ssd@w500601683ee04631,0
       5. c2t0d1 <DGC-RAID10-0531 cyl 61438 alt 2 hd 256 sec 40>
          /pci@400/pci@2/pci@0/pci@0/SUNW,qlc@0/fp@0,0/ssd@w500601683ee04631,1
       6. c2t0d2 <DGC-RAID10-0531 cyl 58494 alt 2 hd 256 sec 64>
          /pci@400/pci@2/pci@0/pci@0/SUNW,qlc@0/fp@0,0/ssd@w500601683ee04631,2
       7. c2t0d3 <DGC-RAID10-0531 cyl 58494 alt 2 hd 256 sec 64>
          /pci@400/pci@2/pci@0/pci@0/SUNW,qlc@0/fp@0,0/ssd@w500601683ee04631,3
       8. c2t0d4 <DGC-RAID10-0531 cyl 58494 alt 2 hd 256 sec 64>
          /pci@400/pci@2/pci@0/pci@0/SUNW,qlc@0/fp@0,0/ssd@w500601683ee04631,4
       9. c2t0d5 <DGC-RAID10-0531 cyl 58494 alt 2 hd 256 sec 64>
          /pci@400/pci@2/pci@0/pci@0/SUNW,qlc@0/fp@0,0/ssd@w500601683ee04631,5
      10. c2t0d6 <DGC-RAID10-0531 cyl 58494 alt 2 hd 256 sec 64>
          /pci@400/pci@2/pci@0/pci@0/SUNW,qlc@0/fp@0,0/ssd@w500601683ee04631,6
- hit space for more or s to select -
# f -^C
# df -h
Filesystem             size   used  avail capacity  Mounted on
/ramdisk-root:a        203M   181M   2.3M    99%    /
/devices                 0K     0K     0K     0%    /devices
ctfs                     0K     0K     0K     0%    /system/contract
proc                     0K     0K     0K     0%    /proc
mnttab                   0K     0K     0K     0%    /etc/mnttab
swap                   248G   320K   248G     1%    /etc/svc/volatile
objfs                    0K     0K     0K     0%    /system/object
sharefs                  0K     0K     0K     0%    /etc/dfs/sharetab
swap                   248G   1.6M   248G     1%    /tmp
/tmp/dev               248G   1.6M   248G     1%    /dev
fd                       0K     0K     0K     0%    /dev/fd
# cd /dev/dsk
# ls |more
c0t5000CCA0435991E4d0s0
c0t5000CCA0435991E4d0s1
c0t5000CCA0435991E4d0s2
c0t5000CCA0435991E4d0s3
c0t5000CCA0435991E4d0s4
c0t5000CCA0435991E4d0s5
c0t5000CCA0435991E4d0s6
c0t5000CCA0435991E4d0s7
c0t5000CCA0435B9394d0s0
c0t5000CCA0435B9394d0s1
c0t5000CCA0435B9394d0s2
c0t5000CCA0435B9394d0s3
c0t5000CCA0435B9394d0s4
c0t5000CCA0435B9394d0s5
c0t5000CCA0435B9394d0s6
c0t5000CCA0435B9394d0s7
c0t5000CCA0436743FCd0s0
c0t5000CCA0436743FCd0s1
c0t5000CCA0436743FCd0s2
c0t5000CCA0436743FCd0s3
c0t5000CCA0436743FCd0s4
c0t5000CCA0436743FCd0s5

# prtvtoc /dev/rdsk/c0t5000CCA0435991E4d0s0
* /dev/rdsk/c0t5000CCA0435991E4d0s0 (volume "solaris") partition map
*
* Dimensions:
*     512 bytes/sector
*     668 sectors/track
*      27 tracks/cylinder
*   18036 sectors/cylinder
*   64988 cylinders
*   64986 accessible cylinders
*
* Flags:
*   1: unmountable
*  10: read-only
*
*                          First     Sector    Last
* Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
       0      2    00  557655084 307207188 864862271
       1      3    01          0 557655084 557655083
       2      5    00          0 1172087496 1172087495
       5      0    00  864862272 307207188 1172069459
       7      0    00  1172069460     18036 1172087495
# mount /dev/dsk/c0t5000CCA0435991E4d0s0 /a
# cd /a
# ls
core              net               reprocess_ccn.sh  tmp
dev               opt               rmdisk            usr
devices           oradiag_root      sbin              var
etc               othnodebackup     secng32           vol
home              platform          sudesh.txt
lost+found        proc              system
# cd /etc
# ^C
# cd /a/etc
# ls
dfs       mnttab    svc       sysevent

The first thing I would do (if you haven't already) is to 'fsck' that root filesystem after booting from the installation media:

ok> boot cdrom -s

Then:

# fsck -n <disk file system>

to see what damage, if any, there is. Obviously: (1) don't mount the filesystem, it needs to be unmounted, and (2) BE SURE to use the -n switch (no write) so fsck reports any damage without trying to fix anything. That gives you time to consider the next step if it's seriously damaged; otherwise, further damage might be done before you can blink.
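
For example (a sketch, assuming the root slice is s0 of c0t5000CCA0435991E4d0, the slice shown mounted on /a earlier, and that it is not mounted when you run this):

ok> boot cdrom -s
# fsck -n /dev/rdsk/c0t5000CCA0435991E4d0s0     (read-only check: reports problems, repairs nothing)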


This looks like somebody did an rm * there:
all the files are gone, only the directories are left.


Is there any file that may contain the command history, to check whether an rm * was run?
Is it possible to rebuild these files (the /etc files) from the CDROM?

Boot from CDROM into single user and restore files (or complete filesystem) from backup.

@maodomd, not sure where Solaris keeps its log files; possibly under the /var/... tree.

By default in Solaris 10 root's home is /, so look in /a:

ls -l /a/.*history*

Perhaps .sh_history

I don't see the boot device, 5000cca0435991e5, listed in the format output or in the ls | more output.

I do see a very similarly named device, 5000cca0435991e4, in the output, but not the 5000cca0435991e5 from the boot device name.

Based on the device path, /pci@400/pci@2/pci@0/pci@e/scsi@0/disk@w5000cca0435991e5,0:a, it looks like it's probably a directly attached disk.

How many drives are physically in the system?

Do you have access to a backup copy of /etc/vfstab to see where root was mounted?

This hints at a problem with a directly attached drive to me.


The disk is present; 5000cca0435991e5 seems to be the SASAddress of 5000cca0435991e4, which is available:

{0} ok probe-scsi

FCode Version 1.00.64, MPT Version 2.00, Firmware Version 9.00.00.00

Target 9
Unit 0 Disk HITACHI H109060SESUN600G A606 1172123568 Blocks, 600 GB
SASDeviceName 5000cca0436743fc SASAddress 5000cca0436743fd PhyNum 0
Target a
Unit 0 Disk HITACHI H109060SESUN600G A606 1172123568 Blocks, 600 GB
SASDeviceName 5000cca0435991e4 SASAddress 5000cca0435991e5 PhyNum 1
Target b
Unit 0 Disk HITACHI H109060SESUN600G A606 1172123568 Blocks, 600 GB
SASDeviceName 5000cca043674454 SASAddress 5000cca043674455 PhyNum 2
Target c
Unit 0 Disk HITACHI H109060SESUN600G A606 1172123568 Blocks, 600 GB
SASDeviceName 5000cca0435b9394 SASAddress 5000cca0435b9395 PhyNum 3

@maodomd I've had problems booting with the wrong SAS address (...4 vs ...5) before. I don't remember the particulars, just that using the other address allowed the system to boot normally.

I would encourage you to try to boot using the SASDeviceName, 5000cca0435991e4, instead of the SASAddress, 5000cca0435991e5, and see if that makes any difference.

At least if the system is still down.
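
Something like this from the ok prompt, i.e. your existing boot device path with the w...e4 device name in place of the w...e5 address (a sketch only; I can't verify the path on your box):

{0} ok boot /pci@400/pci@2/pci@0/pci@e/scsi@0/disk@w5000cca0435991e4,0:a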

I already tried to force a boot from that disk, without success.
Right now we are trying to restore the root fs from a tape that has been found (an old backup), but we are getting these errors (tar: directory checksum error):

mt -f /dev/rmt/0 status

HP Ultrium LTO 4 tape drive:
sense key(0x0)= No Additional Sense residual= 0 retries= 0
file no= 0 block no= 0

mt status

HP Ultrium LTO 4 tape drive:
sense key(0x0)= No Additional Sense residual= 0 retries= 0
file no= 0 block no= 0

tar tvf /dev/rmt/0n

tar: directory checksum error

Even ufsrestore does not seem to work:

mt -f /dev/rmt/0mbn stat

HP Ultrium LTO 4 tape drive:
sense key(0x0)= No Additional Sense residual= 0 retries= 0
file no= 0 block no= 0

tar -tvf /dev/rmt/0m

tar: directory checksum error

ufsrestore ivf /dev/rmt/0n

Verify volume and initialize maps
Media block size is 20
Volume is not in dump format

In my early Unix life I always put a piece of paper in the tape box, stating date, format, purpose...

Perhaps you can copy the tape contents to a file?
Then examine the file with the file command.

cat < /dev/rmt/0 > tapefile
file tapefile
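
If cat stops early because of the tape record size, dd with a generous block size is a common alternative (a sketch; the block size, count and file name are just examples, not taken from your setup):

mt -f /dev/rmt/0n rewind
dd if=/dev/rmt/0n of=/tmp/tapehead bs=1024k count=32     (copy only the first records)
file /tmp/tapehead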

Finally I was able to read the cartridge after several restarts of the tape drive :wink:
Now I'm trying to find a way to restore the root folders. ufsrestore is not working because it's not a dump tape.

Are tar, cpio, or pax able to recognize anything on the tape now that you can read it?

As @MadeInGermany says, dumping (the start of) the tape to a file and running file on it may give you some more insight.

od and xxd might also provide some insight, but I find their availability to be inconsistent.
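
For example, a POSIX/ustar tar archive has the string ustar at byte offset 257 of its first 512-byte header block, so on the file copied from the tape (using the /tmp/tapehead name from the sketch above; substitute whatever you called it):

od -c /tmp/tapehead | head -40

If it really is tar you should see u s t a r on the line at octal offset 0000400. Very old tar formats carry no magic string at all, so its absence alone doesn't prove the tape isn't tar.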

I was wondering if tar can help restore whole folders like /etc and /usr, because all the options I see seem related to specific files.

Yes, you can use 'patterns' (search for 'tar patterns') to restore specific files or directory trees.

First you need to know whether the archive (tar = Tape ARchive) has its files stored with an absolute path starting with / or a relative path starting with .

Then you can construct a pattern, e.g. ./etc/* to restore all files under ./etc. The pattern has to match the path as stored on the tape. Use the -t switch to list the tape and see. The -t switch will not restore anything, only list the tape, so it won't do any damage.

Obviously, if you are going to restore using a relative path pattern then you need to be in the directory where you want those files to go, e.g.

# cd /
# tar -xvf <archive device> ./etc/*

Omit the . if the path is absolute. This would restore everything from /etc downwards.

Any other questions, just ask. There are loads of people on here who can help.

A glob pattern for tar must be in quotes, because the shell should not expand it. The shell will strip the quotes and hand the literal pattern to tar.

tar -xvf <archive device> "./etc/*"
tar -xvf <archive device> "etc/*"

The pattern must exactly match the stored pathnames; there is a difference between ./etc and etc and /etc

tar is recursive: it is enough to give just the top directory (or directories), and tar will extract all subdirectories and all files in the tree.

tar -xvf <archive device> "./etc"
tar -xvf <archive device> "etc"

It makes sense to first browse the full archive or a part of it.

tar -tvf <archive device>
tar -tvf <archive device> "etc"
tar -tvf <archive device> "/etc" "/usr"
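
Putting the pieces together, a possible restore sequence from the failsafe shell might be (a sketch, assuming the tape stores relative paths and that s0 of c0t5000CCA0435991E4d0 really is the root slice; the bootadm step is my own addition to refresh the boot archive after the restore):

# mount /dev/dsk/c0t5000CCA0435991E4d0s0 /a
# cd /a
# tar -tvf /dev/rmt/0n | head        (confirm how the pathnames are stored first)
# tar -xvf /dev/rmt/0n "./etc"       (or "etc", to match what -t showed)
# bootadm update-archive -R /a
# cd /; umount /a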
