boot failure-init died

Hello.

system failed on reboot this AM.
received a message that the kernel could not find the process that caused the crash.
system does a memory dump successfully, then tries to boot again.
looking further at the messages during this time, I'm getting 'init died with return value 256'...system panic: init died.
it also says to check init's execute permission, init's location, and the root partition's location.

nothing of note done to system to cause this yesterday.

system is an older HP-UX 10.20 version (I'm helping the company liquidate, so no need to laugh/gag about the o/s version please.)

system keeps cycling through this process...can't get past.
any assistance appreciated.

Drew

I would start by seeing if I can boot manually in single user:

boot <pri>
interact with IPL > yes
and at ISL prompt > hpux -is

If that works, then try to mount the vg00 filesystems...
Then check for any corruption in files like /etc/inittab, /etc/ioctl.syscon etc... their timestamps etc...
Also look at /sbin/init, e.g.:
-r-xr-xr-x 1 bin bin 278528 May 20 1998 /sbin/init

If all seems normal try init 2
then init 3 ...
If things go wrong again after changing runlevel, start all over again, but once vg00 is mounted edit /etc/inittab and comment out everything from the powerfail line onward,
then change runlevel again, or execute the commented-out lines one after the other until you hit the one giving trouble...
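
A rough sketch of the "comment out from this line onward" idea in POSIX shell and awk. The function name comment_from is made up for illustration, and the pattern ^powf in the usage line is only an assumption about how the powerfail entry is labelled; check your own /etc/inittab first. It writes to stdout, so the original file stays untouched until you decide to copy the result over it.

```shell
# Sketch: print FILE with every line from the first match of PATTERN
# onward commented out. Lines that are already comments are left alone.
# Usage (pattern is an assumption, adjust to your inittab):
#   comment_from '^powf' /etc/inittab > /tmp/inittab.test
comment_from() {
    awk -v pat="$1" '
        !hit && $0 ~ pat { hit = 1 }          # first match opens the region
        hit && !/^#/     { print "#" $0; next }
        { print }
    ' "$2"
}
```

To find the entry giving trouble, the lines can then be uncommented one at a time between runlevel changes.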

Good luck

trying to boot in single user still hangs at same place.

How did you go single user? By single user I meant maintenance mode (no disk yet...)
What model?

during boot, choose to go to IPL.
choose hpux -iS
with interactive IPL help.

same problem.
HPUX10.20
hp9000/800 k class server.

trying to boot from ignite tape gives me ipl error: bad LIF magic.
tape seems to be readable (cleaned drive to be sure).

gonna try & find older ignite for booting as well...
any other suggestions?

Well yes...
Try again with "hpux -lm"
If this doesn't work, then since you have not yet mounted any filesystem, we can forget about a corrupt /stand/vmunix, /etc/inittab or /sbin/init...
That leaves us with: missing LIF, corrupt root filesystem and bad BDRA... (that is what comes to my mind at the moment...)

Your guess?

Mind you, thinking of it, you haven't tried to boot from the alternate kernel:
"hpux /stand/vmunix.prev", but I wouldn't expect this to work since the tape boot was unsuccessful.

While at it - I will go and look on HP ITRC and keep you informed...
Do you have any support from HP or an account?

Here is something for you! From HP Technical knowledge base:

How do I fix the HPUX boot message panic: init died?

DocId: KBRC00001547
Updated: 6/24/00 7:48:09 AM

PROBLEM

A system panics on boot with a message init died:

init died with return value 256.
Please check for init's execute permission,
init's location and the root partition's location.

panic: (display==0xb800, flags==0x0) init died

RESOLUTION

There are various causes and possible fixes for panic "init died".

Below is a summary of possible causes:
o Corrupt LVM Boot Data Reserved Area (BDRA)
o Corrupt Root filesystem
o Corrupt Autoboot file
o Missing LIF
o Corrupt /stand/vmunix
o Corrupt /etc/inittab
o Corrupt /sbin/init

Here is a list of other documents in the ITRC you might need to help with
this recovery.

How do I fix the boot message IPL error bad LIF magic at 10.X? (KBRC00001074)
How do I fix the boot message IPL error bad LIF magic at 11.X? (KBRC00001355)
How do I boot my system from Support Media on 10.X ? (OZBEKBRC00000581)
How do I boot my system from Support Media on 11.00 ? (OZBEKBRC00000582)
How do I boot my system in single user mode or other mode ? (OZBEKBRC00000607)

1.0 We know the system panics when booting /stand/vmunix. We must
be able to boot an alternate way.

  If you need specific help on booting please refer to:
  How do I boot my system in single user mode or other mode ?

1.1 First try booting into Maintenance mode on the primary disk.
Boot primary disk and interact with ISL.
At the ISL prompt enter "hpux -lm"
If this works you will get a prompt.
Perform steps 2.0, 2.1, 2.3, 2.6 and 2.7

1.2 If you were unable to boot the system from step 1.1 then try booting
the alternate kernel on the primary disk.
Boot primary disk and interact with ISL.
At the ISL prompt enter "hpux /stand/vmunix.prev"
If this works the system will boot the previous kernel.
Perform steps 2.3 and 2.5

1.3 If you were unable to boot the system from step 1.2 then try
booting the alternate kernel in Maintenance mode
Boot primary disk and interact with ISL.
At the ISL prompt enter "hpux -lm /stand/vmunix.prev"
If this works you will get a prompt.
Perform steps 2.0, 2.1, 2.3, 2.5, 2.6 and 2.7

1.4 If you were unable to boot the system from step 1.3 then try booting
from the mirror (if configured) disk and do not interact with ISL.
If this works the system will boot and sync the primary disk.
Perform steps 2.1 and 2.3

1.5 If you were unable to boot the system from step 1.4 then try booting
from the mirror (if configured) disk and interact with ISL
At the ISL prompt enter "hpux -lm"
If this works you will get a prompt.
Perform steps 2.0, 2.1, 2.3, 2.6 and 2.7

1.6 If you were unable to boot the system from step 1.5 then try booting
from the alternate kernel off the mirror disk (if configured).
Boot the mirror disk and interact with ISL.
At the ISL prompt enter "hpux /stand/vmunix.prev"
If this works the system will boot the previous kernel.
Perform steps 2.3 and 2.5

1.7 If you were unable to boot the system from step 1.6 then try booting
from the alternate kernel in Maintenance mode off the mirror disk
(if configured).
Boot the mirror disk and interact with ISL.
At the ISL prompt enter "hpux -lm /stand/vmunix.prev"
If this works you will get a prompt.
Perform steps 2.0, 2.1, 2.3, 2.5, 2.6 and 2.7

1.8 If you were unable to boot the system from the above steps then
boot the "SUPPORT MEDIA" to make the corrections.

  Follow the instructions on how to boot the SUPPORT media, activate
  VG00 and mount the required lvols from:

  How do I boot my system from Support Media on 10.X ?
  How do I boot my system from Support Media on 11.00 ?

  Perform steps 2.1, 2.2, 2.3, 2.4, 2.5, 2.6 and 2.7

Below are the steps to correct specific issues.

2.0 How to activate vg00
After booting in "-lm" mode you will need to activate VG00, perform
a filesystem check and mount the lvols in VG00. Then we will have
the commands necessary to perform the recovery.

 enter "vgchange -a y /dev/vg00"    This will activate VG00

 enter "fsck -y"      This will perform a filesystem check of all
                      lvols in the fstab file. We are not concerned
                      about errors on any volume group besides VG00
                      since they were not activated they will fail.

 enter "mount -a"     This will perform a mount of all lvols in
                      the fstab file. We are not concerned about
                      errors on any volume group besides VG00
                      since they were not activated they will fail.

 enter "/usr/bin/bdf" This will show the mounted lvols.

 Now you have the needed HPUX commands.
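
The sequence in step 2.0 can be collected into one small POSIX shell function. This is only a sketch: run and activate_vg00 are hypothetical helper names, not HP-UX commands, and DRY_RUN defaults to 1 here so that nothing is executed unless you set DRY_RUN=0 yourself.

```shell
# Sketch of step 2.0. With DRY_RUN=1 (the default) each command is only
# printed; with DRY_RUN=0 it is printed and then executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    echo "+ $*"
    [ "$DRY_RUN" = 1 ] || "$@"
}

activate_vg00() {
    run vgchange -a y /dev/vg00 || return 1   # stop if vg00 will not activate
    run fsck -y          # errors for volume groups other than vg00 are expected
    run mount -a         # same: lvols outside vg00 will fail to mount
    run /usr/bin/bdf     # show what got mounted
}
```

Calling activate_vg00 as-is just lists the four commands in order, which is a cheap way to double-check them before typing them at the maintenance-mode prompt.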

2.1 How to correct the LVM Boot Data Reserved Area (BDRA)
Typically, if the BDRA is corrupt, the system will still boot in
maintenance mode, but fail to boot in single-user or multi-user mode.
To check the BDRA, you will need to boot maintenance mode or boot
from the Support CD.

 Check the Boot, Root, Swap, and Dump areas.  Use lvlnboot to add
 any missing information.  One common problem is a missing Boot lvol.
 The following example assumes JFS is used for the root filesystem
 (meaning you have a separate lvol for /stand).

 enter "lvlnboot -v /dev/vg00"

 This sample shows a primary boot disk and a mirror boot disk.

 Boot Definitions for Volume Group /dev/vg00:
 Physical Volumes belonging in Root Volume Group:
   /dev/dsk/c0t6d0 (10/0.6.0) -- Boot Disk  <--- Primary boot disk
   /dev/dsk/c0t5d0 (10/0.5.0) -- Boot Disk  <--- Mirror boot disk
  Boot: lvol1     on:     /dev/dsk/c0t6d0
  Boot: lvol1     on:     /dev/dsk/c0t5d0 <---------- Mirror boot disk
  Root: lvol3     on:     /dev/dsk/c0t6d0
  Root: lvol3     on:     /dev/dsk/c0t5d0 <---------- Mirror boot disk
  Swap: lvol2     on:     /dev/dsk/c0t6d0
  Swap: lvol2     on:     /dev/dsk/c0t5d0 <---------- Mirror boot disk
  Dump: lvol2     on:     /dev/dsk/c0t6d0, 0

  BOOT:  lvol1
        When '/stand' is on a separate logical volume.
        If you're not sure, check /etc/fstab for a separate lvol for /stand.

  ROOT: lvol3
        When /stand is a separate lvol then root is usually on lvol3.

  ROOT: lvol1
        When /stand is NOT a separate lvol then root is on lvol1.

  Even though the BDRA looks correct it may still be corrupt, so you
  should rewrite the BDRA. The commands below will rewrite a typical
  BDRA.

    "lvrmboot -r /dev/vg00"          Removes the info from the BDRA
    "lvlnboot -b /dev/vg00/lvol1"    Writes the boot info to the BDRA
    "lvlnboot -r /dev/vg00/lvol3"    Writes the root info to the BDRA
    "lvlnboot -s /dev/vg00/lvol2"    Writes the swap info to the BDRA
    "lvlnboot -d /dev/vg00/lvol2"    Writes the dump info to the BDRA
    "lvlnboot -R /dev/vg00"          Updates the BDRA
    "lvlnboot -v /dev/vg00"          Displays the BDRA
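
As a quick check for the "missing Boot lvol" case mentioned in 2.1, here is a small sketch that scans lvlnboot -v output for the four definition lines. check_bdra is a hypothetical name, and the parsing assumes the output looks like the sample shown above. It can only tell you that an entry is absent; as noted, a BDRA that looks complete may still be corrupt.

```shell
# Sketch: read "lvlnboot -v" output on stdin and report which of the
# Boot/Root/Swap/Dump definition lines are missing.
# Usage: lvlnboot -v /dev/vg00 | check_bdra
check_bdra() {
    awk '
        /^[ \t]*(Boot|Root|Swap|Dump):/ {
            sub(/^[ \t]+/, "")      # drop leading blanks
            sub(/:.*/, "")          # keep only the label
            seen[$0] = 1
        }
        END {
            n = split("Boot Root Swap Dump", want, " ")
            for (i = 1; i <= n; i++)
                if (!(want[i] in seen)) { print "missing: " want[i]; bad = 1 }
            exit bad + 0            # non-zero when something is missing
        }
    '
}
```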

2.2 How to fix a Root filesystem

 Depending on the corruption of the root filesystem you might have to
 boot the support media and fsck the filesystem. The fsck may fix the
 corruption or indicate that there is a bad block. If fsck cannot fix
 the root or stand filesystem then HP-UX will have to be reinstalled.


 You will need to boot the "SUPPORT MEDIA" and fsck your boot disk.

 If you need specific help please refer to:
    How do I boot my system from Support Media on 10.X ?
    How do I boot my system from Support Media on 11.00 ?

2.3 How to correct an Autoboot file

Rewrite the boot LIF (autoboot file).
Sometimes the AUTO string looks correct but it is corrupt and it should
be rewritten with the full information.

In this example the boot disk is at hardware address 10/0.6.0 which is
equal to /dev/dsk/c0t6d0.

enter "mkboot -a "hpux (10/0.6.0;0)/stand/vmunix" /dev/rdsk/c0t6d0"

To verify the AUTO string:
enter "lifcp /dev/rdsk/c0t6d0:AUTO -"

You should see: "hpux (10/0.6.0;0)/stand/vmunix"
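
Since the AUTO string has to match exactly, a tiny helper can build the expected string from the hardware path and compare it with what lifcp returns. A sketch only: auto_string and check_auto are hypothetical names, and the string layout is taken from the 10/0.6.0 example above.

```shell
# Sketch: build the expected AUTO string for a given hardware path.
auto_string() {
    printf 'hpux (%s;0)/stand/vmunix\n' "$1"
}

# Compare the expected string with what was read from the LIF.
#   $1 = hardware path, $2 = AUTO contents
# Usage: check_auto 10/0.6.0 "$(lifcp /dev/rdsk/c0t6d0:AUTO -)"
check_auto() {
    [ "$2" = "$(auto_string "$1")" ]
}
```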

2.4 How to write a missing LIF

 Refer to:
 How do I fix the boot message IPL error bad LIF magic at 10.X?
 How do I fix the boot message IPL error bad LIF magic at 11.X?

2.5 How to replace a bad /stand/vmunix
If you were able to boot /stand/vmunix.prev then you should restore
/stand/vmunix from backup. If a backup is not available then copy
/stand/vmunix.prev to /stand/vmunix and make a new kernel.

 If you were not able to boot the system disks then you will need to
 boot off the Support Media. From the SUPPORT media you should recover
 your /stand/vmunix from backup and reboot. The other choice would be
 to load a generic kernel from the SUPPORT media, reboot and then
 remake a kernel with SAM.

Refer to:
How do I boot my system from Support Media on 10.X ?
How do I boot my system from Support Media on 11.00 ?

2.6 How to replace a bad /etc/inittab
Recover /etc/inittab from backup or copy the default file from
/usr/newconfig/etc/inittab and edit the file for your system.

2.7 How to replace a bad /sbin/init
Recover /sbin/init from backup and verify the permissions are
555 with bin/bin as user/group. Also /etc/init should be
symbolically linked to /sbin/init.
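
The permission check in 2.7 can be scripted roughly like this. check_init is a hypothetical helper; it parses ls -l output, so it assumes the usual column layout (mode, links, owner, group), and it defaults to the bin/bin ownership given above.

```shell
# Sketch: succeed only if FILE is mode 555 (-r-xr-xr-x) and owned by the
# expected user and group (default bin/bin).
# Usage: check_init /sbin/init && echo "init looks sane"
check_init() {
    ls -lL "$1" | awk -v o="${2:-bin}" -v g="${3:-bin}" '
        { ok = (substr($1, 1, 10) == "-r-xr-xr-x" && $3 == o && $4 == g) }
        END { exit !ok }    # no input (missing file) also counts as failure
    '
}
```

The symbolic link can be checked separately with [ -h /etc/init ], and ls -l /etc/init will show where it points.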

-----

I can't do more for you at the moment... keep us informed and good luck

i'm able to boot using hpux -lm.
however, trying to activate my volume group results in error:
warning: can't attach to phys volume /dev/dsk/c0t6d0
cross-device link
vgchange: warning: can't query phys volume /dev/dsk/c0t6d0
specified path doesn't correspond to phys volume attached to this volume group
couldn't activate /dev/vg00

does this mean i have a bad disk?
is there some other way to verify?

Andrew

note: there is no vmunix.prev file

The real question is: was vg00 mirrored? If so, then yes, a disk may be dead somewhere, but only you can tell us whether it's the primary or the alternate. It could still be a missing or corrupted LIF, though.
If you did have mirrored disks, try to boot with the no-quorum option:
hpux -lm -lq

Looks like a disk failure.
found a recovery tape and tried to boot from it.
received an error at the point of creating the lvm physical volume:
'pv create: writing lvm record: I/O error'.
'ERROR: Command '/sbin/pvcreate -f -B /dev/rdsk/c0t6d0' failed'

So you were not with mirrored disks?

correct-
system's hpux 10.20...12 years old...non-supported.

Well, you could use external disks for vg00, if you have spare SCSI cards...
Here is a K360 where I boot from a small HDS 5750 (no mirrors either, but alternate paths on FW-SCSI):
aco $ vgdisplay -v vg00
--- Volume groups ---
VG Name /dev/vg00
VG Write Access read/write
VG Status available
Max LV 255
Cur LV 12
Open LV 12
Max PV 16
Cur PV 1
Act PV 1
Max PE per PV 5000
VGDA 2
PE Size (Mbytes) 4
Total PE 2842
Alloc PE 1892
Free PE 950
Total PVG 0
Total Spare PVs 0
Total Spare PVs in use 0

--- Logical volumes ---
LV Name /dev/vg00/lvol1
LV Status available/syncd
LV Size (Mbytes) 100
Current LE 25
Allocated PE 25
Used PV 1

LV Name /dev/vg00/lvol2
LV Status available/syncd
LV Size (Mbytes) 1024
Current LE 256
Allocated PE 256
Used PV 1

LV Name /dev/vg00/lvol3
LV Status available/syncd
LV Size (Mbytes) 100
Current LE 25
Allocated PE 25
Used PV 1

LV Name /dev/vg00/lvol4
LV Status available/syncd
LV Size (Mbytes) 300
Current LE 75
Allocated PE 75
Used PV 1

LV Name /dev/vg00/lvol5
LV Status available/syncd
LV Size (Mbytes) 252
Current LE 63
Allocated PE 63
Used PV 1

LV Name /dev/vg00/lvol6
LV Status available/syncd
LV Size (Mbytes) 800
Current LE 200
Allocated PE 200
Used PV 1

LV Name /dev/vg00/lvol7
LV Status available/syncd
LV Size (Mbytes) 712
Current LE 178
Allocated PE 178
Used PV 1

LV Name /dev/vg00/lvol8
LV Status available/syncd
LV Size (Mbytes) 932
Current LE 233
Allocated PE 233
Used PV 1

LV Name /dev/vg00/lvol9
LV Status available/syncd
LV Size (Mbytes) 2048
Current LE 512
Allocated PE 512
Used PV 1

LV Name /dev/vg00/lvol10
LV Status available/syncd
LV Size (Mbytes) 700
Current LE 175
Allocated PE 175
Used PV 1

LV Name /dev/vg00/lvol11
LV Status available/syncd
LV Size (Mbytes) 400
Current LE 100
Allocated PE 100
Used PV 1

LV Name /dev/vg00/lvol12
LV Status available/syncd
LV Size (Mbytes) 200
Current LE 50
Allocated PE 50
Used PV 1

--- Physical volumes ---
PV Name /dev/dsk/c1t0d0
PV Name /dev/dsk/c0t0d0 Alternate Link
PV Status available
Total PE 2842
Free PE 950

aco $
When I did this in '98 it was because the system booted 2x faster than from the K's internal disks (still in the box, unused...)
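
If you end up rebuilding vg00 on another disk, it helps to know how much space the old layout really used. Here is a sketch that totals a vgdisplay -v listing for a single volume group; pe_summary is a hypothetical name, and the parsing assumes the field labels shown in the listing above.

```shell
# Sketch: read "vgdisplay -v" output for ONE volume group on stdin,
# convert the PE counts to MB and cross-check them against the LVs.
# Usage: vgdisplay -v vg00 | pe_summary
pe_summary() {
    awk '
        /^[ \t]*PE Size/                 { pe_mb = $NF }
        /^[ \t]*Total PE/ && !seen_total { total = $NF; seen_total = 1 }
        /^[ \t]*Alloc PE/                { alloc = $NF }
        /^[ \t]*Free PE/ && !seen_free   { free = $NF; seen_free = 1 }
        /^[ \t]*Allocated PE/            { lv_sum += $NF }   # one per LV
        END {
            printf "total=%dMB alloc=%dMB free=%dMB lv_sum=%dPE\n",
                   total * pe_mb, alloc * pe_mb, free * pe_mb, lv_sum
            # fail if the per-LV sum or the alloc+free total does not add up
            exit !(lv_sum == alloc && alloc + free == total)
        }
    '
}
```

Fed the K360 listing above, it should print total=11368MB alloc=7568MB free=3800MB lv_sum=1892PE, i.e. the twelve lvols account for all 1892 allocated extents.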