FATAL BOOT ERROR: Can’t load stage 3

I'm under huge pressure to recover data from an old server that is booted only very infrequently. The data on it is urgently required (isn't it always?!).
..
The server is an HP DL380 G5 running SCO UnixWare 7.1.1.
..
It has 3 partitions in a RAID 5 configuration: one of 10GB (I assume it boots off this one), one of 400GB, and one of about 150GB. All partitions are marked OK.
The error on booting is "FATAL BOOT ERROR: Can't load stage 3".
It has no floppy drive, only a CD-ROM drive.
..
I downloaded an ISO CD image from the SCO support site labelled "UnixWare-7.1.4-June2008 ISO Image", in the hope that it would give me access to the OS, where I could use my scripting skills to check for issues (e.g. mount drives, run fsck, etc.).
..
I am able to boot off that, but I end up in a basic BCP - /stand - (which includes boot and unix files). From this point I'm not sure how to use it or where to go next.
I would be very, very grateful for some quick step-by-step assistance with recovering the system: what to check for, etc.
Thanks in advance!!!

The contents of /stand should look like this. This is a 7.1.4 system using a single SATA disk.

total 10980                                                            
-r--r--r--    1 root     sys            1053 May 11  2004 bfs.blm      
-rw-r--r--    1 root     sys             143 Jun 24 18:38 boot         
-r--r--r--    1 root     sys            1869 May 13  2008 bootmsgs     
-r--r--r--    1 root     sys           10116 May 11  2004 dcmp.blm     
-r--r--r--    1 root     sys            2071 May 11  2004 hd.blm       
-r--r--r--    1 root     sys            3486 May 11  2004 help.txt     
-rw-r--r--    1 root     sys             940 Jun 26 09:45 license      
-rw-r--r--    1 root     sys              22 Jun 26 09:45 license.sp   
-r--r--r--    1 root     sys           11468 May 11  2004 logo.img     
-r--r--r--    1 root     sys            5664 May 11  2004 platform.blm 
-rw-r--r--    1 root     sys           17809 Jun 26 09:52 resmgr       
-rw-r--r--    1 root     root          17617 Jun 26 09:45 resmgr.sav   
-r--r--r--    1 root     sys             913 May 11  2004 smallfs.blm  
-r--r--r--    1 root     sys           10900 May 11  2004 stage3.blm   
-rwxr--r--    1 root     root        2766800 Jun 24 19:20 unix         
-rwxr--r--    1 root     sys         2766780 Jun 24 19:10 unix.old     
#                                                                      

I am surprised that you were able to mount the RAID system, as the list of HBAs on the original CD is quite short. Or did you have to load the driver from a separate CD?


I think I have confused you with the way my original post was worded;
In my post, when I said "I am able to boot off that <meaning the CD I downloaded> but end up in a basic BCP - /stand - (which includes boot and unix files)", what I meant was that I end up on the CD itself with its BCP (/stand, the basic one you listed). Nothing is mounted at this point. All I have is a [boot] prompt.
So now I am unsure how to boot to a basic version of Unix and then from there how to (1) mount the RAID system and then (2) from there proceed with troubleshooting and (3) recovery.
..
It may be that the best way forward is for me to run an UPGRADE process on the system (it is SUPPOSEDLY on version 7.1.1) to upgrade it to 7.1.4.
I have downloaded <UW714+_ISL_1.0.0Ds.iso> which I could use for this purpose.
Please advise?

714+ is designed for use with VMware, not as a standalone system. Use the 714 cut 2 cd.
The Emergency boot Cd is specific to an installation.
Installing 714 on the system may result in all the file systems being formatted.
I suggest you do the following, which is still high risk.
  1. Download the supplementary HBA cd. I am assuming that the HPSAS driver is required.
  2. Make a note of, or save, the current RAID configuration.
  3. Remove all the drives from the system, and carefully mark which slot they came from.
  4. Acquire a single disk, install it, and create a new RAID0 configuration.
  5. Install 7.1.4, and in this process create the same file system names as on the original. The actual size of the file systems does not matter.
  6. Create an emergency boot cd, or download and install Microlite Edge (microlite.com) and create their recovery CD.
  7. Remove the single hard drive and replace it with the RAID5 configuration.
  8. Boot from the newly made recovery cd, and run fsck on the root file system and the stand file system.


Many thanks for your great advice! I truly appreciate it.
..
One more thing: I would value your response to the following question, which I may well be asked:

  • The process of upgrading from 7.1.1 to 7.1.4 is well documented
  • Many companies across the world would have followed this process
  • And presumably their data file systems did not get over-written or recreated (albeit they probably had a backup!)
  • Why do you think that this is such high-risk in this case?
    Not that I doubt your advice but just that I will most probably get asked this question by my client when I ask them to acquire a new disk

I just read the Unixware 714 getting started guide.
An upgrade install is possible. But, it installs using pkgadd. You must be able to boot the 711 system in order to do the upgrade. This procedure is different to the Openserver 5 upgrade installation procedure, where you boot from the newer version CD and select the upgrade install.
For the fresh install (of 714), there does not appear to be an option to preserve non-root file systems that are on disk 0. So if your RAID5 system is configured as a single logical drive with multiple partitions, you have no secondary drives, and every file system sits on disk 0 and is at risk.
The RAID controller supports both SAS and SATA drives, although not mixed. I have not checked to see if a standard 2.5in SATA drive will fit in the carrier, or whether you have to purchase an HP drive.
http://sco.com/support/docs/unixware/uw714en/getstart/getstart.pdf

I am so grateful for all your efforts and for your extremely valuable advice. Mostly I am blown away that you used your precious time to actually read the getting started guide just to assist me. THANK YOU jgt! I salute you Sir.

Hi jgt, I have done the above as suggested, so now I have the RAID5 system booted with an emergency boot cd...
But I am not sure how to proceed.
I have tried to fsck the root filesystem but there isn't a vfstab on the system so I get an error.
I have tried running an fsck (the VXFS version) on all the devices in /dev/rdsk but that just resulted in an error from most of them.
So I am now unsure what steps to take to troubleshoot the original issue?
I can recreate the vfstab to be the same as the vanilla install I did on a single disk from which I created the emergency cd. Would that help??

Does this help?
I cannot run fsck or vi from UnixWare 7 emergency floppy set.

However, you may find that Microlite Edge is easier to use, as its recovery process is all menu driven. While it is a commercial product, it does have a 60-day full-feature trial.


Sadly not. When I run umountall I get "Invalid argument".
When I run "fsck -F vxfs /dev/rdsk/c0b0t0d0s1" I get:
"invalid super-block
UX:vxfs fsck: ERROR: cannot initialize aggregate file system check failure,aborting ..."

---------- Post updated at 07:14 PM ---------- Previous update was at 07:07 PM ----------

Microlite's response was "Unfortunately, not having a current BackupEDGE backup RecoverEDGE will be of no use to you. Being that you're in the position you're in with a system that cannot boot, my advice would be to contact a data restoration company"

Install Microlite Edge on the RAID0 system.
Create the recovery CD.
Boot from the recovery CD to confirm that it works.
Re-install the RAID5.
Boot from the recovery CD.
What if?
Assuming that the data that you want is not in the root file system, create a backup of the RAID0 system, boot from the recovery CD and restore the root and stand file systems to the RAID5 disks. Then you should be able to boot 7.1.4 from the RAID5 system, and add the mount points and mount the non-root file systems.

Hmmmm! May be worth a try. Thanks! I'll suggest it to the client (who are now sick of me).
But not tonight ... Enough trouble for one day! Thanks again. I'll let you know if it works.

I have never had a problem creating an Edge Recovery CD. However, in the past I have only installed it on Openserver. I will try this evening to duplicate your problem. Failing that, it is a long weekend in the US, so there will be no one at Microlite till Tuesday.
It is possible that the HBA driver was copied into the kernel during installation, but the library was not saved, and is not available to Edge to create its kernel.

Issue is:
I'm in the process of trying to get Microlite RecoverEdge to create a bootable recovery CD ISO.
It is a case of 3 steps forward and 2 steps backwards! Solve one error and then hit another! But I'm finally nearly there. Just one more issue to conquer.
Namely that as it is creating the ISO I get an error "Error Enabling Statics" (not statistics).
..
The Online Help says of this error: "One of the libraries could not be found. Make sure the program whose library dependency caused the error can be executed normally on the system. This could occur if a program's required libraries do not exist. To see what those libraries are, run ldd(1) giving the violating file as an argument. This could also mean the required library was not in the environment variable 'LD_LIBRARY_PATH' or that it was not in /etc/ld.so.conf (see ldconfig(8))."
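
The ldd check that the help text recommends can be wrapped in a small shell function. This is only a sketch: missing_libs is a hypothetical helper name, and the "not found" wording it searches for is an assumption about how this system's ldd reports an unresolved library, so adjust the pattern to whatever your ldd actually prints.

```shell
# List the libraries that ldd reports as unresolved for one binary.
# ASSUMPTION: unresolved entries contain the words "not found".
missing_libs() {
    ldd "$1" 2>&1 | grep -i 'not found'
}

# Example (the thread ran "ldd edge"):
#   missing_libs ./edge
```

An empty result only means the pattern did not match; it is still worth reading the full ldd output once by eye.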

Having fiddled and faffed with LD_LIBRARY_PATH, setting it to all the libraries mentioned when I ran the command "ldd edge", and getting nowhere, I found a way to debug edgemenu as follows:
..

  1. export EDGE_DEBUG=enable
  2. Start Edge with edgemenu 2>/tmp/edgedebug
    ..
    Produces a nice debug file which I've uploaded and attached ... but have yet to decipher. Perhaps you may have some idea?
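
To make a long debug file easier to read, the two steps above can be followed by a keyword scan. A minimal sketch: EDGE_DEBUG and the stderr redirect come from the steps above, while scan_edge_debug is a hypothetical helper and its keyword list is a guess at what failure lines look like, not something taken from Microlite's documentation.

```shell
# Print numbered lines from the debug log that look like failures.
# ASSUMPTION: failures mention one of these keywords (case-insensitive).
scan_edge_debug() {
    log=${1:-/tmp/edgedebug}
    grep -n -i -e 'error' -e 'cannot' -e 'not found' -e 'no such' "$log"
}

# Typical session:
#   EDGE_DEBUG=enable; export EDGE_DEBUG
#   edgemenu 2>/tmp/edgedebug
#   scan_edge_debug /tmp/edgedebug
```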

The issue appears to have been a missing "dac" module from /etc/conf/sdevice.d when RecoverEdge was trying to build its kernel image. The dac module relates to the implementation of ACL.
I merely copied one of the other module files (pci) in /etc/conf/sdevice.d, edited and changed all references from "pci" to "dac" and then changed the N to a Y in the file. After which the new recovery kernel and image was successfully created ...
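
For anyone repeating this fix, here is a minimal sketch of the copy-and-edit step as a shell function. It assumes the module's own line starts with the module name and carries the N/Y flag as a separate whitespace-delimited field; clone_sdevice is a hypothetical helper name, and the layout should be checked against a real file in /etc/conf/sdevice.d before relying on it.

```shell
# Copy an existing sdevice.d entry to a new module name and enable it.
# ASSUMPTION: the entry line looks like "<name> N ..." with N as its own field.
clone_sdevice() {
    dir=$1 src=$2 dst=$3
    tab=$(printf '\t')

    if [ ! -f "$dir/$src" ]; then
        echo "missing $dir/$src" >&2
        return 1
    fi
    if [ -e "$dir/$dst" ]; then
        echo "$dir/$dst already exists, not overwriting" >&2
        return 1
    fi

    # 1) rename every reference from src to dst
    # 2) on the dst line only, flip the first standalone N to Y
    sed -e "s/${src}/${dst}/g" \
        -e "/^${dst}[ ${tab}]/s/\([ ${tab}]\)N\([ ${tab}]\)/\1Y\2/" \
        "$dir/$src" > "$dir/$dst"
}

# Example, matching the fix described above:
#   clone_sdevice /etc/conf/sdevice.d pci dac
```

The function refuses to overwrite an existing file, so it is safe to try on a system where the dac entry may already be present.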
Now to see whether we can use it to recover the system!

Are you going to try 'fsck'ing the RAID5 root file system prior to restoring the 7.1.4 root file system?

---------- Post updated at 03:15 PM ---------- Previous update was at 03:03 PM ----------

Yes indeed. That is what I have done.
Unfortunately /stand cannot be fsck'd and cannot be mounted.
Microlite says it cannot initialise it either.
My thought is to recreate the whole HD - like mkdev hd?
But would value your thoughts on this route and if possible a basic plan I should follow.
..
Microlite did "see" the 2 other data disks as c0b0t1d0s0 and c0b0t2d0s0 but said these were unknown entities as far as it is concerned.

If you look at the raid5 configuration, are there three logical disks or one?

There are three logical disks. It would seem to me that the best way forward would be to try to find the disk layout (partitions/slices and filesystem types) of the 2 data/backup disks, for which we know the raw device names (c0b0t1d0 and c0b0t2d0).
The question is: what is the best way to get this info?
Once we have this and presuming the partitions/filesystems are intact, we could mount them and recover the data on them.
Your thoughts and ideas on this - as always - much appreciated
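
One possible way to gather that layout information, sketched as a read-only shell function. It assumes that prtvtoc and fstyp are available in the recovery environment and that slice 0 of each raw device gives access to the slice table; survey_disk is a hypothetical helper name, and nothing here has been tried on this particular system.

```shell
# Survey one disk: print its slice table, then ask fstyp what each slice holds.
# ASSUMPTIONS: prtvtoc and fstyp exist here; slices appear as <disk>s* nodes.
survey_disk() {
    disk=$1
    dev_dir=${2:-/dev/rdsk}

    # Slice table, read via slice 0 of the raw device.
    prtvtoc "$dev_dir/${disk}s0" 2>/dev/null ||
        echo "prtvtoc: no readable VTOC on ${disk}s0"

    # File system type of every slice node that exists for this disk.
    for dev in "$dev_dir/${disk}"s*; do
        [ -e "$dev" ] || continue      # the glob matched nothing
        if fstype=$(fstyp "$dev" 2>/dev/null) && [ -n "$fstype" ]; then
            echo "$dev $fstype"
        else
            echo "$dev unknown"
        fi
    done
}

# Example (probing only; nothing is written to the disks):
#   survey_disk c0b0t1d0
#   survey_disk c0b0t2d0
```

Knowing the type of each slice first means fsck can be run with the matching -F value, rather than trying the vxfs version against every device.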