Clone Solaris system using cpio backup

F.Mattar · November 29, 2022, 4:37pm

I want to celebrate this and I hate to bring more bad news.
But I have to share the complete status to make sure the mission was accomplished!

Firstly, the application that's installed on this machine failed to launch with the startup script and prompted an error on startup.
The whole thing looks like a real clone of the good machine.

I tried to launch the executable of this application. Unfortunately, I got an error.
I will share it with you just in case you could help from your experience with this UFS filesystem. It's my job to troubleshoot this anyway.

F.Mattar · November 29, 2022, 4:55pm

However, it should boot normally without a boot disk command.
That said, it did boot the next time without the -r option.

hicksd8 · November 29, 2022, 5:17pm

Does the application have a license tied to a specific hostid? Like the hostid of the good machine? Is it a license complaint do you think? Post the error message and we'll see whether it looks familiar to any of us.

Yes, it should boot normally with just the 'boot' command provided the OBP (boot prom) configuration is set correctly to default to disk. If not, it's no big deal to set it later. Meantime just use 'boot disk0'

The '-r' switch on boot tells it to 'reconfigure' and is typically used after new hardware is fitted or some such that needs to be 'discovered'. It should only be required once (unless further hardware changes are made).

hicksd8 · November 29, 2022, 5:18pm

Game on then!!

hicksd8 · November 29, 2022, 5:35pm

And don't forget that, unless you've changed it, the ip address of that box will be the same as the other box. Could the application be 'seeing' the other box and cause the error? That's the problem with exact cloning.

MadeInGermany · November 29, 2022, 8:06pm

At the Ok prompt you can display the openboot variables and their values with
printenv
and change them with
setenv var newvalue

As root in the booted OS you can display the openboot variables and their values with
eeprom
and change them with
eeprom "var=value"

Look for variables like boot-device

F.Mattar · November 29, 2022, 8:06pm

I will answer from the last to first.
Please don't worry about the IP address. The good machine is not with me right now. I would never connect them to the same network anyway.

F.Mattar · November 29, 2022, 8:09pm

Yes, I'm aware of that. I don't have to use -r anymore. It's booting with just "boot disk".

So, as you stated, it seems to be a matter of configuring the OBP to default to disk. Currently, if I type only boot, it starts searching for a network link endlessly.

F.Mattar · November 29, 2022, 8:13pm

For sure the application has some kind of a licence tied to a specific host id.
I tried to make sure the bad machine host id matches the good one.

Anyway, I have gathered some information and some errors. I didn't want to post them before working on them myself and trying to understand the errors.

To begin with, it seems related to the /tmp directory, which I can't find.
I still don't have much experience with the /tmp directory and how important it is for running applications.

F.Mattar · November 29, 2022, 8:14pm

Thanks!
I will do that tomorrow

F.Mattar · November 29, 2022, 8:15pm

This is the result of a TCL file.
It's a startup script if I can say so.

That is what's in /tmp right now.

F.Mattar · November 29, 2022, 8:22pm

Sorry if it's not clear.
This is the result after I manually launch the application executable from /usr/home/kunde directory.

MadeInGermany · November 29, 2022, 8:29pm

Check permission of /tmp

ls -ld /tmp

Owner must be root, and permission must be 1777, shown by ls as drwxrwxrwt

If necessary, change with

chown root /tmp
chmod 1777 /tmp
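
The check above can also be scripted. This is only a minimal sketch: check_sticky_dir is a hypothetical helper name, not a standard command, and it simply reads the mode and owner fields from ls -ld output.

```shell
# Hypothetical helper: report whether a directory has the sticky,
# world-writable mode (drwxrwxrwt) and who owns it.
check_sticky_dir() {
    dir=$1
    # ls -ld prints the mode string in field 1 and the owner in field 3
    set -- $(ls -ld "$dir")
    mode=$1
    owner=$3
    case $mode in
        # trailing * tolerates an ACL/attribute marker after the mode
        drwxrwxrwt*) echo "mode ok, owner $owner" ;;
        *)           echo "mode $mode is wrong, owner $owner" ;;
    esac
}
```

Run it as check_sticky_dir /tmp; the owner it reports should be root.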

hicksd8 · November 29, 2022, 8:34pm

That isn't easy unless you know how. It takes a kernel injection to achieve but we know how to do that if necessary. On Sparc the hostid is usually derived from an algorithm performed on the unique MAC address of the network interface. Spoofing it can be done however.

On Sparc the hostid is usually 8 digits of hex.

hicksd8 · November 30, 2022, 11:00am

If you are still having problems with this clone don't forget that we had the mysterious small slices 4 & 5. We now know that slice 4 is mounted on /tmp but we don't know what slice 5 is for. Perhaps it holds some essential license files for example.

The restore of the system wouldn't have written such files back to slices 4 & 5 because they weren't mounted (not that we knew where to mount them at the time).

If you suspect that this might be a problem please post the output of

# cat /etc/vfstab

and

# mount

That will tell us whether slice 5 is a production slice and where it should be mounted, etc. We could then selectively restore the files to it without having to rebuild the whole system.

Only do this if you suspect that it's causing an issue.
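
As a rough sketch of how to read the answer out of vfstab: the third field of each line is the mount point, so a small filter can look up a slice. vfstab_mountpoint is a hypothetical helper name, and on Solaris you may need nawk or /usr/xpg4/bin/awk instead of plain awk for the -v option.

```shell
# Hypothetical helper: print the mount point that vfstab assigns to a slice.
# $1 is the slice suffix (e.g. s5); the vfstab text is read from stdin.
vfstab_mountpoint() {
    awk -v s="$1" '
        /^#/ { next }                            # skip comment lines
        $1 ~ ("/dsk/.*" s "$") { print $3 }      # field 1: device, field 3: mount point
    '
}
```

Usage would be vfstab_mountpoint s5 < /etc/vfstab.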

F.Mattar · November 30, 2022, 8:34pm

It worked!!!
Application launched.
I even rebooted to check the auto launch with the startup TCL script (which used to prompt the errors I posted earlier), and it launched as well.

F.Mattar · November 30, 2022, 8:41pm

cat /etc/vfstab

mount

/opt is mounted on /dev/dsk/c0t0d0s5

Even though the machine seems to be working right now, I still have to test it on site.
So it would help to get your opinion about this slice 5.

F.Mattar · November 30, 2022, 8:50pm

Boot device is already set to disk!!
I'm still not able to boot automatically without typing boot disk at the ok prompt.

MadeInGermany · November 30, 2022, 9:09pm

Your diag-switch? is true, so it auto-boots from diag-device, which is net.
Change it to false; as root run

eeprom "diag-switch?=false"

(Do not forget the question mark; it belongs to the variable name.)
Then it won't run diagnostic tests, and it will auto-boot from boot-device, which is disk.
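
To illustrate the rule, here is a small sketch that reads the name=value lines that eeprom prints and reports which variable the auto-boot device comes from. effective_boot_device is a hypothetical helper name, and it only models the diag-switch? rule described above, nothing else in the PROM.

```shell
# Hypothetical helper: read "name=value" lines (as printed by eeprom) on
# stdin and report which variable supplies the auto-boot device.
effective_boot_device() {
    awk -F= '
        $1 == "diag-switch?" { diag = $2 }
        $1 == "diag-device"  { ddev = $2 }
        $1 == "boot-device"  { bdev = $2 }
        END {
            if (diag == "true") print "diag-device: " ddev
            else                print "boot-device: " bdev
        }'
}
```

As root, try eeprom | effective_boot_device before and after changing diag-switch?.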

hicksd8 · November 30, 2022, 9:41pm

I think that you said that you are new to Solaris but are you also new to Unix/Linux? If so, I'll try to explain what has happened.

When you restored the root filesystem from the cpio archive it would have created both the /opt and /tmp directories on slice 0 (the root filesystem) and restored the files into those directories from the archive.

Now, when you booted the system normally, slices 4 & 5 were mounted onto /tmp and /opt respectively, and those restored files disappeared, to be replaced by the empty (apart from lost+found) new filesystems on those slices. The restored /tmp and /opt contents stay hidden until those filesystems (slices) are unmounted; then the files reappear. This might be difficult to understand from a non-Unix user's viewpoint but I'm doing my best to explain it.

So if you unmount those filesystems you should be able to list the files that came from the archive and see what they are. In order to mount or unmount, the directory being operated on must not be 'busy'. Therefore you need to 'cd' away from the mountpoint for the operation to work, thus:

# cd /
# umount /tmp
# cd /tmp
# ls

and

# cd /
# umount /opt
# cd /opt
# ls

That will show you what files originally came from the archive.

In terms of a solution, there are two choices.

1. If the root filesystem is not very full (plenty of spare space), you could leave the files where they were restored, as part of the root filesystem. To do that, delete the two lines in /etc/vfstab that mount slices 4 & 5 when the system boots. Don't forget to be professional and make a copy of vfstab before you edit it; then you can simply copy it back if it should go wrong. Use your favorite editor, make the changes and save the file.
2. Alternatively, with /tmp and /opt unmounted as above, you could mount slice 4 on a different mountpoint (e.g. /mnt) and copy the files from /tmp into /mnt to populate the filesystem on slice 4 with the files from the archive. Then unmount it and do the same with slice 5: mount it on /mnt and copy the files from /opt into /mnt. Once both are done, when you reboot the machine and the boot process mounts those slices on /tmp and /opt, you will have the original archive files visible.
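
For the copy onto the slice, something along these lines should do. It is only a sketch: copy_tree is a hypothetical helper name and it uses the classic tar pipe. Note that it copies into the top of the filesystem mounted on /mnt, so that the files land directly under /opt once the slice is mounted there.

```shell
# Hypothetical helper: copy everything under $1 into $2, keeping
# permissions and timestamps, using a tar pipe.
copy_tree() {
    src=$1
    dst=$2
    (cd "$src" && tar cf - .) | (cd "$dst" && tar xpf -)
}

# Intended sequence on the clone, as root (device name as posted earlier
# in this thread; adjust it if yours differs):
#   cd /
#   umount /opt
#   mount /dev/dsk/c0t0d0s5 /mnt
#   copy_tree /opt /mnt
#   umount /mnt
```

The same sequence would apply to slice 4 and /tmp.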

I hope that you understand what I'm trying to get across but, if not, perhaps another member/moderator can explain it differently.