DiskSuite: Breaking mirrors.

BOFH · December 23, 2005, 1:58pm

Ok, so I have a remote system (7 states away) that's using SDS to mirror two 18 gig disks holding /, swap, /var, /home, and /opt.

The mirroring procedure I created uses installboot to ensure there's a bootblk on both disks of an SDS mirror.

The system has a problem booting (can't write to /var/adm/utmp) and there's no bootable CD on site. I have a "hands & eyes" person who's not familiar with Solaris. My intention was to break the mirror, boot to one disk, fsck the second disk and boot to it to recover, then remirror the system once it was back up.

OBP has boot-device=disk net
c0t0d0 and c0t2d0 are the two disks.
c0t2d0 is identified as the left side of the mirror where normally t0 is left and t2 is right.

Remember, I'm not there so all the commands are being entered by the H&E guy.

Enter root's password to go to single-user mode.
fsck all the slices on t0 and t2 except t0's /var slice since it's mounted in ro mode.
mount c0t2d0s0 /mnt
Remove the MDD stuff from /mnt/etc/system
Change the mounts in /mnt/etc/vfstab
eeprom to set boot-device=disk1
umount /mnt (ensures everything's written to disk).
To be sure, installboot bootblk on both /dev/rdsk/c0t0d0s0 and /dev/rdsk/c0t2d0s0

Upon boot, the system said that c0t0d0s0 was not of this fstype and we received the same "can't write to /var/adm/utmp" error.

I did some google searching and didn't find anything specific to this issue. I can boot in single user mode and mount the slice without a problem so it's puzzling.

Because of this, I figured that "disk" might be defined as t2 instead of t0, so we brought it to single user and changed the eeprom setting to boot-device=disk. It generated the exact same error.

Now, aside from the problems (we ultimately left the mirror broken, reinstalled Solaris on one disk and recovered the data from the second disk), does this sound like it should have worked?

One of the results of this for me is to ensure installboot is run on all SDS mirrors and to check the status of boot-device (some systems weren't "disk disk1").
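Since the commands had to be read out over the phone, one way to get the bootblk check right is to generate the exact lines ahead of time. A rough sketch that only prints the commands, assuming root is on slice 0 of both disks and the usual SPARC bootblk path; bootblk_cmds is a made-up helper name.

```shell
#!/bin/sh
# Print (do not run) the installboot command for each half of the root mirror.
# ASSUMPTION: root is on slice 0 of every disk passed in.
# bootblk_cmds is a hypothetical helper, not part of DiskSuite.
bootblk_cmds() {
    for disk in "$@"; do
        # Single quotes keep the backticks literal so `uname -i` is
        # evaluated on the target box, not where the list is generated.
        printf 'installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/%ss0\n' "$disk"
    done
}

bootblk_cmds c0t0d0 c0t2d0
# With no value, eeprom just reports the current boot-device setting.
echo 'eeprom boot-device'
```

The printed lines can be pasted into a checklist and reviewed before anyone types them on the console.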

Carl

Just_Ice · December 23, 2005, 3:14pm

same process worked for me with the older version of disksuite on solaris 2.5.1 but definitely went bonkers with the newer version on solaris 8 ... your process would've worked if you reformatted the metadb slice on 1 of the drives prior to rebooting ... the system enables disksuite on bootup and sees intact metadbs so it tries to configure the filesystems under disksuite control like normal ... removing the metadbs ensures that the system doesn't have disksuite running ... and everything related to disksuite in /etc/system --- from "Begin MDD" to "End MDD" needed to get removed and not just commented out ...
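A minimal sketch of the /etc/system cleanup described above, assuming the DiskSuite entries sit between comment lines containing "Begin MDD" and "End MDD" as the post says. The function name strip_mdd is made up for illustration, and it writes a .new copy rather than editing in place so the result can be reviewed first.

```shell
#!/bin/sh
# Delete every "Begin MDD" ... "End MDD" block from a copy of /etc/system.
# strip_mdd is a hypothetical helper; the marker text is taken from the post.
strip_mdd() {
    sysfile=$1
    # sed range delete: removes the marker lines and everything between them.
    # Output goes to <file>.new; the original is left untouched.
    sed '/Begin MDD/,/End MDD/d' "$sysfile" > "$sysfile.new"
}

# Example (target disk mounted on /mnt):
#   strip_mdd /mnt/etc/system
#   diff /mnt/etc/system /mnt/etc/system.new
```

Only after the diff looks right would the .new file be copied over the original.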

the fsck of the individual filesystems while they were still mirrored, however, did not sound too good --- i think they should have been done after the mirrors were broken and the box rebooted ...

if you didn't know this yet ...
you don't need to go into single user mode to reset the eeprom entries if you don't want to (see "man eeprom") ... and you could also set them from the ok prompt as required (see this )

Perderabo · December 23, 2005, 4:28pm

Well, I must be not focusing on something in your post. OK, /var is screwed up somehow. So you break the mirror, do nothing to repair /var, and boot from one side of the mirror. And then.. 'we received the same "can't write to /var/adm/utmp" error.' Was that not the expected result?

You say 'fsck all the slices on t0 and t2 except t0's /var slice since it's mounted in ro mode.' How did it come to pass that /var was mounted in ro mode? Even if you couldn't get it unmounted for some odd reason, fsck -n should have been possible. Repairing /var seems like the key to recovery. What's wrong with boot into single user mode, square away /var, and reboot?

BOFH · December 26, 2005, 2:31pm

Well, that might explain it since the last time I had to do this was on a 2.5.1 Solaris system.

I hadn't thought that formatting the metadb slice would have mattered. I removed the MDD entries from /etc/system and removed /etc/system all together on further boots just in case.

Yea, that could have been a problem, however we did boot it several times over a few days and fscked the systems a few times so disk suite should have been off the system pretty quickly.

Yep, knew that. Thanks though.

Carl

BOFH · December 26, 2005, 2:59pm

Well, I didn't "do nothing". I attempted to fsck /var but it responded that it couldn't since /var was already mounted as read-only. I don't know why it was mounted read only. I can only assume (which is why I'm asking here) that due to the initial problem, it couldn't remount read-write.

I thought the initial process of mounting disks mounted root and /var read-only, fscked them, then remounted them read-write before mounting the rest of the slices.

Correction would be appreciated of course.

The initial problem was that the system wasn't able to create utmp, possibly because /var was not able to be remounted read-write (again, assuming my comment above is true).

So my steps were to break the mirror so that I had two separate disks, reboot it to just a single, non disk suite controlled disk, fsck the other disk and bring the mirror back.

I cleared /etc/system, changed the md entries in /etc/vfstab back to mounting the disk rather than mounting the metadisks, fixed eeprom so that it booted from the second disk and booted the system.
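The vfstab change can be sketched as a small sed pass over a copy of the file. This is only an illustration: the metadevice-to-slice map below is hypothetical (the thread never says which slice backs which metadevice), and unmirror_vfstab is a made-up name.

```shell
#!/bin/sh
# Point md entries in a vfstab copy back at plain slices on one disk.
# ASSUMPTION: this metadevice:slice map is hypothetical; check the real
# layout (e.g. from saved metastat output) before using anything like it.
MAP="d30:s0 d32:s1 d31:s3 d33:s4 d34:s5"

unmirror_vfstab() {
    disk=$1      # e.g. c0t2d0
    vfstab=$2    # e.g. /mnt/etc/vfstab
    cp "$vfstab" "$vfstab.new" || return 1
    for pair in $MAP; do
        md=`echo "$pair" | cut -d: -f1`
        slice=`echo "$pair" | cut -d: -f2`
        # [^0-9] after the name stops d3 from matching d30, d31, ...
        sed -e "s|/dev/md/dsk/$md\([^0-9]\)|/dev/dsk/$disk$slice\1|" \
            -e "s|/dev/md/rdsk/$md\([^0-9]\)|/dev/rdsk/$disk$slice\1|" \
            "$vfstab.new" > "$vfstab.tmp" && mv "$vfstab.tmp" "$vfstab.new"
    done
}

# Example: unmirror_vfstab c0t2d0 /mnt/etc/vfstab
```

The result lands in vfstab.new, so it can be read back over the phone and compared with the original before it replaces anything.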

Since it should be booting from a clean, non SDS controlled disk, it should have booted successfully. I got puzzled when I received the exact same error.

I don't know. See my theory above. Since I don't have a Sun contract here at work (AIX/Red Hat shop), I can't check some of the deeper knowledge available within Sunsolve that I had available when I was working at a Sun shop.

I don't know how -n would have worked (reply 'n' to all prompts) and it wouldn't have occurred to me to try it. Can you explain further how it might have helped?

Well, yeah. That's what I was trying to do. I thought, perhaps in error, that getting it mounted without disk suite would let me fsck the other disk and then boot to the repaired disk to get the system back up. I could then remirror the disk afterwards.

Hence the questions. Thanks for taking the time though.

Carl

Perderabo · December 26, 2005, 5:35pm

I'm not sure what version of Solaris you're using, but I thought that when I boot into single user mode, only / and /usr were mounted. Even if /var was mounted, I would think that a "umount /var" would take care of that. I'm not sure if stuff is mirrored in single user mode. But I would resist breaking a mirror if I didn't need to. And I don't see how that will help here. Actually, I now think that you simply had a typo in /etc/vfstab. Someone had changed /var to "ro". Later, a reboot caused your problem to occur.

As for "fsck -n", it simply provides information and I try to gather information when I don't understand something. "fsck -n" might result in anything from "all looks cool" to "file system? what file system?". Knowing the state of /var would be a help. If I'm right about the typo in /etc/vfstab, "fsck -n" will not find a problem. That would lead me to stop looking at /var. Add that to the odd read-only status of /var and my next step would be to check vfstab.

I also like to run "fsck -n" to see how bad stuff is before I run a plain "fsck". I have deeply regretted not doing that on several occasions.
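The "look before you repair" habit can be sketched as a tiny wrapper that runs fsck -n and keeps the output for review. The survey name, the log path, and the FSCK override are all assumptions added for illustration; the override exists only so the sketch can be tried without touching a real disk.

```shell
#!/bin/sh
# Run a read-only check first and save what it says before deciding on repairs.
# survey is a hypothetical helper. FSCK may be set to a stand-in command for
# a dry run; by default the real fsck is used.
survey() {
    dev=$1    # raw device, e.g. /dev/rdsk/c0t2d0s3 (hypothetical /var slice)
    log=$2    # where to keep the fsck -n output
    ${FSCK:-fsck} -n "$dev" > "$log" 2>&1
    status=$?
    lines=`wc -l < "$log" | tr -d ' '`
    echo "fsck -n $dev exited $status, $lines lines of output saved in $log"
}

# Example: survey /dev/rdsk/c0t2d0s3 /tmp/fsck-var.log
```

Because -n answers "no" to every prompt, this pass changes nothing on disk and can be interrupted safely; the saved log shows how much a real fsck run would have to fix.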

BOFH · December 26, 2005, 8:29pm

Solaris 8.

Nope, umount /var gave me a "mount point busy" type of message.

Well I can check again, however we're using explorer to get a weekly dump of the system. I checked out the copied system and vfstab just to make sure there wasn't a problem, however I wasn't looking for that in particular so I'll check again when I get in tomorrow.

I just use fsck since it'll ask for each item. Then I can review them as they come up. If you autoanswer 'y' or 'n', you won't be able to evaluate the problems as they come up.

I appreciate the thought on vfstab. I have found an error in another server's vfstab so there's a chance that was it. I'll check the explorer output.

I've spent a lot of time these past several months discovering problems, making repairs and whipping up scripts so they won't happen again so it wouldn't surprise me.

Thanks.

Carl

Perderabo · December 26, 2005, 9:04pm

That's not the end of the road. Use fuser to find out why. lsof would be better, but it probably won't be there in single user mode.

That policy is fine for the first few hundred questions. Then what? How many more questions will there be? 1,000 more? 10,000 more? Do you abort an fsck that might have had one more question to ask? Is it safe to abort fsck and restart it with -y? A few sessions like this and that "fsck -n" starts to look pretty good. "fsck -n" will only take a few seconds if everything is ok. And it can always be safely aborted since it doesn't change anything. But if you don't like "fsck -n", my other trick is jamming the "y" key down with a paper wad.

BOFH · December 26, 2005, 10:46pm

Isn't wtmpx the most likely issue here? If the system needs wtmp and it's holding on so I can't umount, how is knowing that going to help? And it might not be wtmpx, it could be that I should have been able to umount /var and didn't realize that. When I couldn't umount it or remount it rw (which I also tried), I figured it wasn't possible and moved on to breaking mirrors.

Can you really force a umount? Something I'll have to investigate.

But I will keep fuser in mind for next time

You can use fsck -y

Seriously though, if there are 1000 questions, what are you going to do by answering 'n' to all of them except know that there are a bleeding lot of them? Especially when there's some guy on the other side of the phone who has absolutely no idea what's scrolling off the console or even worse, if I'm tipping in and can't interrupt. Do I go away for coffee then and come back in 2 hours?

I figure that after 20 or so 'y's, there might be something even more serious going on and I may have to consider newfs'ing the slice and copying over the apparently good slice, or hoping there's a good backup somewhere.

And another couple of data points that I didn't think important for the initial question:

1) The keyboard wasn't a Sun keyboard. It was, I believe, an MVS type keyboard. Two rows of function keys along the top as near as I could understand. Chris couldn't find a break key and the control key was labeled something else. We had to use Ctrl+[ to generate the necessary escape sequence in vi since there wasn't an escape key either.

2) Also, the console connection was somewhat flaky. Kept throwing odd characters on the screen from time to time and locked the console up so that it had to be power cycled to recover.

Really though. In spite of all the other questions and suggestions, the real problem was, why didn't breaking the mirror succeed? Why couldn't I disassemble the mirror and bring the system back up on a single disk in the manner I described?

Right now, mirroring is one of this company's backup schemes so this is important.

1) Tivoli for the paying customers.
   It's interesting that user crontabs continue to function after a user's password expires on AIX and Red Hat, but stop processing on Solaris. This was a cause of backup failures on our Sun boxes. (This is the responsibility of another department so we don't get whacked.)
2) Mirrors to ensure system availability in the event of a failed disk.
   Unless of course, we're unable to boot to the "good" disk for some reason.
3) flarchive for a jumpstart recovery if the system can't be mirrored.

Currently 2 and 3 are mutually exclusive. If the system isn't mirrored, we use flarchive. Not enough disk for flarchives of all the systems.

So is there a step I missed? Something else I should have tried? Granted, there might have been a simple problem in vfstab. I'll check that and check it on the other systems. I'll even post the vfstab here if it'll help. It sounds like I should have been able to umount /var. If true, I'll try harder to do that when I purposely break one of the lab boxes as a test.

Carl

Perderabo · December 26, 2005, 11:52pm

There are very few files that "the system" needs. Most files are opened by processes. wtmp is not held open by any properly functioning process. You still miss my point about fsck and I will just live with that. But I will point out that interrupting a process via tip is no problem. You just need to use stty to set your interrupt character to something usable.

reborg · December 27, 2005, 8:03am

Not very likely actually as Perderabo pointed out.

Yes

Good idea

Pretty much, you let it finish, then look in lost+found which if you're lucky will contain nothing more than a few log file segments.


Without knowing exactly what you did it's really very difficult to say, I have used the procedure many many times without problems.

There is no reason at all why 2 and 3 above should be exclusive. Solaris supports any combination of the above during installation. Granted, in Solaris 8 you do need to be a little creative with a finish script; in Solaris 9 there are jumpstart keywords which make it really easy. However, all these choices can be calculated using custom probes during installation, something that I have been doing for several years with great success.

BOFH · December 27, 2005, 1:13pm

Sorry. I wasn't clear. It's the policy here based on not having sufficient disk on the jumpstart server to hold all the images for all the servers.

And for completeness, here's the system vfstab from the explorer dump:

$ ls -la vfstab
-rw-------   1 carls    aixteam         453 Oct 29 2004  vfstab
$ more vfstab
#device         device          mount           FS      fsck    mount   mount
#to mount       to fsck         point           type    pass    at boot options
#
#/dev/dsk/c1d0s2 /dev/rdsk/c1d0s2 /usr          ufs     1       yes     -
fd      -       /dev/fd fd      -       no      -
/proc   -       /proc   proc    -       no      -
/dev/md/dsk/d32 -       -       swap    -       no      -
/dev/md/dsk/d30 /dev/md/rdsk/d30        /       ufs     1       no      -
/dev/md/dsk/d31 /dev/md/rdsk/d31        /var    ufs     1       no      -
/dev/md/dsk/d33 /dev/md/rdsk/d33        /home   ufs     2       yes     -
/dev/md/dsk/d34 /dev/md/rdsk/d34        /opt    ufs     2       yes     -
swap    -       /tmp    tmpfs   -       yes     -

Carl

BOFH · December 27, 2005, 1:21pm

But if fsck -n doesn't do anything, there won't be anything in lost+found.

I've used it several times without problems until this one which is why the questions. This is the first time I've had to try and talk someone through it who's not at least moderately familiar with Solaris.

I guess I'm fortunate that I haven't been in this position before.

Carl

BOFH · December 27, 2005, 1:59pm

So the basic concept here is that I should have been able to umount /var. Ok, I'll file that away. I'll be bringing up a lab box so I can test these out and be ready for next time.

I think I understand. You're using it as a troubleshooting step to see if there's a problem and just how bad the problem is. I'm not saying I won't do it in the future.

Well, ctrl+c interrupted the session to the point where we couldn't reestablish it. It seems the break sequence should be something different than the break sequence on the host system based on that response.

The other question would be how to restore the ability to reconnect. Is there a "hang-up" type of command or an open file somewhere?

Thanks for the responses though. I do want to learn how to be better at this. It does sound like I have to circle the concept a few times in order to fully understand it though. Sorry if I seem a little slow at attaining comprehension.

Carl

Just_Ice · December 28, 2005, 11:29am

i doubt that /var was ever mounted ro as that would mean that the system would have complained a lot earlier, as the system does write to /var/adm/messages fairly regularly even if it doesn't need to write to /var/adm/wtmpx ... i'm convinced that the filesystem had some corruption that gave it a problem booting up ... btw, what was the last thing that happened to the server prior to it having the boot problem? was it patched and then rebooted? if patched, was the server in single-user mode or without regular user logins/processes at all during the patching?

BOFH · December 28, 2005, 12:57pm

The read only response was after the system came up in maintenance mode.

Absolutely. I didn't start the troubleshooting on the system. It was handed over to me after the first guy had to go home. It was at the maintenance prompt when I got it. They were unable to Ctrl-Brk to the open boot prom since Chris (the H&E guy) couldn't find the ctrl or brk keys. I picked it up there.

The error I had was along the lines of (it's been two weeks) "can't create utmp".

Attempting to umount /var returned the mount point busy message. Chris hit enter on fsck before I could tell him to type in the specific file system I wanted to check. When it got to /var, it responded that /var was mounted read only.

According to the team lead, an application (a cisco management package) was being upgraded which required a reboot after it was done.

Carl

BOFH · December 28, 2005, 1:02pm

While I haven't seen this on Solaris, I have seen /var remounted as read-only on Linux boxes. Happened to two of our systems in the past couple of months and I have one in that state right now. Apparently it's a symptom of a disk getting ready to go south.

Linux$ sudo ls -la
Password:
sudo: Can't open /var/run/sudo/carls/0: Read-only file system
collect: Cannot write ./dfjBSI0WES028034 (bfcommit, uid=51, gid=51): Read-only file system
queueup: cannot create queue file ./qfjBSI0WES028034, euid=51: Read-only file system

But that's linux.

Carl

reborg · December 28, 2005, 6:22pm

Actually, that's not quite right. On a system where /var (or even root, if /var is on the root disk) is ro, the first complaint will be the message indicated by BOFH; if you have a spare test box, try it and see. This sometimes does happen even after an fsck has completed, and the command

mount -o rw,remount /var

is required to remount it read-write. I think BOFH has already said that he tried this. It might appear that a reboot should fix this, but that is not always the case. I have seen this message many times, but it has never, that I can recall, been the precursor to a disk failure or even left the system unrecoverable.

dangral · December 29, 2005, 4:57pm

What's the problem with running an fuser to see what process is holding onto the /var filesystem?

Just_Ice · December 30, 2005, 1:31am

i didn't say anything to the contrary ... all i said was it is highly unlikely that the /var filesystem was ever intentionally mounted ro --- as in somebody edited /etc/vfstab then rebooted the box ...

i agree that the ro filesystem error in itself does not necessarily mean that the drive is going to undergo disk failure or that the system is unrecoverable --- however --- experience tells me that a persistent ro filesystem error even after several successful fsck runs may be a symptom of a corrupted filesystem ... and since disk errors can also contribute to filesystem errors, i would not disqualify a failing disk from the picture that quickly when i start troubleshooting ...

i just googled this and found it interesting how a cisco software application that just sits on the server gets to corrupt the filesystem ...