Unable to extend file system

Hi,

I have inherited this AIX 5.3 host. I'm unable to increase the /usr file system. It gives me this error.

What needs to be done to remove this error?

vios:/home/padmin$ chfs -a size=+128M /usr
0516-304 lquerypv: Unable to find device id 0002ef4df616f9690000000000000000 in the Device
        Configuration Database.
0516-788 extendlv: Unable to extend logical volume.
vios:/home/padmin$
vios:/home/padmin$
vios:/home/padmin$ lsvg -l rootvg
rootvg:
LV NAME             TYPE       LPs   PPs   PVs  LV STATE      MOUNT POINT
hd5                 boot       1     2     2    closed/syncd  N/A
hd6                 paging     4     8     2    open/syncd    N/A
paging00            paging     8     16    2    open/syncd    N/A
hd8                 jfs2log    1     2     2    open/syncd    N/A
hd4                 jfs2       2     4     2    open/syncd    /
hd2                 jfs2       13    26    2    open/syncd    /usr
hd9var              jfs2       5     10    2    open/syncd    /var
hd3                 jfs2       9     18    2    open/syncd    /tmp
hd1                 jfs2       31    62    2    open/syncd    /home
hd10opt             jfs2       2     4     2    open/syncd    /opt
lg_dumplv           jfs        8     16    2    closed/stale  N/A
fwdump              jfs2       1     2     2    open/syncd    /var/adm/ras/platform
dumplv              sysdump    7     7     1    open/syncd    N/A
vios:/home/padmin$ lspv
hdisk0          0002ef4d0206744e                    rootvg          active
hdisk2          0002ef4df6770012                    clientpool      active
hdisk4          0002ef4d25f0f1d8                    nfsvg           active
hdisk3          0002ef4d02a2110a                    rootvg          active
vios:/home/padmin$
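
One way to read the 0516-304 message above: the 32-digit device id looks like a 16-digit PVID padded with zeros, so it can be compared with the PVID column of lspv. The following is only a sketch in POSIX shell. It works on text pasted from this post, not on a live system, and the helper name pvid_from_error is made up for illustration.

```sh
#!/bin/sh
# Sketch only: operates on text copied from this post, not on a live system.
# pvid_from_error is a hypothetical helper name.

# Pull the first 16 hex digits (assumed to be the PVID) out of a 0516-304 message.
pvid_from_error() {
    sed -n 's/.*device id \([0-9a-f]\{16\}\).*/\1/p' | head -n 1
}

err='0516-304 lquerypv: Unable to find device id 0002ef4df616f9690000000000000000 in the Device'

# lspv rows as printed in the post above.
lspv_out='hdisk0          0002ef4d0206744e                    rootvg          active
hdisk2          0002ef4df6770012                    clientpool      active
hdisk4          0002ef4d25f0f1d8                    nfsvg           active
hdisk3          0002ef4d02a2110a                    rootvg          active'

pvid=$(printf '%s\n' "$err" | pvid_from_error)

# Look for the PVID in the second column of the lspv rows.
if printf '%s\n' "$lspv_out" | awk -v id="$pvid" '$2 == id { found = 1 } END { exit !found }'; then
    echo "$pvid belongs to a configured disk"
else
    echo "$pvid is not listed by lspv"
fi
```

If the PVID is not listed, the volume group is probably still referencing a disk that the system no longer knows about.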

Hi,

Not that up to speed with the VIOS server stuff, but padmin is a restricted shell - shouldn't you be root?

Gull04

You do appear to be on the VIO server. Try looking at the output of help to find what commands they have provided. You can become the root user, but I presume that it is very easy to do bad things to multiple OSs that you would use for your business applications, hence why IBM shield you from it on a VIO server.

When you say you have AIX 5.3, I presume you mean that the business applications are running on that.

If this really is your business application and we're just getting confused by the prompt, can you show us the output from lslv -l hd2 and lslv -m hd2?

Kind regards,
Robin

vios:/home/padmin$ lslv -l hd2
hd2:/usr
PV                COPIES        IN BAND       DISTRIBUTION
hdisk0            013:000:000   100%          000:000:013:000:000
hdisk3            013:000:000   100%          000:000:013:000:000
vios:/home/padmin$
vios:/home/padmin$
vios:/home/padmin$ lslv -m hd2
hd2:/usr
LP    PP1  PV1               PP2  PV2               PP3  PV3
0001  0260 hdisk0            0231 hdisk3
0002  0261 hdisk0            0232 hdisk3
0003  0262 hdisk0            0233 hdisk3
0004  0263 hdisk0            0234 hdisk3
0005  0264 hdisk0            0235 hdisk3
0006  0265 hdisk0            0236 hdisk3
0007  0266 hdisk0            0237 hdisk3
0008  0267 hdisk0            0238 hdisk3
0009  0268 hdisk0            0239 hdisk3
0010  0269 hdisk0            0240 hdisk3
0011  0270 hdisk0            0241 hdisk3
0012  0271 hdisk0            0242 hdisk3
0013  0272 hdisk0            0243 hdisk3
vios:/home/padmin$

Do you get anything from these:-

help                   # VIOS shell would list the available commands
uname -a               # Checking the OS version and other labels to see if we can spot what this really is
ps -f                  # Just to see what shell you are really running

Can you easily become root on this server? How, or why not?

Does this article help at all? VIOS COMMAND LINE: UNDER THE COVERS (AIX Down Under)

I'm still concerned that your business application might be running on the VIOS. It kind of can, but it's really not designed for it. Do you not have virtual guests for your business application? Are you sure you have the right /usr that needs to be extended? On a VIOS, ordinarily there would never be a need to extend it.

If your business application really is running on the VIOS, then an upgrade to the VIOS might destroy it. The VIOS only cares about making sure the VIOS is working properly and sharing devices to the guests where the real business applications should run.

Robin

vios:/home/padmin$ help

If available, you can refer to the Base Document Library
for general assistance.

Some basic Commands are:

    man -k keyword      - lists commands relevant to a keyword
    man command         - prints out the manual pages for a command
    cat                 - concatenates files (and just prints them out)
    vi                  - text editor
    ls                  - lists contents of directory
    mail                - sends and receives mail
    passwd              - changes login password
    sccshelp            - views information on the Source Code Control System
    smit                - system management interface tool
    tset                - sets terminal modes
    who                 - who is on the system
    write               - writes to another user

To find programs about mail, use the command:           man -k mail
and print out the man command documentation via:        man mail
You can log out of the system by typing:                exit
vios:/home/padmin$ lsvg -p rootvg
0516-304 : Unable to find device id 0002ef4df616f969 in the Device
        Configuration Database.
rootvg:
PV ID             PV STATE          TOTAL PPs   FREE PPs    FREE DISTRIBUTION
0002ef4df616f969  removed           546         538         110..101..109..109..109
hdisk0            active            546         454         109..89..38..109..109
hdisk3            active            546         469         109..104..38..109..109
vios:/home/padmin$

I mentioned AIX 5.3 because that is what the oslevel command reports on this host.

I see that lsvg -p rootvg also lists the device 0002ef4df616f969 as removed.

To become root, I used the command oem_setup_env

vios:/home/padmin$ whoami
root
vios:/home/padmin$

So, yes, this is the VIO server, not where the business applications run, so that is good to know. Can I presume that there is a pair of VIO servers in play here so that the disk for the guests can have two paths to each disk, or at least each guest would have two allocations of disk that it can mirror?

Can you tell us more about this server configuration? I'm a little concerned that you might have an exposure to hardware failure.

To clean up the removed disk, see if this helps: IBM, "Resolving 'missing' or 'removed' disks in AIX LVM". The page took a long time to load for me.

Also, have a read of this for some good background on VIO servers - The VIO cheat sheet

I hope that these help.

I would, however, worry that you might be trying to grow the wrong filesystem. What makes you think that the VIO server needs to have /usr increased? Surely this is more likely for the guest that you have your business application running in. Is there something filling /usr and if so, which one (VIOS or the business application LPAR)? If it's the business application guest, then leave /usr for the VIO server alone. It's not really yours.

Kind regards,
Robin

Sorry for being late, had a few busy days.

Yes, this is a VIOS and you should be root to make changes to a filesystem.

The problem you are facing is - as rbatte1 has already stated - that one of the (physical?) disks the rootvg consists of seems to be missing.

The LVM in short: VGs can consist of one or several "PV"s (physical volumes - disks) and each PV is cut up into "PP"s (physical partitions) of the same size (usually somewhere between 2MB and 2GB). Every logical volume (LV, basically raw disk space which can be formatted into a FS but also be used as paging space or whatever) consists of one or more such PPs mapped to "LP"s (logical partitions). Mirroring, like what you are using, is done by assigning 2 PPs to every LP.
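
To make the LP/PP relationship concrete: dividing the PPs column of lsvg -l by the LPs column gives the number of copies each LV has. The sketch below does that for three rows copied from the first post. It is an illustration in POSIX shell, and copies_per_lv is a made-up name.

```sh
#!/bin/sh
# copies_per_lv (hypothetical name): reads "lsvg -l" rows on stdin and
# prints each LV name with PPs / LPs, i.e. the number of copies.
copies_per_lv() {
    # Skip the header and the "rootvg:" line: only rows with a numeric,
    # non-zero LPs column (field 3) are processed.
    awk 'NF >= 6 && $3 ~ /^[0-9]+$/ && $3 > 0 { printf "%s %d\n", $1, $4 / $3 }'
}

copies_per_lv <<'EOF'
LV NAME             TYPE       LPs   PPs   PVs  LV STATE      MOUNT POINT
hd2                 jfs2       13    26    2    open/syncd    /usr
lg_dumplv           jfs        8     16    2    closed/stale  N/A
dumplv              sysdump    7     7     1    open/syncd    N/A
EOF
```

For these rows it reports 2 copies for hd2 and lg_dumplv and 1 copy for dumplv, so dumplv is the only one of the three that is not mirrored.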

The first thing you need to do is to get rid of the missing disk. If it is possible the best way to do so is to reconnect the removed disk, then remove it from the VG (the command is "reducevg", once it is empty - which is not the case right now). Only then remove it physically again. I suppose this is not possible and the physical disk is long gone. Therefore:

You first need to determine what was on the 8 used PPs. If you are lucky it was only the copy of the dump device, which is shown as stale in the listing you provided. Try the rmlvcopy command to remove that copy (see the man page for details). If this works and the missing disk is empty, continue by removing the missing disk from the VG.
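
The figure of 8 used PPs comes from the lsvg -p rootvg listing: TOTAL PPs minus FREE PPs for the removed disk. A small sketch of that arithmetic in POSIX shell, fed with the rows from that listing (used_pps is a made-up name):

```sh
#!/bin/sh
# used_pps (hypothetical name): reads "lsvg -p" rows on stdin and prints
# PV, state and used PPs (TOTAL PPs minus FREE PPs).
used_pps() {
    # The header line is skipped because fields 3 and 4 are not numeric there.
    awk '$3 ~ /^[0-9]+$/ && $4 ~ /^[0-9]+$/ { printf "%s %s %d\n", $1, $2, $3 - $4 }'
}

used_pps <<'EOF'
PV ID             PV STATE          TOTAL PPs   FREE PPs    FREE DISTRIBUTION
0002ef4df616f969  removed           546         538         110..101..109..109..109
hdisk0            active            546         454         109..89..38..109..109
hdisk3            active            546         469         109..104..38..109..109
EOF
```

For the removed disk this gives 546 - 538 = 8, so only a handful of partitions have to be accounted for before the disk can be dropped from the VG.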

If this doesn't work: report back. It is possible to get the LV- and VG-information back to order, but only by (quite complicated) low-level methods. If it works with the high-level methods then these are preferable, so try these first.

I hope this helps.

bakunin

I found the LV that uses this disk, but when I try to remove the copy, I get the following error:

vios:/home/padmin$ lslv -l lg_dumplv
0516-304 : Unable to find device id 0002ef4df616f969 in the Device
        Configuration Database.
lg_dumplv:N/A
PVID              COPIES        IN BAND       DISTRIBUTION
0002ef4df616f969  008:000:000   100%          000:008:000:000:000
hdisk0            008:000:000   100%          000:008:000:000:000
vios:/home/padmin$

vios:/home/padmin$ rmlvcopy lg_dumplv 1 0002ef4df616f969
0516-076 lreducelv: Cannot remove last good copy of stale partition.
        Resynchronize the partitions with syncvg and try again.
0516-922 rmlvcopy: Unable to remove logical partition copies from
        logical volume lg_dumplv.
vios:/home/padmin$

vios:/home/padmin$ lslv -m lg_dumplv
0516-304 : Unable to find device id 0002ef4df616f969 in the Device
        Configuration Database.
lg_dumplv:N/A
LP    PP1  PVID1             PP2  PVID2             PP3  PVID3
0001  0116 0002ef4df616f969  0115 hdisk0
0516-304 : Unable to find device id 0002ef4df616f969 in the Device
        Configuration Database.
0002  0117 0002ef4df616f969  0116 hdisk0
0516-304 : Unable to find device id 0002ef4df616f969 in the Device
        Configuration Database.
0003  0118 0002ef4df616f969  0117 hdisk0
0516-304 : Unable to find device id 0002ef4df616f969 in the Device
        Configuration Database.
0004  0119 0002ef4df616f969  0118 hdisk0
0516-304 : Unable to find device id 0002ef4df616f969 in the Device
        Configuration Database.
0005  0120 0002ef4df616f969  0119 hdisk0
0516-304 : Unable to find device id 0002ef4df616f969 in the Device
        Configuration Database.
0006  0121 0002ef4df616f969  0120 hdisk0
0516-304 : Unable to find device id 0002ef4df616f969 in the Device
        Configuration Database.
0007  0122 0002ef4df616f969  0121 hdisk0
0516-304 : Unable to find device id 0002ef4df616f969 in the Device
        Configuration Database.
0008  0123 0002ef4df616f969  0122 hdisk0
vios:/home/padmin$

I hate to say it, but you are in deep kimchi. If you have any chance of reinstalling the VIOS - the virtual I/O server should hold no data itself anyway - this would probably be the best and cleanest recommendation.

As you get "5.3" from the oslevel command, you are on a very old (and almost certainly no longer supported) VIOS version. The current versions are based on AIX 6.1. You can see the installed version either on the command line (the command is ioslevel) or in the LPAR listing of the HMC GUI. Running such an old version is a danger in itself and you should update ASAP, but I have a feeling you are going to tell me that this is not possible, right?

So you are bound to try your luck repairing the rootvg. The following is described in more detail on MichaelFelt's page at rootvg.net:

First you need the PVID and the VGID of the missing disk:

# lqueryvg -p <disk> -vPt

You need the displayed information later to delete the disk from the VG. Next remove the dump device which has allocated some PPs on the missing disk. Create a new dump device on the available disks if necessary.

Next you can remove the disk itself:

ldeletepv -g <VGID> -p <PVID>

NOTE that even if this is "best practice", it is a surgical operation and a certain risk is involved. A production system, let alone a VIOS, is not the place to try anything even remotely risky! Do yourself a favour and have current backups (preferably mksysb images on a NIM server) ready before you begin!
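
Because the order of the steps matters and the two IDs are easy to mix up, it may help to print the plan before typing anything. The following is a dry-run sketch in POSIX shell: it only echoes commands and never runs them. print_cleanup_plan and is_hex_of_len are made-up helper names, the ldeletepv call uses -g for the VGid as in the run that succeeded later in this thread, and the length checks assume the 32-digit VGid and 16-digit PVID formats seen in this thread.

```sh
#!/bin/sh
# Dry run only: prints the commands, does not execute them.
# print_cleanup_plan and is_hex_of_len are hypothetical helpers; the
# command names and flags are the ones that appear in this thread.

# Succeeds if $1 is a lowercase hex string of exactly $2 characters.
is_hex_of_len() {
    [ "${#1}" -eq "$2" ] || return 1
    case $1 in
        *[!0-9a-f]*) return 1 ;;
    esac
    return 0
}

print_cleanup_plan() {
    disk=$1    # a surviving disk of the VG, e.g. hdisk0
    lv=$2      # LV that still has PPs on the missing disk
    vgid=$3    # VGid as printed by lqueryvg (assumed 32 hex digits)
    pvid=$4    # PVID of the missing disk (assumed 16 hex digits)

    is_hex_of_len "$vgid" 32 || { echo "VGid looks wrong: $vgid" >&2; return 1; }
    is_hex_of_len "$pvid" 16 || { echo "PVID looks wrong: $pvid" >&2; return 1; }

    echo "lqueryvg -p $disk -vPt"               # note VGid and PVIDs
    echo "rmlv $lv"                             # free the PPs on the missing disk
    echo "ldeletepv -g $vgid -p $pvid"          # drop the missing disk from the VG
    echo "lqueryvg -p $disk -vPt"               # confirm the PVID is gone
}

print_cleanup_plan hdisk0 lg_dumplv \
    0002ef4d0000d60000000113f5ca39d1 0002ef4df616f969
```

Swapping the two IDs by mistake makes the function refuse to print anything, which is the only safety it adds. The backups mentioned above are still the real safety net.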

If you have the hardware to do it: the less risky way to correct this is to create a mksysb image and edit the /image.data file before creating it (see the command mkszfile for details). Then you restore this mksysb to some other hardware, take another mksysb and restore that to the original hardware, replacing the original. The advantage of this is that you can test, change and retest as long as you want before taking a final decision.

I hope this helps.

bakunin

0002ef4df616f969 is referenced only by lg_dumplv; no other LV uses it.

Can't I just remove it from lg_dumplv, or even destroy lg_dumplv and then remove 0002ef4df616f969?

What risk do I face if this disk 0002ef4df616f969 is removed from lg_dumplv?

sysdumpdev -l shows that it is not using lg_dumplv

Yes. This is what I actually suggested.

bakunin

Actually, we should have asked that you run the command:
ioslevel - to get the version of VIOS you are running.

As you say, AIX 5.3 - that would indicate VIOS 1.X something, as all VIOS 2.X are built on AIX 6.1.

Anyway, if your problem is/was dealing with a missing disk: once you are at the root prompt, this article (which I wrote nearly 10 years ago) might give you some additional info. http://www.rootvg.net/content/view/174/309/

FYI: as you are talking about lg_dumplv I believe this (the VIOS) is an AIX 6.1 system - from memory the lg_dumplv did not exist, by default, on AIX 5.3. And the normal command, i.e., "(un)mirrorvg", sadly, seems to forget lg_dumplv.

I managed to remove the disk from the VG

vios:/home/padmin$ lsvg -p rootvg
0516-304 : Unable to find device id 0002ef4df616f969 in the Device
        Configuration Database.
rootvg:
PV ID             PV STATE          TOTAL PPs   FREE PPs    FREE DISTRIBUTION
0002ef4df616f969  removed           546         538         110..101..109..109..109
hdisk0            active            546         454         109..89..38..109..109
hdisk3            active            546         469         109..104..38..109..109
vios:/home/padmin$
vios:/home/padmin$
vios:/home/padmin$ lqueryvg -p hdisk0 -vPt
Physical:       0002ef4df616f969                0   4
                0002ef4d0206744e                2   0
                0002ef4d02a2110a                1   0
VGid:           0002ef4d0000d60000000113f5ca39d1
vios:/home/padmin$
vios:/home/padmin$ lspv -l -v 0002ef4d0000d60000000113f5ca39d1 0002ef4df616f969
pvid=0002ef4df616f969:
LV NAME               LPs   PPs   DISTRIBUTION          MOUNT POINT
lg_dumplv             8     8     00..08..00..00..00    N/A
vios:/home/padmin$
vios:/home/padmin$ rmlv lg_dumplv
Warning, all data contained on logical volume lg_dumplv will be destroyed.
rmlv: Do you wish to continue? y(es) n(o)? y
rmlv: Logical volume lg_dumplv is removed.
vios:/home/padmin$
vios:/home/padmin$ lsvg -l rootvg
rootvg:
LV NAME             TYPE       LPs   PPs   PVs  LV STATE      MOUNT POINT
hd5                 boot       1     2     2    closed/syncd  N/A
hd6                 paging     4     8     2    open/syncd    N/A
paging00            paging     8     16    2    open/syncd    N/A
hd8                 jfs2log    1     2     2    open/syncd    N/A
hd4                 jfs2       2     4     2    open/syncd    /
hd2                 jfs2       13    26    2    open/syncd    /usr
hd9var              jfs2       5     10    2    open/syncd    /var
hd3                 jfs2       9     18    2    open/syncd    /tmp
hd1                 jfs2       31    62    2    open/syncd    /home
hd10opt             jfs2       2     4     2    open/syncd    /opt
fwdump              jfs2       1     2     2    open/syncd    /var/adm/ras/platform
dumplv              sysdump    7     7     1    open/syncd    N/A
vios:/home/padmin$
vios:/home/padmin$ ldeletepv -g 0002ef4d0000d60000000113f5ca39d1 -p 0002ef4df616f969
vios:/home/padmin$
vios:/home/padmin$ lqueryvg -p hdisk0 -vPt
Physical:       0002ef4d0206744e                2   0
                0002ef4d02a2110a                1   0
VGid:           0002ef4d0000d60000000113f5ca39d1
vios:/home/padmin$

Thank you all

You may want to consider using oem_setup_env to become root again, and to bring things back to what is expected (i.e., your sysdump partition back to the same size and name as before). And also mirror it.

# mklvcopy dumplv 2
# extendlv dumplv 1
# chlv -n lg_dumplv dumplv

Although there may be a reason, which I long ago forgot, for not mirroring lg_dumplv. Bakunin probably knows for certain.