Cannot extend logical volume

Dear All,

We have an Oracle database running on AIX 6.1. We now have a disk space issue, and we found that we cannot extend the logical volume. Could you help review this and make some suggestions?

Status:
We have already provisioned some new disks to the host; the new disks can be scanned and added to the volume group. The issue only happens when extending the logical volume.

We tried chlv -x 72174 oravol (the LV currently uses 63984 PPs, and the two new disks add 8190 PPs).

Is this related to the scheduling policy being "striped" rather than "parallel"?

Thanks.

# lslv oravol
LOGICAL VOLUME:     oravol               VOLUME GROUP:   datavg
LV IDENTIFIER:      0006afbb0000d40000000141a0c9e1da.1 PERMISSION:     read/write
VG STATE:           active/complete        LV STATE:       opened/syncd
TYPE:               jfs2                   WRITE VERIFY:   off
MAX LPs:            68847                  PP SIZE:        128 megabyte(s)
COPIES:             1                      SCHED POLICY:   striped
LPs:                63984                  PPs:            63984
STALE PPs:          0                      BB POLICY:      relocatable
INTER-POLICY:       maximum                RELOCATABLE:    no
INTRA-POLICY:       middle                 UPPER BOUND:    16
MOUNT POINT:        /oradata         LABEL:          /oradata
MIRROR WRITE CONSISTENCY: on/ACTIVE
EACH LP COPY ON A SEPARATE PV ?: yes (superstrict)
Serialize IO ?:     NO
STRIPE WIDTH:       16
STRIPE SIZE:        64k

The new disks are hdiskpower33 and hdiskpower34.

# lspv |grep datavg
hdiskpower5     0006afbb58e1bb54                    datavg          concurrent
hdiskpower6     0006afbb58ddca06                    datavg          concurrent
hdiskpower7     0006afbb58e8aa28                    datavg          concurrent
hdiskpower8     0006afbb58e44ce2                    datavg          concurrent
hdiskpower10    0006afbb58dfe7b1                    datavg          concurrent
hdiskpower11    0006afbb58eae332                    datavg          concurrent
hdiskpower12    0006afbb58e67454                    datavg          concurrent
hdiskpower13    0006afbb58e1fee0                    datavg          concurrent
hdiskpower17    0006afbb58e7931e                    datavg          concurrent
hdiskpower20    0006afbb58e343af                    datavg          concurrent
hdiskpower23    0006afbb58dedc7c                    datavg          concurrent
hdiskpower26    0006afbb58e9af27                    datavg          concurrent
hdiskpower29    0006afbb58e55f6d                    datavg          concurrent
hdiskpower30    0006afbb58e11a03                    datavg          concurrent
hdiskpower31    0006afbb58ebd99c                    datavg          concurrent
hdiskpower32    0006afbb58e772e6                    datavg          concurrent
hdiskpower33    0006afbb9133c1e3                    datavg          concurrent
hdiskpower34    0006afbb9139d057                    datavg          concurrent

The new disks have been added to the volume group; the status is as below.

# lsvg datavg
VOLUME GROUP:       datavg                   VG IDENTIFIER:  0006afbb0000d40000000141a0c9e1da
VG STATE:           active                   PP SIZE:        128 megabyte(s)
VG PERMISSION:      read/write               TOTAL PPs:      72942 (9336576 megabytes)
MAX LVs:            512                      FREE PPs:       8958 (1146624 megabytes)
LVs:                1                        USED PPs:       63984 (8189952 megabytes)
OPEN LVs:           1                        QUORUM:         10 (Enabled)
TOTAL PVs:          18                       VG DESCRIPTORS: 18
STALE PVs:          0                        STALE PPs:      0
ACTIVE PVs:         18                       AUTO ON:        no
Concurrent:         Enhanced-Capable         Auto-Concurrent: Disabled
VG Mode:            Concurrent
Node ID:            1                        Active Nodes:
MAX PPs per VG:     127000
MAX PPs per PV:     5080                     MAX PVs:        25
LTG size (Dynamic): 1024 kilobyte(s)         AUTO SYNC:      no
HOT SPARE:          no                       BB POLICY:      relocatable
PV RESTRICTION:     none

PP status

# lsvg -p datavg
datavg:
PV_NAME           PV STATE          TOTAL PPs   FREE PPs    FREE DISTRIBUTION
hdiskpower32      active            4095        96          00..00..00..00..96
hdiskpower13      active            4095        96          00..00..00..00..96
hdiskpower31      active            4095        96          00..00..00..00..96
hdiskpower12      active            4095        96          00..00..00..00..96
hdiskpower30      active            4095        96          00..00..00..00..96
hdiskpower11      active            4095        96          00..00..00..00..96
hdiskpower29      active            4095        96          00..00..00..00..96
hdiskpower10      active            4095        96          00..00..00..00..96
hdiskpower26      active            3999        0           00..00..00..00..00
hdiskpower8       active            3999        0           00..00..00..00..00
hdiskpower23      active            3999        0           00..00..00..00..00
hdiskpower7       active            3999        0           00..00..00..00..00
hdiskpower20      active            3999        0           00..00..00..00..00
hdiskpower6       active            3999        0           00..00..00..00..00
hdiskpower17      active            3999        0           00..00..00..00..00
hdiskpower5       active            3999        0           00..00..00..00..00
hdiskpower33      active            4095        4095        819..819..819..819..819
hdiskpower34      active            4095        4095        819..819..819..819..819

Hi danny,
try:

extendlv oravol number_of_PPs

Dear Bartolomeus,

Thanks for your support.

We tried this before; the result is below. The error points to hdiskpower6, which, as we checked, indeed has no free space left, but we want the extension to go onto the new disks. Can we force the extension onto the new disks? We get the same error message from both smitty and the command line.

# extendlv oravol 8176
0516-1034 lquerypv: Not enough physical partitions in physical volume hdiskpower6.
0516-788 extendlv: Unable to extend logical volume.

Yes!
You need enough space on all your 16 devices. lslv -l oravol will show you the affected devices.

Regards

PS
In your case the minimum extension is 16 PPs, i.e. one PP per stripe-width member.

man extendlv
...
       Note:
       1    When extending a striped logical volume, the number of partitions must be in an even multiple of the striping width.
...
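So with your stripe width of 16, the number of PPs you request has to be a multiple of 16 (the values here are just for illustration):

# extendlv oravol 16     # 16 is a multiple of the stripe width, so it passes this check
# extendlv oravol 24     # would be rejected: 24 is not a multiple of 16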

Are these local disks or hardware protected in some way? (SAN provided, RAID device etc.)

The reason I ask is that you have a single copy of each PP. On real hardware you might lose the LV if any disk fails. If it is hardware protected, then you might be causing yourself an IO overhead by striping. I know it sounds counter-intuitive, but I've seen issues where spreading IO according to how the OS sees it can cause contention on the real disks when a SAN also spreads the IO. Bizarrely we improved IO when we tried to create hot-spots as the OS saw it because the SAN then really did spread the heavy IO properly.

Can you explain a little more about what hardware you have in play?

Thanks,
Robin

We don't have the output of

lsvg -l datavg

either, to understand how it is organised (mirrored between two bays? etc.). I may also be wrong, as nowadays I only use mirror pools.

It looks like you are mirrored, and yes, you added new disks, but is there one in each mirror copy, etc.?

If you are striped with a strict policy you are stuck... you will have to add as many disks as the striping policy requires. One way to see whether that is the case, though, would be to run

 reorgvg datavg 

If the striping is not strict, it will move blocks to unused disks and free up the ones that are completely full. Beware: if this has never been done before, running that command can take quite some time (hours...).
If that worked, you can try

 chfs -a size=+2000M /oradata 

and see whether it works... if so, you are a happy guy.
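To check whether the reorgvg actually moved anything off the full disks before trying the chfs, the same commands used earlier in this thread can be rerun:

# lsvg -p datavg      # the previously full PVs should now show FREE PPs
# lslv -l oravol      # shows which PVs the LV now occupies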

I don't think there are any mirrors in play for this LV: in the lslv oravol output the LPs match the PPs, and COPIES is 1.

This could well be SAN-provided, especially if Oracle RAC is in play, as suggested by these lines from the lsvg output

:
Concurrent:         Enhanced-Capable         Auto-Concurrent: Disabled
VG Mode:            Concurrent
:

I'm not sure whether reorgvg will work on an Enhanced-Capable concurrent VG or not. It might well be worth stopping all but one node of the cluster first (assuming it is clustered).

Robin

  1. chlv -u 18
  2. chlv -x 75000
  3. extendlv
  4. chfs

The problem is that you have upper bound = 16 in your LV configuration, which means the LV can span at most 16 physical volumes. It seems that the disks used by the LV are already full, so either you free up some space on those physical volumes by moving other logical volumes, or you change the LV configuration and allow it to span all 18 physical volumes in your volume group.
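With the names from this thread, that would look roughly like the sketch below (the numbers are only an illustration, and the extension still has to be a multiple of the stripe width):

# chlv -u 18 oravol             # allow the LV to span all 18 PVs
# chlv -x 75000 oravol          # raise the maximum number of LPs above the target size
# extendlv oravol 8176          # add PPs in a multiple of the stripe width (16)
# chfs -a size=+2000M /oradata  # finally grow the filesystem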


I suggest getting rid of the striping altogether. Striping is a good idea if you have physical disks and want to spread the load over all of them so that the overall response time of the (disk sub-)system gets better. It makes absolutely no sense with SAN disks (from the names of the hdisk devices I suppose you have EMC storage).

To tell you the bad news up front: you will need downtime to do this, because it means deleting and recreating the LV. Still, it is a good idea, because further administration will be much easier once you have done it.
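Very roughly, such a rebuild could look like the sketch below. This is only an outline: the data has to be backed up and restored around it, the flags and sizes must be adapted, and with a concurrent (RAC) volume group the cluster side has to be handled as well.

# umount /oradata
# rmfs -r /oradata                                 # removes the filesystem and its LV
# mklv -y oravol -t jfs2 -e x -u 18 datavg 63984   # recreate unstriped, spread across all PVs
# crfs -v jfs2 -d oravol -m /oradata               # put a new jfs2 filesystem on it
# mount /oradata                                   # then restore the data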

I hope this helps.

bakunin

Indeed as a rule the combination of lvm striping with high end SAN storage (double or triple striping) is to be avoided, both for reasons of simplicity and performance.

Remarkably perhaps, I have come across situations where LVM striping actually did make a serious performance difference (an improvement) with high-end SAN storage and sequential read IO, but only with a narrow width (say 4-8) and small stripe sizes (128 KiB-256 KiB).

This was because there was a front-end bottleneck and at the same time it was difficult to increase IO queue sizes, which is extra important because of the serial nature of Fibre Channel SANs (the bottleneck could even be observed when the cache-hit ratio was at 100% at the front-end storage level).

By using narrow striping it was possible to increase the effective IO queue size while at the same time not confusing the prefetch algorithms of the SAN storage. The LUNs in the narrow LVM stripe had to come from different physical disk sets at the backend SAN storage level (in the case of a storage array architecture where this makes a difference). At the SAN storage, this translated into a nice, even spread of backend usage, without hot spots.
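Purely as an illustration of the shape of such a configuration (all names, sizes and PP counts here are made up, not a recommendation for this particular system):

# mklv -y narrowlv -t jfs2 -S 256K somevg 128 hdisk1 hdisk2 hdisk3 hdisk4

That would be a 4-way stripe with a 256 KiB stripe size, where the four LUNs would need to come from different backend disk sets.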

Situations where this mattered were databases with quite a bit of sequential read IO. This happened with Oracle databases that were never fully optimized because the standard query specifications kept changing, which in my experience occurs often. Another situation is when, out of necessity, reports or other batch jobs need to run during on-line usage.

Conversely, I have come across a situation where a large stripe size (4 MiB) was used with a large stripe width (16), and that really confused the storage, thwarting the prefetch algorithms; all IO was done with small sizes, bringing sequential read IO to a crawl while the storage processors were working overtime.

So as usual in performance tuning: "it depends...." :wink:

Thanks for your help.

The output of lslv -l oravol:

PV                COPIES        IN BAND       DISTRIBUTION
hdiskpower32      3999:000:000  20%           819:819:819:819:723
hdiskpower13      3999:000:000  20%           819:819:819:819:723
hdiskpower31      3999:000:000  20%           819:819:819:819:723
hdiskpower12      3999:000:000  20%           819:819:819:819:723
hdiskpower30      3999:000:000  20%           819:819:819:819:723
hdiskpower11      3999:000:000  20%           819:819:819:819:723
hdiskpower29      3999:000:000  20%           819:819:819:819:723
hdiskpower10      3999:000:000  20%           819:819:819:819:723
hdiskpower26      3999:000:000  20%           800:800:799:800:800
hdiskpower8       3999:000:000  20%           800:800:799:800:800
hdiskpower23      3999:000:000  20%           800:800:799:800:800
hdiskpower7       3999:000:000  20%           800:800:799:800:800
hdiskpower20      3999:000:000  20%           800:800:799:800:800
hdiskpower6       3999:000:000  20%           800:800:799:800:800
hdiskpower17      3999:000:000  20%           800:800:799:800:800
hdiskpower5       3999:000:000  20%           800:800:799:800:800

---------- Post updated at 04:09 PM ---------- Previous update was at 03:32 PM ----------

Dear Robin,

Thanks for your support.

It's using SAN disks; each LUN is around 512 GB, spread over 8 separate RAID groups (RAID 5) on an EMC VNX array. The engineer who set it up has left, so we don't have much information at the OS level.

Thanks.

---------- Post updated at 04:11 PM ---------- Previous update was at 04:09 PM ----------

Dear VBE,

Thanks for your advice.

I found some information on the internet saying we need to use reorgvg, but I have never used it before, so I don't know the risk and impact.

We just want to know whether reorgvg is the only way in our situation. Thanks.

---------- Post updated at 04:15 PM ---------- Previous update was at 04:11 PM ----------

Dear agent.kgb,

We tried changing the upper bound to 18, but it failed. The error message says it needs to be a multiple of the stripe width.

# chlv -u 18 oravol
0516-1441 chlv: Striped logical volume upperbound can only be an even multiple of the striping width.
0516-704 chlv: Unable to change logical volume oravol.

---------- Post updated at 06:16 PM ---------- Previous update was at 04:15 PM ----------

Dear Bakunin,

Thanks for your support.

Yes, we are using an EMC VNX as the storage box.
So, as you say, there is no improvement from striping when using a SAN?

My understanding is that striping is used to pool I/O (perhaps when not using a SAN), letting more disks share the I/O and rebalancing the load. But once a SAN is in use, the storage pool already balances the I/O across all the disks. Am I right?

Thanks.

Basically, and in most cases: yes. There are some notable exceptions to this rule (see Scrutinizer's post #10 for such exceptions), but in general, what you try to achieve with disk striping a modern SAN box already does itself internally. There is no sense in doing it twice. If you are particularly unlucky (well, I agree, this is more a theoretical possibility), your own striping and the striping of the SAN box will overlay and create a Moiré-like effect that de-stripes your disk access.

I once wrote a lengthy article about performance tuning, which I suggest you read. Maybe it answers a few questions you might have.

The VNX is a small platform and I haven't worked with it, but I suppose its frontend is not all that sophisticated. Therefore it might be worthwhile to examine other aspects of disk access as well if the need for performance tuning arises: queue sizes, the distribution of block sizes in your typical load, data hotspots (maybe suggesting multi-tiered disk architectures with SATA disks on one end and FC disks or even SSDs on the other), or some other measures.
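For the queue sizes, for example, something along these lines shows and adjusts the per-LUN queue depth (attribute names can differ per multipathing driver, so check what lsattr reports on your devices first):

# lsattr -El hdiskpower5 -a queue_depth        # show the current queue depth of one LUN
# chdev -l hdiskpower5 -a queue_depth=32 -P    # change it; -P defers the change until the next reboot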

As Scrutinizer said so rightly: in performance tuning it always depends and one size never fits all.

I hope this helps.

bakunin