Not sure if this is the correct forum to post this on, but maybe a mod could move it if not.
When trying to move an HACMP resource group between LPARs on AIX I receive the following:
State not STABLE/RP_RUNNING or ibcasts Join for node 2 rejected,
Clearing in join protocol flag
Attempting to recover resource group from error
"Resource group not found in client configuration"
+BrokerMB02rg:clvaryonvg prmb02vg[808] LC_ALL=C
0516-052 varyonvg: Volume group cannot be varied on without a quorum.
More physical volumes in the group must be active. Run diagnostics on inactive PVs.
When these errors occur, the resource group then starts back up on the original node.
I have checked the LVs; quorum is disabled and there are no stale PVs.
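For reference, those checks boil down to the following (prmb02vg taken from the log above):
lsvg prmb02vg          # the QUORUM field reads 1 (Disabled)
lsvg -p prmb02vg       # PV STATE should show every disk as active, none missing
lsvg -l prmb02vg       # LV STATE shows whether any LV is stale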
Any help or pointers in the right direction would be much appreciated.
I would try to export and reimport the volume group in question on the inactive node. Make sure you keep the correct VG major number (you can import using the -V flag).
If it finds all PVs you should just sync the cluster config and then try again. If it has problems during the import, you can take it from there. I had similar issues after migrating storage across disks - duplicate PVIDs on some disks - and the cluster did not like that much.
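A quick way to spot duplicate PVIDs, assuming standard lspv output (disk name, PVID, VG, state):
lspv | awk '{print $2}' | sort | uniq -d     # prints any PVID that appears on more than one disk (a repeated "none" just means unassigned disks)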
Hello
On the inactive node nothing should be mounted from that volume group, so all you need to do is:
lspv (copy the output to somewhere so you know which disks belong to the volume group)
exportvg yourvolumegroupname
importvg -V <VG major number> -Ry yourvolumegroupname hdiskX, where hdiskX is any disk belonging to this volume group (importvg takes the hdisk name; the lspv output tells you which PVID sits on which disk). You can look up the major number on the active node with ls -ali /dev | grep volumegroupname - the major and minor numbers are shown beside the name. The import should hopefully go through without any errors - if you get an error, post it here so we can follow up on that.
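Put together, with placeholder names (major number 45 and hdisk4 are just examples, prmb02vg is the VG from your log):
ls -ali /dev | grep prmb02vg                 # on the ACTIVE node: note the major number, say 45
lspv > /tmp/lspv.before                      # on the inactive node: keep a record of the disks
exportvg prmb02vg
importvg -V 45 -Ry prmb02vg hdisk4           # hdisk4 = any disk belonging to that VG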
Cluster synchronization is part of the HACMP menus in smitty - look under Extended Configuration ...
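The fast path is:
smitty hacmp                                 # then Extended Configuration -> Extended Verification and Synchronization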
Check the LUNs' reserve policy (lsattr -El hdiskX).
It should be set to no_reserve; if it isn't, set it:
chdev -l hdiskX -a reserve_policy=no_reserve
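If the VG sits on many disks, a small loop saves typing (the hdisk names are placeholders):
for d in hdisk4 hdisk5 hdisk6; do
    echo "$d: $(lsattr -El $d -a reserve_policy | awk '{print $2}')"
done
chdev -l hdisk4 -a reserve_policy=no_reserve   # per disk; if the disk is busy, chdev -P defers the change to the next reboot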
It's always a good idea to try out manually what the cluster is doing.
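For a plain, non-concurrent VG that test would look something like this (only with the resource group offline on both nodes; the mount point is a placeholder):
varyonvg prmb02vg
lsvg -l prmb02vg                             # do all LVs show up?
mount /yourmountpoint                        # test-mount one filesystem from the VG
umount /yourmountpoint
varyoffvg prmb02vg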
Normally there is no need to set the VG online, since it should be online in concurrent passive mode as soon as the cluster starts.
When the resource group moves, the VG is set to concurrent active on one node and concurrent passive on the other node.
Is the VG concurrent capable? (lsvg vgname)
VOLUME GROUP: xxxvg VG IDENTIFIER: xxxxxx
VG STATE: active PP SIZE: 128 megabyte(s)
VG PERMISSION: read/write TOTAL PPs: 67478 (8637184 megabytes)
MAX LVs: 256 FREE PPs: 1078 (137984 megabytes)
LVs: 31 USED PPs: 66400 (8499200 megabytes)
OPEN LVs: 31 QUORUM: 1 (Disabled)
TOTAL PVs: 34 VG DESCRIPTORS: 34
STALE PVs: 0 STALE PPs: 0
ACTIVE PVs: 34 AUTO ON: no
Concurrent: Enhanced-Capable Auto-Concurrent: Disabled
VG Mode: Concurrent
Node ID: 1 Active Nodes: 2
MAX PPs per VG: 131072 MAX PVs: 1024
LTG size (Dynamic): 256 kilobyte(s) AUTO SYNC: no
HOT SPARE: no BB POLICY: relocatable
It should look like the above; VG STATE may be either active or passive.
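If those Concurrent lines are missing from your lsvg output, the VG is not enhanced concurrent capable. Assuming bos.clvm.enh is installed, the conversion is a one-liner (run on one node with the VG varied on, then sync the cluster again):
chvg -C prmb02vg                             # make the VG enhanced concurrent capable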
lsvg xxx02vg
VOLUME GROUP: xxx02vg VG IDENTIFIER: 00c58aa200004c0000000124fdaf434a
VG STATE: active PP SIZE: 32 megabyte(s)
VG PERMISSION: read/write TOTAL PPs: 1278 (40896 megabytes)
MAX LVs: 256 FREE PPs: 444 (14208 megabytes)
LVs: 8 USED PPs: 834 (26688 megabytes)
OPEN LVs: 8 QUORUM: 1 (Disabled)
TOTAL PVs: 2 VG DESCRIPTORS: 3
STALE PVs: 0 STALE PPs: 0
ACTIVE PVs: 2 AUTO ON: no
MAX PPs per VG: 32512
MAX PPs per PV: 1016 MAX PVs: 32
LTG size (Dynamic): 256 kilobyte(s) AUTO SYNC: no
HOT SPARE: no BB POLICY: relocatable
Since you are changing your VG anyway, you might also want to consider big or scalable VG format. You are using rather small LUNs, and most applications/DBs grow a lot over time. Going scalable right at the beginning makes sure you don't get into trouble when you have to add further LUNs later (more than 32). Apart from that, it is the only way your OS can keep the ownerships of the special files after an export/import - which would be particularly important if you use, say, Sybase. And if you choose scalable, you will never have to worry about running out of inodes either.
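If you do go that route, the conversion sketch would be (as far as I remember the VG has to be varied off for this, so unmount its filesystems first):
varyoffvg prmb02vg
chvg -G prmb02vg                             # convert to scalable VG format
varyonvg prmb02vg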
The cluster has already been set up and had been working fine until I got these errors at the weekend while trying to move the resource group online to another node.