Cluster 6.1 storage move

Hi AIX experts,

We have a 2-node cluster, version 6.1, running 7 Oracle DBs. Everything is on Symmetrix. We are going to migrate the storage from one array to another; SRDF will be set up between the two arrays and I will be getting all the LUNs from the new array. I am looking for the cluster steps needed to make this migration successful. Can someone please share the overall steps, so that we avoid any corruption and keep the cluster consistent and working properly?

Thanks a lot in advance...

You can do it with storage methods (SRDF), but I would do it with simple AIX means. This method requires no downtime at all; you can do everything during service time:

  • Create new LUNs from the new storage, equal in size to the set of disks you use now, and let "cfgmgr" run on both nodes.

  • Then add the new disks to their pairwise counterparts on the active node ("extendvg") and use "mirrorvg" to set up a new mirror.

  • When the disks are synced, do an "importvg -L" (learning import) on the inactive node from the old disks.

  • On the active node do an "unmirrorvg" and a "reducevg" to remove the old disks.

  • On the inactive node do an "importvg -L" again, this time from the new disks. The VG information is now consistent again.

  • Remove the old disks from the configuration. Once they are physically disconnected, run "cfgmgr" again on both nodes.

Here is an example run: vg1 has 1 disk "hdisk1" (100G) from the old storage.

Create a new LUN from the new storage, which is 100G too. Run "cfgmgr" on both nodes. Say, the new disk is hdisk11.
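Purely as a sanity check (optional, and "hdisk11" is just the name from this example - the device name can differ between the nodes), you can confirm that both nodes actually see the new disk before extending the VG:

root@activenode # lspv | grep hdisk11
root@passivenode # lspv | grep hdisk11

If the disk still shows no PVID at this point, that is normal - "extendvg" will assign one in the next step.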

Create the mirror. Notice that I postpone the syncing, because this way I can sync several VGs at once and parallelize the syncing (see the "-P" switch to "syncvg").

root@activenode # extendvg vg1 hdisk11
root@activenode # mirrorvg -s vg1 hdisk11
root@activenode # syncvg -P 32 -v vg1 &
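If you want to watch the progress of the background sync, the stale partition counters in "lsvg" are a convenient indicator (purely optional):

root@activenode # lsvg vg1 | grep -i stale

When the stale PP count is back to zero, the mirror is fully synced and you can continue.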

Make configuration known on second node:

root@passivenode # importvg -L vg1 hdisk1
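As a quick (optional) check, the passive node's ODM should now show both disks as members of vg1, even though the VG is not varied on there:

root@passivenode # lspv | grep vg1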

Remove original disk:

root@activenode # unmirrorvg vg1 hdisk1
root@activenode # reducevg vg1 hdisk1

Make final config known again on passive node:

root@passivenode # importvg -L vg1 hdisk11
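To round off the example: once the storage team has unmapped/unzoned the old LUN, remove it from the ODM on both nodes and let "cfgmgr" run once more (device names may of course differ per node):

root@activenode # rmdev -dl hdisk1
root@passivenode # rmdev -dl hdisk1
root@activenode # cfgmgr
root@passivenode # cfgmgr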

I hope this helps.

bakunin

Hi Bakunin, thanks for the detailed response.

The plan that our architect has put together, in summary, is:

  • shut down the source servers,
  • replicate the old LUNs to the new LUNs,
  • remove the old LUNs and allocate the new LUNs,
  • bring the servers up.

Here, I wanted to know how to handle the cluster activities, i.e. 1. what I will have to do before the shutdown, and 2. what to do after the servers come up with the new LUNs.

The problems with this plan are:

  1. The new LUNs will have different WWNs, and therefore the volume group information on the system will not be consistent. Basically you will have to build the cluster resources anew.

  2. You need a (complete) downtime to do it the way your storage guy planned. What I described needs no downtime at all. To reconfigure all the VGs and RGs, plan at least 1-2 hours (30 min to actually do it, 30-90 min contingency), and you will not know if everything went well until you do an "extended verification and synchronisation". (Technically speaking, you will be creating new VGs which just happen to have the same names as the old ones, and you will build new resource groups from them which happen to have the same names as the old ones.)
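If you do end up going the downtime route, verify the cluster afterwards: run the extended verification and synchronisation (on HACMP/PowerHA 6.1 this is driven through smitty), and then a quick sanity check of the resource groups and the cluster manager does not hurt. Exact paths and output depend on your installation; something along these lines:

root@activenode # /usr/es/sbin/cluster/utilities/clRGinfo
root@activenode # lssrc -ls clstrmgrES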

I hope this helps.

bakunin

Yes Bakunin, we do have a downtime scheduled.

I don't think they will agree to OS-level mirroring; there are too many TB of data, which is why they want to go with SRDF.

Find other architects! These ones I would personally pillory :smiley: After the procedure you will receive new LUNs with the same PVIDs on the AIX side. The best way in this case is to:

  • shut down the cluster,
  • remove the volume groups from the cluster resource groups,
  • export all volume groups to be synced,
  • remove all disks which are in these volume groups from the system,
  • de-zone the LUNs,
  • let them (these funny guys who think they are architects) synchronize the storage,
  • zone the new LUNs to the server,
  • configure them (cfgmgr and disk parameters if you need them),
  • import the volume groups,
  • rediscover everything on both sides of the cluster,
  • update the cluster resource group definitions,
  • start the cluster.
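In plain AIX commands, that sequence would look roughly like this per volume group (the "bothnodes" prompt just means: run the command on each node; vg1, the hdisk names and the major number are only placeholders for this sketch). Note the VG major number with "ls -l /dev/vg1" before exporting, because a shared VG should be imported with the same major number on both nodes. Before the storage synchronisation, with the cluster stopped and the VGs varied off:

root@bothnodes # exportvg vg1
root@bothnodes # rmdev -dl hdisk1

After the new LUNs have been zoned in (SRDF carries the PVIDs over, so importvg will find the VG on the new disks):

root@bothnodes # cfgmgr
root@bothnodes # lspv
root@activenode # importvg -V 60 -y vg1 hdisk11
root@activenode # chvg -a n vg1
root@activenode # varyoffvg vg1
root@passivenode # importvg -V 60 -y vg1 hdisk11
root@passivenode # chvg -a n vg1
root@passivenode # varyoffvg vg1

The resource group changes themselves (removing the VGs before, adding them back afterwards) are done through the usual HACMP menus, followed by a verification and synchronisation.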

I hope I didn't forget anything important, but from my point of view it is easier to kill the architects and save online time for your users. Implement bakunin's solution - it is the right one! The only right solution.


This is high praise from an expert; I will cherish that. The problem with the duplicate PVIDs didn't even cross my mind, but you are right: this is going to be a problem indeed. One more reason - in case avoiding the bloodshed on the architects isn't convincing enough to the thread o/p. :wink:

I hope this helps.

bakunin

Thanks a lot, all. Really appreciated. I will take this to my team.