HACMP in a 7 node configuration?

Hi Guys,

I have to design a multi-node HACMP cluster and am not sure if the design I have in mind makes any sense.
I have to make an environment that currently resides on 5 nodes more resilient, but I have the constraint of only having 4 frames. In addition, the business doesn't want to pay for more LPARs than they have to. So I would like to go for a 5 active + 2 spare cluster with the following setup:

A1 has a higher resource requirement, so I would give S1 the same amount of resources as A1; all the other nodes need about half as much, so S2 would be sized to match them.

Do I have to make all storage (about 3-5 TB per LPAR) visible to all nodes, or just to the nodes that are intended to take over the resources?

Does this sound sensible, or am I forgetting anything in this setup? Does anyone run a similar setup and can share their experiences?

I appreciate your comments.

Kind regards
zxmaus

Hm, not sure if HACMP will allow this kind of configuration. But to be honest, with clusters I would try to keep it as simple as possible, so maybe have 1 backup node for each of them. I know that this is no great help, but I would stay with 2 node clusters for each of the applications. No idea what kind of applications these are, whether HACMP + Oracle RAC is an option etc., or if you can consolidate any of the apps/prod nodes to free up more LPARs as backup nodes.

How many locations are involved (servers and SAN)?

Hi Shockneck,

everything is in the same datacentre and will be served by the same SAN infrastructure - EMC Symmetrix with PowerPath - RAID 5 on the SAN side, dedicated adapter pairs on the LPAR side - no VIO except for management purposes.

According to SAN Engineering it is no problem to make the disks visible on each node that requires them - even visibility on all 7 nodes is not a problem according to them.

This proposal is for one site only - we will have similar setups for UAT and COB in another datacentre in another country - with database replication between PROD and COB.

Kind regards
zxmaus

You'll have to make the storage visible to the prod nodes and their backup nodes of course, so they can take over in case of a failover.
So zoning and masking should ensure the following:

S1 can see disks of A1, A4, A5
S2 can see disks of A2, A3, A5.

But I don't know if a node can be in more than 1 cluster, which would be the case for S1 and S2. I guess Shockneck will tell :slight_smile:
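
If you do restrict the zoning that way, a quick sanity check is to compare the PVIDs on each standby node with those on the prod nodes it covers - something like this (device names are just examples, not your actual configuration):

```
# on S1: list PowerPath devices with their PVIDs
lspv | grep hdiskpower

# show which LUNs/paths sit behind each hdiskpower device
powermt display dev=all

# a shared LUN keeps the same PVID on every node, even if the
# hdisk/hdiskpower numbers differ - compare against A1, A4 and A5
```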

Hi Zaxxon,

all these nodes are planned to be in the same cluster - I will have one more cluster as a DR solution and one more cluster as UAT - all of them will have 7 LPARs each when I am done :slight_smile:

I have just spoken to the SAN guys - they don't see any problem making all 7 nodes see the same disks ... one problem less :slight_smile:

Kind regards
Nicki

Current HACMP versions support clusters of up to 32 nodes, hence a combination of five active and two passive nodes is supported. However, from my point of view setting up this cluster is not the problem but operating it. HACMP clusters work great in a well defined environment that is thoroughly tested before going live and not changed afterwards. Even in a stable environment, though, there is administration work to do. Naturally it takes more effort to keep all software and microcode on the same (latest) level on seven nodes. Different levels within a cluster are supported only for a short time during a node upgrade (i.e. one or two days). It will also be more complicated to develop a sensible test scenario because a lot of different problems can be combined. For the same reason it will be more complicated to bring the cluster back to normal operation during disaster recovery.
So for reasons of keeping daily operation and disaster recovery simple, a cluster with fewer nodes is preferable. For the same reason many clusters use a clear active-passive design, as Zaxxon hinted. This is just what one ought to be aware of; I don't mean to hold you back. Personally I'd happily run such a seven node cluster under one condition: only trained HACMP admins have the root password. If your DBAs are allowed root on the cluster nodes - forget it.

Now to the technical details. A cluster node cannot be a member of two different clusters. The Resource Group definitions are used to control which nodes are used by which application. So within the cluster your A1 Resource Group would use nodes A1 and S1, and A5's RG would use A5, S1 and S2 in this order. While you can define an RG's nodes in any order you like, keep in mind that in a split brain condition nodes are sorted by their names alphanumerically to decide which node to power off. If your nodes' names are like A1, S1 and so on, that should not turn out to be a problem.
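
To make that concrete, a sketch of what the node lists for the Resource Groups could look like with your naming (the RG names are made up; the definitions themselves would be done via smitty hacmp, the commands below only verify what has been configured):

```
# Resource Group -> participating nodes (priority order), as sketched above:
#   rg_a1 : A1 S1
#   rg_a2 : A2 S2
#   rg_a3 : A3 S2
#   rg_a4 : A4 S1
#   rg_a5 : A5 S1 S2
#
# after defining and synchronising the cluster, verify with:
/usr/es/sbin/cluster/utilities/cllsgrp    # lists the Resource Group names
/usr/es/sbin/cluster/utilities/clRGinfo   # shows where each RG is online
```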

I don't know by heart whether it is possible to use different zones to restrict access to shared VG disks for certain nodes. Cluster LVM operations usually require a PV to be visible and accessible on every cluster node. If this prerequisite applies only to the nodes that belong to the Resource Group you could use zones, but otherwise all of the cluster's shared disks need to be visible on all cluster nodes. Even if using zones was possible, it would lead to hdisk numbers being duplicated while the content behind them was different. IMHO that would be a source of confusion. To avoid that, I'd assign the disks one after the other to the seven nodes so that the same LUN gets the same hdisk number on every node. All hdiskpower devices must have their reserve_lock set to no. (In December 08 I discovered a nasty bug in EMC PowerPath that should have been fixed by March or April 09: the box said the reserve was off while in reality it was on. Your SAN colleagues are probably aware of that bug, but make sure you use the latest PowerPath fix level anyway!) You use Enhanced Concurrent Mode VGs and the cluster is in control of who accesses which disk.
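
As a rough sketch of that disk preparation (device and VG names are examples only):

```
# check the SCSI reserve setting on a PowerPath device - should be "no"
lsattr -El hdiskpower0 -a reserve_lock

# switch the reserve off; if the device is in use, add -P to change the
# ODM only and let the setting take effect at the next reboot
chdev -l hdiskpower0 -a reserve_lock=no

# create an Enhanced Concurrent Mode (concurrent capable) VG - this needs
# bos.clvm.enh installed on all nodes
mkvg -C -y app1vg -s 64 hdiskpower0

# check that the same LUN shows up with the same PVID on every node
lspv
```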

You are going to have non-TCP/IP networks. These are point to point connections and you need to think about whether you want a heartbeat ring (over all frames) or one or more stars (from frame to frame). Probably you are going to use heartbeat over disk and thus use disk heartbeat devices. If you do, you can use the data ECM VG for the heartbeat, but you might think about using dedicated (very small, e.g. 1 PP) disk devices for that. While those disks (LUNs) sit in the same EMC box, it makes it easier to see what is going on in the cluster, plus the heartbeat keeps working even while the Resource Group is offline. If you intend to use RS232 heartbeat you will very likely end up with a heartbeat ring.
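
The disk heartbeat connection can be tested outside of HACMP with the RSCT tool dhb_read - a quick sketch, assuming hdiskpower10 is the dedicated heartbeat LUN (the name is an example):

```
# on the first node: receive heartbeat packets over the shared LUN
/usr/sbin/rsct/bin/dhb_read -p hdiskpower10 -r

# on the second node: transmit heartbeat packets to the same LUN
/usr/sbin/rsct/bin/dhb_read -p hdiskpower10 -t

# both sides should report that the link is operating normally if the
# disk heartbeat path between the two nodes works
```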

These comments are just for the points you mentioned so far. There is probably much more to think about, and problems might arise during implementation.

Hi Shockneck,

many many thanks for your reply. You have answered all my questions so far.

  • DBAs don't have root access to the cluster
  • only 2 SAs will have root access - me and my backup - and we will both build the cluster together
  • we have very strong change control in place and our documentation is pretty good
  • we are aware this is no 24/7/365 solution - that is why we have an independent DR site and a UAT environment for testing
  • the A1-A5 servers + 1 spare are running right now in a VCS cluster on AIX ... our engineering doesn't support it past 9/2010 - that is basically why we're going to move it to HACMP
  • we will have almost a year for build and testing
  • we have a weekend green zone of 36 hrs for the environment and always patch all nodes in any cluster at the same time
  • we will have heartbeat on disk on at least 2 different mini device groups as you suggested

Once more thank you for all your suggestions - I will keep you in the loop if problems arise, as I can see you don't have any major objections :slight_smile:

Thank you in addition for the EMC bug tip - this explains why we have the problems on the 2 node clusters ... our SAN engineering is probably aware but has not yet certified the fix - I will ask them on Monday about the fix.
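
Before I talk to them I will probably just check what the nodes are currently running - something like this (the exact fixed PowerPath level is what I need SAN engineering to confirm):

```
# show the installed PowerPath version on a node
powermt version

# what the ODM claims the reserve setting is - because of the bug this
# may not match the reservation actually held on the array
lsattr -El hdiskpower0 -a reserve_lock
```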

Kind regards
zxmaus