HACMP - RG move issue.

gowthamakanthan · October 29, 2011, 4:44am

Hello Everyone,

I hope you are all doing well. I have configured PowerHA with two resource groups (RGs). Whenever I move an RG to the second node, that node becomes unavailable. It still works fine from the console, and the cluster reports no issues. Once I run the TCP/IP configuration via smitty, the node becomes available again.

As per the client's request, I have configured only two IPs for the cluster: one public IP and one service (HACMP) IP. Please assist.

Thanks,
Gowtham.G

phobus · October 29, 2011, 6:22am

Hi,

Could you run the following commands and post the output here, please?

lssrc -ls clstrmgrES

cldump

clshowsrv -v

cllscf

cllsnw

cllsif

cat /path_to_your_/hacmp.out | egrep -p "FAIL|ERROR" | more

Please also run cluster verification from the primary node via

smit hacmp -> Problem Determination Tools -> HACMP Verification -> Verify HACMP Configuration

and then

cat /path_to_your_/clverify.log | egrep -p "FAIL|ERROR" | more

Thank you.

gowthamakanthan · October 29, 2011, 6:39am

Hello Phobus,

Thanks for the reply. Please see the attachment for the output of the commands you mentioned.

Gowtham.G

phobus · October 29, 2011, 7:34am

Thank you for that file.

Could you start clinfoES on both nodes with

startsrc -s clinfoES

and check whether snmpd is running (again on both nodes)?

lssrc -s snmpd

If it is not, start it, wait a while, and you should then be able to use

cldump

and

clstat

Could you post the output of cldump, please?
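The subsystem check described above can be sketched as a small filter. This is only an illustration: `is_active` is a hypothetical helper name, and it assumes the usual `lssrc -s` layout of one header line followed by a line whose last column is the status.

```sh
#!/bin/sh
# is_active: read "lssrc -s NAME" output on stdin and succeed only if
# a non-header line ends with the status "active".
# Assumption: the status is the last column, as in typical lssrc output.
is_active() {
    awk 'NR > 1 && $NF == "active" { found = 1 } END { exit !found }'
}

# Possible use on each node (needs root; not executed here):
#   for s in snmpd clinfoES; do
#       lssrc -s "$s" | is_active || startsrc -s "$s"
#   done
```

Reading the status from the last column keeps the check working when the PID column is empty, which is the case for an inoperative subsystem.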

About the log files: that's strange, because the RG move failed and there is no entry about it. Are you checking the right file? You can have two hacmp.out files in different locations, with only one actually in use. Please check with

odmget HACMPlogs | grep -p hacmp.out

and check which one has the latest timestamp. There should be an error entry. You need to check this file on the node from which you ran the RG move.
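One way to see which copy of hacmp.out is actually being written is to sort the candidates by modification time. The sketch below is an assumption-laden example: `newest_file` is a hypothetical helper, and the two paths in the usage comment are only commonly seen locations, not values confirmed by this thread; use the directory reported by odmget on your system.

```sh
#!/bin/sh
# newest_file: print the most recently modified of the given files.
# Files that do not exist are skipped silently.
newest_file() {
    [ $# -gt 0 ] || return 1
    # ls -t sorts by modification time, newest first.
    ls -t "$@" 2>/dev/null | head -n 1
}

# Possible use (paths are assumptions; take the real ones from odmget):
#   newest_file /tmp/hacmp.out /var/hacmp/log/hacmp.out
```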

Verification is fine, which is good, but could you check whether this file

/usr/es/sbin/cluster/etc/rhosts

is the same on both nodes? More info about that file here and here
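A quick way to compare a file across nodes without copying it is to compare checksums. This is a minimal sketch, assuming `cksum` is available on both nodes; `file_sig` is a hypothetical helper name.

```sh
#!/bin/sh
# file_sig: print "checksum size" for a file. Reading from stdin keeps
# the file name out of the output, so results from two nodes can be
# compared directly.
file_sig() {
    cksum < "$1"
}

# Possible use: run on each node and compare the two output lines.
#   file_sig /usr/es/sbin/cluster/etc/rhosts
```

If the two lines differ, the files differ; identical lines make identical content very likely, though a checksum is not a byte-for-byte proof.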

---------- Post updated at 12:34 PM ---------- Previous update was at 12:22 PM ----------

I read your first post again. What do you mean by "the 2 node becoming unavailable"?

Does the RG move to the other node complete successfully?

gowthamakanthan · October 29, 2011, 7:58am

Hello Phobus,

I have attached the requested details. I have found that whenever I move the RG to the other node, the default gateway changes on the node that acquires the RG (the output is in the file). Also, clstat and cldump do not seem to be configured properly. Kindly check the attached files and advise. Thanks very much.

The RG move is successful and the RG runs fine, but the node that has taken over the RG becomes unreachable from the other subnet. I believe this is because of the gateway issue: whenever the node takes over the RG, it loses its default gateway information.
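The loss of the default gateway can be confirmed by capturing the default route before and after the move. This is a rough sketch: `default_routes` is a hypothetical helper, and it assumes AIX-style `netstat -rn` output in which the default route line starts with the word `default`.

```sh
#!/bin/sh
# default_routes: read "netstat -rn" output on stdin and print the
# destination and gateway of every default route.
# Assumption: the first column is literally "default", as on AIX.
default_routes() {
    awk '$1 == "default" { print $1, $2 }'
}

# Possible use on the takeover node (the /tmp file names are arbitrary):
#   netstat -rn | default_routes > /tmp/gw.before
#   ... move the resource group ...
#   netstat -rn | default_routes > /tmp/gw.after
#   diff /tmp/gw.before /tmp/gw.after && echo "default gateway unchanged"
```

An empty gw.after file would mean the default route is gone entirely, which matches the symptom of the node being unreachable from other subnets only.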

phobus · October 29, 2011, 8:46am

Hi gowthamakanthan,

Are the errors in the clinfo log file new, or are they old entries?

I assume they are old entries, since you have no issue moving the RG, but
could you double-check that /etc/hosts is identical on both nodes?

I don't know why the default gateway is removed from the routing table after the RG is moved. It's beyond my knowledge, unfortunately.

gowthamakanthan · October 29, 2011, 9:33am

Apologies, I attached the wrong hacmp.out logs. Please check the attachment for the correct one.

As for the hosts file, both copies are identical. The issue still persists.

gowthamakanthan · October 30, 2011, 2:03am

The issue has been fixed. The gateway problem was caused by the App and DB start-up scripts: those scripts were deleting the default gateway while starting up.

Special thanks to Phobus.
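For anyone hitting the same symptom, application start scripts can be scanned for commands that touch the routing table. The sketch below is a heuristic, not the check used in this thread: `find_route_changes` is a hypothetical helper, the pattern only looks for `route ... delete`, `route ... flush` and `route -f`, and the script paths in the usage comment are placeholders.

```sh
#!/bin/sh
# find_route_changes: print file:line:text for every line in the given
# scripts that appears to delete or flush routes.
# /dev/null is added so grep always prefixes matches with the file name.
find_route_changes() {
    grep -nE '(^|[^[:alnum:]_])route[[:space:]]+(-[[:alnum:]]+[[:space:]]+)*(delete|flush|-f)' "$@" /dev/null
}

# Possible use (placeholder paths; substitute your own start scripts):
#   find_route_changes /path/to/app_start.sh /path/to/db_start.sh
```

A match is only a hint to read that part of the script; routes can also be changed in other ways that this pattern does not cover.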

phobus · October 30, 2011, 7:28am

I'm glad you got it fixed. By the way, what version of HACMP do you have? I'm just wondering because the hacmp.out log looks different from what I used to see on HACMP 5.x.

gowthamakanthan · October 30, 2011, 9:24am

We are using PowerHA 6.1.