I'm setting up HACMP 6.1 on a two-node cluster. One node works fine and starts properly in the STABLE state (VGs varied on, filesystems mounted, service IP aliased). However, the other node is always stuck in the ST_JOINING state. It stays there indefinitely, and I can't stop the cluster or recover from script failure either. I can't see any errors in hacmp.out.
Here are the latest messages I see in clstrmgr.debug (this is from the console, so I typed it in by hand):
getPriorityOverride: Returning 0 for the nodehandle:2
getPriorityOverrideSecondary: Returning 0 for the nodehandle:2
rm_CreateAllPolMsg: node NODE02 has group RG1 in node 1
getPriorityOverride: Returning 0 for the nodehandle:4
getPriorityOverrideSecondary: Returning 0 for the nodehandle:4
Before Sending: Message Length is 4232 NumResStates:2NumPols:2 numSSitePols:0 join_data_valid:0
rm_ProcessnPhaseCb: Voting to CONTINUE my join w/msg.seq_no1 packet_count:
One more thing to add: I can start the cluster on either node as long as no other node has been started yet. That is, I can start the cluster and the RG on either node1 or node2, but if I start it on node1 first, node2 won't come up via clstart and shows ST_JOINING forever. Thus, I cannot fail over to the other node, since that node never reaches a stable state.
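
For anyone comparing the two nodes, here is a minimal sketch of how the cluster manager state could be read on each one. It assumes the cluster manager runs as the clstrmgrES subsystem and that `lssrc -ls clstrmgrES` prints a line beginning with `Current state:`. The helper name cluster_state is hypothetical and only for illustration.

```shell
# Hypothetical helper: read `lssrc -ls clstrmgrES` output on stdin and
# print the value of the first "Current state:" line
# (for example ST_STABLE or ST_JOINING).
cluster_state() {
    awk -F': *' '/^Current state:/ { print $2; exit }'
}

# Intended use on a cluster node (assumes the clstrmgrES subsystem exists):
#   lssrc -ls clstrmgrES | cluster_state
```

Running this on both nodes right after clstart should show whether the second node ever leaves ST_JOINING.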