Modifying properties to use SCSI-3 reservations for disks with two paths in Sun Cluster

Hi

I have a two-node cluster. I was trying to change all disks that have two paths to use SCSI-3 reservations. Please let me know where I went wrong, or whether there is any issue with implementing this.

On node1

bash-3.2# cldev status
=== Cluster DID Devices ===

Device Instance               Node              Status
---------------               ----              ------
/dev/did/rdsk/d1              node2             Ok
/dev/did/rdsk/d10             node1             Ok
                              node2             Ok
/dev/did/rdsk/d11             node2             Ok
/dev/did/rdsk/d12             node1             Ok
/dev/did/rdsk/d14             node1             Ok
/dev/did/rdsk/d3              node2             Ok
/dev/did/rdsk/d4              node1             Ok
                              node2             Ok
/dev/did/rdsk/d5              node1             Ok
                              node2             Ok
/dev/did/rdsk/d6              node1             Ok
                              node2             Ok
/dev/did/rdsk/d7              node1             Ok
                              node2             Ok
/dev/did/rdsk/d8              node2             Ok
/dev/did/rdsk/d9              node1             Ok
                              node2             Ok
bash-3.2#
bash-3.2# clq status
=== Cluster Quorum ===

--- Quorum Votes Summary from latest node reconfiguration ---

            Needed   Present   Possible
            ------   -------   --------
            2        3         3

--- Quorum Votes by Node (current status) ---

Node Name       Present       Possible       Status
---------       -------       --------       ------
node2           1             1              Online
node1           1             1              Online

--- Quorum Votes by Device (current status) ---

Device Name       Present      Possible      Status
-----------       -------      --------      ------
d6                1            1             Online
 bash-3.2#
bash-3.2# cldev list -v
DID Device          Full Device Path
----------          ----------------
d1                  node2:/dev/rdsk/c0d0
d2                  node2:/dev/rdsk/c1t0d0
d3                  node2:/dev/rdsk/c0d1
d4                  node2:/dev/rdsk/c2t600144F055753E0900080027FCD22200d0
d4                  node1:/dev/rdsk/c2t600144F055753E0900080027FCD22200d0
d5                  node2:/dev/rdsk/c2t600144F055753E0100080027FCD22200d0
d5                  node1:/dev/rdsk/c2t600144F055753E0100080027FCD22200d0
d6                  node2:/dev/rdsk/c2t600144F055753DD800080027FCD22200d0
d6                  node1:/dev/rdsk/c2t600144F055753DD800080027FCD22200d0
d7                  node2:/dev/rdsk/c2t600144F055753DD100080027FCD22200d0
d7                  node1:/dev/rdsk/c2t600144F055753DD100080027FCD22200d0
d8                  node2:/dev/rdsk/c3t0d0
d9                  node2:/dev/rdsk/c2t600144F055753DE500080027FCD22200d0
d9                  node1:/dev/rdsk/c2t600144F055753DE500080027FCD22200d0
d10                 node2:/dev/rdsk/c2t600144F055753DED00080027FCD22200d0
d10                 node1:/dev/rdsk/c2t600144F055753DED00080027FCD22200d0
d11                 node2:/dev/rdsk/c3t1d0
d12                 node1:/dev/rdsk/c0d0
d13                 node1:/dev/rdsk/c1t0d0
d14                 node1:/dev/rdsk/c0d1
bash-3.2# metaset
 Set name = test-set, Set number = 1
 Host                Owner
  node1              Yes
  node2
 Driv Dbase
 d7   Yes
 d9   Yes
bash-3.2#
bash-3.2#
bash-3.2# clq add d4
bash-3.2#
bash-3.2# clq status
=== Cluster Quorum ===

--- Quorum Votes Summary from latest node reconfiguration ---

            Needed   Present   Possible
            ------   -------   --------
            3        4         4

--- Quorum Votes by Node (current status) ---

Node Name       Present       Possible       Status
---------       -------       --------       ------
node2           1             1              Online
node1           1             1              Online

--- Quorum Votes by Device (current status) ---

Device Name       Present      Possible      Status
-----------       -------      --------      ------
d6                1            1             Online
d4                1            1             Online
 bash-3.2#
bash-3.2#
bash-3.2# clq remove d4
bash-3.2#
bash-3.2# clq status
=== Cluster Quorum ===

--- Quorum Votes Summary from latest node reconfiguration ---

            Needed   Present   Possible
            ------   -------   --------
            2        3         3

--- Quorum Votes by Node (current status) ---

Node Name       Present       Possible       Status
---------       -------       --------       ------
node2           1             1              Online
node1           1             1              Online

--- Quorum Votes by Device (current status) ---

Device Name       Present      Possible      Status
-----------       -------      --------      ------
d6                1            1             Online
bash-3.2#
bash-3.2# cluster show|grep global_fencing
  global_fencing:                                  pathcount
bash-3.2#
bash-3.2# cldev show |grep default_fencing
  default_fencing:                                 global
  default_fencing:                                 global
  default_fencing:                                 global
  default_fencing:                                 global
  default_fencing:                                 global
  default_fencing:                                 global
  default_fencing:                                 global
  default_fencing:                                 global
  default_fencing:                                 global
  default_fencing:                                 global
  default_fencing:                                 global
  default_fencing:                                 global
  default_fencing:                                 global
  default_fencing:                                 global
bash-3.2#
bash-3.2# cldev show d6 |grep default_fencing
  default_fencing:                                 global
bash-3.2#
bash-3.2#
bash-3.2# cluster set -p global_fencing=prefer3
Warning: Device instance d6 is a quorum device - fencing protocol remains PATHCOUNT for the device.
Warning: Device instance d8 supports SCSI-2 only and set to use pathcount fencing protocol.
Warning: Device instance d11 supports SCSI-2 only and set to use pathcount fencing protocol.
Updating shared devices on node 1
Updating shared devices on node 2
bash-3.2#
bash-3.2#
bash-3.2# cluster show|grep global_fencing
  global_fencing:                                  prefer3
bash-3.2#
bash-3.2# cldev show |grep default_fencing
  default_fencing:                                 global
  default_fencing:                                 global
  default_fencing:                                 global
  default_fencing:                                 global
  default_fencing:                                 global
  default_fencing:                                 pathcount
  default_fencing:                                 global
  default_fencing:                                 pathcount
  default_fencing:                                 global
  default_fencing:                                 global
  default_fencing:                                 pathcount
  default_fencing:                                 global
  default_fencing:                                 global
  default_fencing:                                 global
bash-3.2#
bash-3.2#
bash-3.2# cldev show d6 |grep default_fencing
  default_fencing:                                 pathcount
bash-3.2# cldev show d8 |grep default_fencing
  default_fencing:                                 pathcount
bash-3.2# cldev show d11 |grep default_fencing
  default_fencing:                                 pathcount
bash-3.2#
bash-3.2# clq show d6
=== Quorum Devices ===

Quorum Device Name:                             d6
  Enabled:                                         yes
  Votes:                                           1
  Global Name:                                     /dev/did/rdsk/d6s2
  Type:                                            shared_disk
  Access Mode:                                     scsi2
  Hosts (enabled):                                 node2, node1
bash-3.2#
bash-3.2#
bash-3.2# clq add d4
 ^C
^C
^C

The above command hung, and no # prompt came back on the terminal.
My intention was to follow a process where I first add d4 as a second quorum device, then remove d6 from quorum control, then change the default_fencing property of d6, then add d6 back as a quorum device, and finally remove d4 from quorum. I am not concerned about d8 and d11, as those are local disks. Had this completed, SCSI-3 reservations would have been implemented properly.
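In outline, the intended sequence was (a plan rather than a verified working sequence; on my setup it hung at the first step):

clq add d4                                # temporary second quorum device (SCSI-3 under prefer3) <- this hung
clq remove d6                             # take d6 out of quorum control
cldev set -p default_fencing=global d6    # d6 then inherits global_fencing=prefer3, i.e. SCSI-3
clq add d6                                # re-register d6 as a quorum device with scsi3 access mode
clq remove d4                             # drop the temporary quorum device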

I had to shut down both nodes abruptly, as even the cluster shutdown command was not working.

While rebooting both nodes (nothing was working after running the last command), I got the following message on both nodes:

NOTICE: CMM: Cluster Doesn't have operational quorum yet; waiting for quorum

Then I used the following procedure on both nodes to recover my cluster: I deleted the quorum device d4 entries from the infrastructure file and rebooted the nodes. After the nodes rebooted into the cluster, I planned to follow a second procedure: put the cluster into install mode, remove d6 as a quorum device, change its default_fencing property to global, and then add d6 back as a quorum device.
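In outline, the recovery plus the planned second procedure look like this (a condensed sketch of what the transcripts below actually show; taking the cluster back out of install mode afterwards is not shown here):

# on each node: remove the half-configured d4 quorum device from the CCR
cp /etc/cluster/ccr/global/infrastructure /etc/cluster/ccr/global/infrastructure-04092015
vi /etc/cluster/ccr/global/infrastructure        # delete the cluster.quorum_devices.2.* (d4) lines
/usr/cluster/lib/sc/ccradm recover -o /etc/cluster/ccr/global/infrastructure   # regenerate the CCR checksum
reboot -- -r

# then, back in the cluster
cluster set -p installmode=enabled               # permits removing the only quorum device
clq remove d6
cldev set -p default_fencing=global d6
clq add d6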

On node1

login as: root
Using keyboard-interactive authentication.
Password:
Last login: Fri Sep  4 15:06:58 2015 from 192.168.11.85
Oracle Corporation      SunOS 5.10      Generic Patch   January 2005
# bash
 bash-3.2#
bash-3.2# cp /etc/cluster/ccr/global/infrastructure /etc/cluster/ccr/global/infrastructure-04092015
bash-3.2#
bash-3.2# ls -l /etc/cluster/ccr/global/infrastructure*
-rw-------   1 root     root        5527 Sep  4 14:52 /etc/cluster/ccr/global/infrastructure
-rw-------   1 root     root        5527 Sep  4 23:53 /etc/cluster/ccr/global/infrastructure-04092015
-rw-------   1 root     root        5526 Sep  4 14:52 /etc/cluster/ccr/global/infrastructure.bak
-rw-r--r--   1 root     root           0 Sep  4 23:38 /etc/cluster/ccr/global/infrastructure.new
bash-3.2#
bash-3.2# grep access_mode /etc/cluster/ccr/global/infrastructure
cluster.quorum_devices.1.properties.access_mode scsi2
cluster.quorum_devices.2.properties.access_mode scsi3
bash-3.2#
bash-3.2# hostname
node1
bash-3.2# cat /etc/cluster/nodeid
2
bash-3.2#
bash-3.2# cat /etc/cluster/ccr/global/infrastructure |grep quorum_devices
cluster.quorum_devices.1.name   d6
cluster.quorum_devices.1.state  enabled
cluster.quorum_devices.1.properties.votecount   1
cluster.quorum_devices.1.properties.gdevname    /dev/did/rdsk/d6s2
cluster.quorum_devices.1.properties.path_1      enabled
cluster.quorum_devices.1.properties.path_2      enabled
cluster.quorum_devices.1.properties.access_mode scsi2
cluster.quorum_devices.1.properties.type        shared_disk
cluster.quorum_devices.2.name   d4
cluster.quorum_devices.2.state  enabled
cluster.quorum_devices.2.properties.votecount   1
cluster.quorum_devices.2.properties.gdevname    /dev/did/rdsk/d4s2
cluster.quorum_devices.2.properties.path_1      enabled
cluster.quorum_devices.2.properties.path_2      enabled
cluster.quorum_devices.2.properties.access_mode scsi3
cluster.quorum_devices.2.properties.type        shared_disk
bash-3.2#
bash-3.2#
bash-3.2#
bash-3.2# vi /etc/cluster/ccr/global/infrastructure
"/etc/cluster/ccr/global/infrastructure" 111 lines, 5527 characters
ccr_gennum      10
ccr_checksum    07D66A5E8801E180DA751C80AFA57DA6
cluster.name    SunCluster
cluster.state   enabled
cluster.properties.cluster_id   0x55794F42
cluster.properties.installmode  disabled
cluster.properties.private_net_number   10.240.0.0
cluster.properties.cluster_netmask      255.255.255.0
cluster.properties.private_netmask      255.255.255.128
cluster.properties.private_subnet_netmask       255.255.255.248
cluster.properties.private_user_net_number      10.240.0.64
cluster.properties.private_user_netmask 255.255.255.224
cluster.properties.private_maxnodes     6
cluster.properties.private_maxprivnets  10
cluster.properties.zoneclusters 12
cluster.properties.auth_joinlist_type   sys
cluster.properties.auth_joinlist_hostslist      .
cluster.properties.transport_heartbeat_timeout  10000
cluster.properties.transport_heartbeat_quantum  1000
cluster.properties.udp_session_timeout  480
cluster.properties.cmm_version  1
cluster.nodes.1.name    node2
cluster.nodes.1.state   enabled
cluster.nodes.1.properties.private_hostname     clusternode1-priv
cluster.nodes.1.properties.quorum_vote  1
cluster.nodes.1.properties.quorum_resv_key      0x55794F4200000001
cluster.nodes.1.adapters.1.name e1000g2
cluster.nodes.1.adapters.1.state        enabled
cluster.nodes.1.adapters.1.properties.device_name       e1000g
cluster.nodes.1.adapters.1.properties.device_instance   2
cluster.nodes.1.adapters.1.properties.transport_type    dlpi
cluster.nodes.1.adapters.1.properties.lazy_free 1
cluster.nodes.1.adapters.1.properties.dlpi_heartbeat_timeout    10000
cluster.nodes.1.adapters.1.properties.dlpi_heartbeat_quantum    1000
cluster.nodes.1.adapters.1.properties.nw_bandwidth      80
cluster.nodes.1.adapters.1.properties.bandwidth 70
cluster.nodes.1.adapters.1.properties.ip_address        10.240.0.17
cluster.nodes.1.adapters.1.properties.netmask   255.255.255.248
cluster.nodes.1.adapters.1.ports.1.name 0
cluster.nodes.1.adapters.1.ports.1.state        enabled
cluster.nodes.1.adapters.2.name e1000g3
cluster.nodes.1.adapters.2.state        enabled
cluster.nodes.1.adapters.2.properties.device_name       e1000g
/quorum_devices
cluster.nodes.2.adapters.2.properties.device_name       e1000g
cluster.nodes.2.adapters.2.properties.device_instance   3
cluster.nodes.2.adapters.2.properties.transport_type    dlpi
cluster.nodes.2.adapters.2.properties.lazy_free 1
cluster.nodes.2.adapters.2.properties.dlpi_heartbeat_timeout    10000
cluster.nodes.2.adapters.2.properties.dlpi_heartbeat_quantum    1000
cluster.nodes.2.adapters.2.properties.nw_bandwidth      80
cluster.nodes.2.adapters.2.properties.bandwidth 70
cluster.nodes.2.adapters.2.properties.ip_address        10.240.0.10
cluster.nodes.2.adapters.2.properties.netmask   255.255.255.248
cluster.nodes.2.adapters.2.state        enabled
cluster.nodes.2.adapters.2.ports.1.name 0
cluster.nodes.2.adapters.2.ports.1.state        enabled
cluster.nodes.2.cmm_version     1
cluster.cables.1.properties.end1        cluster.nodes.2.adapters.1.ports.1
cluster.cables.1.properties.end2        cluster.nodes.1.adapters.1.ports.1
cluster.cables.1.state  enabled
cluster.cables.2.properties.end1        cluster.nodes.2.adapters.2.ports.1
cluster.cables.2.properties.end2        cluster.nodes.1.adapters.2.ports.1
cluster.cables.2.state  enabled
cluster.quorum_devices.1.name   d6
cluster.quorum_devices.1.state  enabled
cluster.quorum_devices.1.properties.votecount   1
cluster.quorum_devices.1.properties.gdevname    /dev/did/rdsk/d6s2
cluster.quorum_devices.1.properties.path_1      enabled
cluster.quorum_devices.1.properties.path_2      enabled
cluster.quorum_devices.1.properties.access_mode scsi2
cluster.quorum_devices.1.properties.type        shared_disk
~
"/etc/cluster/ccr/global/infrastructure" 103 lines, 5134 characters
bash-3.2#
bash-3.2#
bash-3.2# cat /etc/cluster/ccr/global/infrastructure |grep quorum_devices
cluster.quorum_devices.1.name   d6
cluster.quorum_devices.1.state  enabled
cluster.quorum_devices.1.properties.votecount   1
cluster.quorum_devices.1.properties.gdevname    /dev/did/rdsk/d6s2
cluster.quorum_devices.1.properties.path_1      enabled
cluster.quorum_devices.1.properties.path_2      enabled
cluster.quorum_devices.1.properties.access_mode scsi2
cluster.quorum_devices.1.properties.type        shared_disk
bash-3.2#
bash-3.2# /usr/cluster/lib/sc/ccradm recover -o infrastructure
Invalid ccr table file infrastructure (No such file or directory)
bash-3.2#
bash-3.2# 
bash-3.2# /usr/cluster/lib/sc/ccradm recover -o /etc/cluster/ccr/global/infrastructure
bash-3.2#
bash-3.2# reboot -- -r
login as: root
Using keyboard-interactive authentication.
Password:
Last login: Fri Sep  4 23:49:55 2015 from 192.168.11.85
Oracle Corporation      SunOS 5.10      Generic Patch   January 2005
# bash
bash-3.2#
bash-3.2#
bash-3.2# Sep  5 00:08:24 node1 sendmail[826]: [ID 702911 mail.alert] unable to qualify my own domain name (node1) -- using short name
Sep  5 00:08:24 node1 sendmail[819]: [ID 702911 mail.alert] unable to qualify my own domain name (node1) -- using short name
bash-3.2# cldev status

=== Cluster DID Devices ===

Device Instance               Node              Status
---------------               ----              ------
/dev/did/rdsk/d1              node2             Ok
/dev/did/rdsk/d10             node1             Ok
                              node2             Ok
/dev/did/rdsk/d11             node2             Ok
/dev/did/rdsk/d12             node1             Ok
/dev/did/rdsk/d14             node1             Ok
/dev/did/rdsk/d3              node2             Ok
/dev/did/rdsk/d4              node1             Ok
                              node2             Ok
/dev/did/rdsk/d5              node1             Ok
                              node2             Ok
/dev/did/rdsk/d6              node1             Ok
                              node2             Ok
/dev/did/rdsk/d7              node1             Ok
                              node2             Ok
/dev/did/rdsk/d8              node2             Ok
/dev/did/rdsk/d9              node1             Ok
                              node2             Ok
 bash-3.2#
bash-3.2#
bash-3.2# clq status
=== Cluster Quorum ===

--- Quorum Votes Summary from latest node reconfiguration ---

            Needed   Present   Possible
            ------   -------   --------
            2        2         3

--- Quorum Votes by Node (current status) ---

Node Name       Present       Possible       Status
---------       -------       --------       ------
node2           1             1              Online
node1           1             1              Online

--- Quorum Votes by Device (current status) ---

Device Name       Present      Possible      Status
-----------       -------      --------      ------
d6                1            1             Online
bash-3.2#
bash-3.2# cluster show|grep global_fencing
  global_fencing:                                  prefer3
bash-3.2#
bash-3.2#
bash-3.2# cldev show d6 |grep default_fencing
  default_fencing:                                 pathcount
bash-3.2#
bash-3.2# cldev show d4 |grep default_fencing
  default_fencing:                                 global
bash-3.2#
bash-3.2# clq show d4
clq:  (C952776) Device "d4" does not exist.
bash-3.2# clq show d6
=== Quorum Devices ===

Quorum Device Name:                             d6
  Enabled:                                         yes
  Votes:                                           1
  Global Name:                                     /dev/did/rdsk/d6s2
  Type:                                            shared_disk
  Access Mode:                                     scsi2
  Hosts (enabled):                                 node2, node1
bash-3.2#
bash-3.2#
bash-3.2# clq remove d6
clq:  (C115344) Cluster quorum could be compromised if you remove "d6".
bash-3.2#
bash-3.2# cluster set -p installmode=enabled
bash-3.2#
bash-3.2# clq remove d6
bash-3.2#
bash-3.2# clq status
=== Cluster Quorum ===

--- Quorum Votes Summary from latest node reconfiguration ---

            Needed   Present   Possible
            ------   -------   --------
            2        2         2

--- Quorum Votes by Node (current status) ---

Node Name       Present       Possible       Status
---------       -------       --------       ------
node2           1             1              Online
node1           1             1              Online
 bash-3.2#
bash-3.2# cldev show d6 |grep default_fencing
  default_fencing:                                 pathcount
bash-3.2#
bash-3.2# cldev set -p default_fencing=global d6
Updating shared devices on node 1
Updating shared devices on node 2
bash-3.2#
bash-3.2# cldev show d6 |grep default_fencing
  default_fencing:                                 global
bash-3.2#
bash-3.2#
bash-3.2# clq add d6

Again, after running the last command it hung, and I didn't get a # prompt back.

I had to shut down both nodes abruptly again.

Kindly let me know how I can make d6 (the quorum device) use SCSI-3 reservations.

I've read your post a few times but find it difficult to determine exactly what's happening. However, without a doubt, although you might find it unavoidable to "abruptly shut down" the cluster, for SCSI-3 LUNs this is bad news.

With SCSI-2 reserve/release, the reservation(s) are cleared by a SCSI reset (i.e., a reboot), but with SCSI-3 the reservations are persistent and survive a reboot. Such reservations therefore need to be cleared somehow.
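For what it's worth, Sun Cluster ships low-level helpers under /usr/cluster/lib/sc that can inspect and scrub SCSI-3 PGR keys. From memory, so treat this as a sketch and verify the syntax on your release before running anything destructive (and never scrub an in-use quorum device):

/usr/cluster/lib/sc/scsi -c inkeys -d /dev/did/rdsk/d4s2    # list the PGR keys registered on the LUN
/usr/cluster/lib/sc/scsi -c inresv -d /dev/did/rdsk/d4s2    # show the current PGR reservation, if any
/usr/cluster/lib/sc/scsi -c scrub -d /dev/did/rdsk/d4s2     # clear all keys and reservations from the LUN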

Read this:
Sun Cluster: Cluster Reservations and Quorum

I cannot be sure that this is your issue, but "abruptly" shutting down a cluster with SCSI-3 LUNs should be avoided if at all possible.

Thanks for the input, hicksd8. I am doing this on my practice cluster; the LUNs are iSCSI based.

I am just trying to implement SCSI-3 reservations on two-path devices in the cluster.

In the first log, I just wanted to change the global_fencing property to prefer3 (meaning SCSI-3), which changed the reservations of all shared disks from SCSI-2 to SCSI-3 except for the quorum device d6, which I have to change manually by first adding d4 as a quorum device (it was while adding this SCSI-3 device d4 as a quorum device that my command hung). If my command had not hung and had gone through smoothly, I would then have removed d6 from quorum control, changed the default_fencing parameter of d6 from pathcount to global, added d6 back as a quorum device, and removed d4 from quorum control. In a two-node cluster we cannot directly remove d6, the single quorum device, because cluster quorum would be compromised, unless we put the cluster into install mode, which is what I did in the second log.

In both of the above logs/procedures you will notice that I am trying to add a disk that has a SCSI-3 reservation. Both times, it was while adding the SCSI-3 reservation disk to quorum control that my cluster hung; even a simple "clq status" hung, and I then had to reboot the cluster. As far as I know, these two procedures are the only ways to implement SCSI-3 reservations on two-path devices.

My cluster is 3.2. Kindly let me know whether there is any way to implement SCSI-3 reservations on two-path devices, including the quorum device.

What make/model of iSCSI SAN box is it?

Adding a Quorum Device (Sun Cluster System Administration Guide for Solaris OS)

I guess it depends on whether Oracle supports your iSCSI SAN.

Thanks hicksd8 for the link.

Now I have also been thinking that my iSCSI setup in the cluster does not support SCSI-3 reservations.
As you will notice in my initial logs, I am able to add another LUN as a quorum device in the two-node cluster, meaning a LUN based on SCSI-2 reservations.

Currently I am using the iscsitadm command to create LUNs on a Solaris machine and mapping those LUNs to the two Solaris nodes.
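For reference, the target side was set up along these lines (the size and address below are illustrative placeholders, not my exact values):

# on the storage server
iscsitadm create target --size 2g clusterdisk1        # creates an iSCSI target with a 2 GB backing LUN
iscsitadm list target -v                              # note the target IQN for the initiators

# on each cluster node
iscsiadm modify discovery --sendtargets enable
iscsiadm add discovery-address 192.168.11.10:3260     # storage server IP (placeholder)
devfsadm -i iscsi                                     # build the device nodes, then run cldev populate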

Kindly share your further views on my current setup.

It depends on whether your SAN controller supports SCSI-3 reservations, irrespective of whether you have SCSI-3 disks in the array. There may be a manufacturer's firmware upgrade.

You didn't answer my question as to what make/model the SAN is.

(Do you have two userid's on this forum? I started a dialogue with OP 'sb200' but now being answered by 'amity'.)

Thanks for the reply, hicksd8.

In my setup, I am using a T5240 server as the storage (creating LUNs with the iscsitadm command) and T2000 servers as the two nodes.

For your information, the amity user is used by my brother; we are both working on the same issue.

A little research reveals that Oracle (Sun) partners with COMSTAR for the iSCSI connectivity, and this page from 2013 says that (at that time) SCSI-3 PGR was not supported.

How to Configure ISCSI target in Solaris 11 ? - UnixArena

However, there are also a number of comments on the web saying that COMSTAR was about to implement PGR in an update, so this is probably available now. Perhaps you might try to download it.
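If you do move to a release with COMSTAR, the target-side setup is roughly as follows (an untested sketch with placeholder names; COMSTAR's STMF layer is what would provide the PGR support):

# on the storage server
svcadm enable -r svc:/network/iscsi/target:default    # COMSTAR iSCSI target service
zfs create -V 2g rpool/quorumlun                      # ZFS volume as the backing store
stmfadm create-lu /dev/zvol/rdsk/rpool/quorumlun      # prints the new LU name (600144F0...)
stmfadm add-view 600144F0xxxxxxxxxxxxxxxxxxxxxxxx     # export the LU (use the name from the previous step)
itadm create-target                                   # creates a target with an auto-generated IQN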


hicksd8,

Many thanks for your views and help on this issue.