VxVM: is there anything to worry about in here?

DEVICE TYPE DISK GROUP STATUS
c0t0d0s2 sliced rootdisk rootdg online
c1t1d0s2 sliced disk01 rootdg online
c2t0d0s2 sliced actsvr101 actsvr1dg online
c2t2d0s2 sliced actsvr102 actsvr1dg online
c2t3d0s2 sliced actsvr103 actsvr1dg online failing
c2t6d0s2 sliced actsvr107 actsvr1dg online spare
c2t16d0s2 sliced - - online
c2t19d0s2 sliced - - online
c2t20d0s2 sliced - - online
c2t22d0s2 sliced - - online
c3t32d0s2 sliced actsvr104 actsvr1dg online
c3t33d0s2 sliced - - error
c3t34d0s2 sliced actsvr105 actsvr1dg online
c3t35d0s2 sliced actsvr106 actsvr1dg online
c3t38d0s2 sliced - - online
c3t48d0s2 sliced - - online
c3t51d0s2 sliced - - online
c3t54d0s2 sliced - - online

AVAILABLE DISK SELECTIONS:
0. c0t0d0 <SUN4.2G cyl 3880 alt 2 hd 16 sec 135>
/sbus@3,0/SUNW,fas@3,8800000/sd@0,0
1. c1t1d0 <SUN4.2G cyl 3880 alt 2 hd 16 sec 135>
/sbus@2,0/SUNW,fas@2,8800000/sd@1,0
2. c2t0d0 <SUN9.0G cyl 4924 alt 2 hd 27 sec 133>
/sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037162f1a,0
3. c2t2d0 <SUN9.0G cyl 4924 alt 2 hd 27 sec 133>
/sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037162e14,0
4. c2t3d0 <SUN9.0G cyl 4924 alt 2 hd 27 sec 133>
/sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w21000020371661db,0
5. c2t6d0 <SUN9.0G cyl 4924 alt 2 hd 27 sec 133>
/sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037169bb2,0
6. c2t16d0 <SUN9.0G cyl 4924 alt 2 hd 27 sec 133>
/sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037169dd4,0
7. c2t19d0 <SUN9.0G cyl 4924 alt 2 hd 27 sec 133>
/sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037169058,0
8. c2t20d0 <SUN9.0G cyl 4924 alt 2 hd 27 sec 133>
/sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w21000020370a2c20,0
9. c2t22d0 <SUN9.0G cyl 4924 alt 2 hd 27 sec 133>
/sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w21000020371699df,0
10. c3t32d0 <SUN9.0G cyl 4924 alt 2 hd 27 sec 133>
/sbus@2,0/SUNW,socal@d,10000/sf@1,0/ssd@w210000203713fd73,0
11. c3t33d0 <SUN9.0G cyl 4924 alt 2 hd 27 sec 133>
/sbus@2,0/SUNW,socal@d,10000/sf@1,0/ssd@w210000203709c481,0
12. c3t34d0 <SUN9.0G cyl 4924 alt 2 hd 27 sec 133>
/sbus@2,0/SUNW,socal@d,10000/sf@1,0/ssd@w2100002037162a85,0
13. c3t35d0 <SUN9.0G cyl 4924 alt 2 hd 27 sec 133>
/sbus@2,0/SUNW,socal@d,10000/sf@1,0/ssd@w210000203716fd9e,0
14. c3t38d0 <SUN9.0G cyl 4924 alt 2 hd 27 sec 133>
/sbus@2,0/SUNW,socal@d,10000/sf@1,0/ssd@w2100002037169979,0
15. c3t48d0 <SUN9.0G cyl 4924 alt 2 hd 27 sec 133>
/sbus@2,0/SUNW,socal@d,10000/sf@1,0/ssd@w2100002037143dfd,0
16. c3t51d0 <SUN9.0G cyl 4924 alt 2 hd 27 sec 133>
/sbus@2,0/SUNW,socal@d,10000/sf@1,0/ssd@w210000203716982d,0
17. c3t54d0 <SUN9.0G cyl 4924 alt 2 hd 27 sec 133>
/sbus@2,0/SUNW,socal@d,10000/sf@1,0/ssd@w2100002037169169,0

<output truncated>
Dec 12 17:12:46 actsvr1 unix:
Dec 12 17:12:46 actsvr1 unix: WARNING: /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037162f1a,0 (ssd4):
Dec 12 17:12:46 actsvr1 unix: SCSI transport failed: reason 'tran_err': retrying command
Dec 12 17:12:46 actsvr1 unix:
Dec 12 17:12:46 actsvr1 unix: WARNING: /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037162e14,0 (ssd2):
Dec 12 17:12:46 actsvr1 unix: SCSI transport failed: reason 'tran_err': retrying command
Dec 12 17:12:46 actsvr1 unix:
Dec 12 17:12:46 actsvr1 unix: sf0: Target 0x0 Reset failed.Abort Failed
Dec 12 17:12:46 actsvr1 unix: WARNING: /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037162f1a,0 (ssd4):
Dec 12 17:12:46 actsvr1 unix: SCSI transport failed: reason 'timeout': retrying command
Dec 12 17:12:46 actsvr1 unix:
Dec 12 17:12:48 actsvr1 unix: ID[SUNWssa.socal.link.6010] socal0: port 0: Fibre Channel Loop is ONLINE
Dec 12 17:23:06 actsvr1 unix: WARNING: /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w21000020371661db,0 (ssd1):
Dec 12 17:23:06 actsvr1 unix: SCSI transport failed: reason 'timeout': retrying command
Dec 12 17:23:06 actsvr1 unix:
Dec 12 17:24:41 actsvr1 unix: WARNING: /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w21000020371661db,0 (ssd1):
Dec 12 17:24:41 actsvr1 unix: SCSI transport failed: reason 'timeout': retrying command
Dec 12 17:24:41 actsvr1 unix:
Dec 12 17:25:19 actsvr1 unix: ID[SUNWssa.socal.link.5010] socal0: port 0: Fibre Channel is OFFLINE
Dec 12 17:25:19 actsvr1 unix: WARNING: /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037162f1a,0 (ssd4):
Dec 12 17:25:19 actsvr1 unix: SCSI transport failed: reason 'tran_err': retrying command
Dec 12 17:25:19 actsvr1 unix:
Dec 12 17:25:19 actsvr1 unix: ID[SUNWssa.socal.link.6010] socal0: port 0: Fibre Channel Loop is ONLINE
Dec 12 17:25:20 actsvr1 unix: sf0: ELS 0x0 to target 0x1d retrying
Dec 12 17:25:28 actsvr1 sm_egd[479]: Got an error: <AIL error in /opt/SUNWsymon/sbin/sm_egd: Connection failed for ConfigReader (RPC=1073879826) on actsvr1 (130.1.11.1)>
Dec 12 17:25:29 actsvr1 sm_configd[736]: unidentified board type 1
Dec 12 17:26:26 actsvr1 unix: WARNING: /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w21000020371661db,0 (ssd1):
Dec 12 17:26:26 actsvr1 unix: SCSI transport failed: reason 'timeout': retrying command
Dec 12 17:26:26 actsvr1 unix:
Dec 12 17:31:26 actsvr1 unix: WARNING: /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037162e14,0 (ssd2):
Dec 12 17:31:26 actsvr1 unix: SCSI transport failed: reason 'timeout': retrying command
Dec 12 17:31:26 actsvr1 unix:
Dec 12 17:32:36 actsvr1 unix: WARNING: /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037162f1a,0 (ssd4):
Dec 12 17:32:36 actsvr1 unix: SCSI transport failed: reason 'timeout': retrying command
Dec 12 17:32:36 actsvr1 unix:
Dec 12 17:36:03 actsvr1 unix: ID[SUNWssa.socal.link.5010] socal0: port 0: Fibre Channel is OFFLINE
Dec 12 17:36:03 actsvr1 unix: ID[SUNWssa.socal.link.6010] socal0: port 0: Fibre Channel Loop is ONLINE
Dec 12 17:36:04 actsvr1 unix: sf0: ELS 0x0 to target 0x1d retrying
Dec 12 18:10:07 actsvr1 unix: WARNING: /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w21000020371661db,0 (ssd1):
Dec 12 18:10:07 actsvr1 unix: Error for Command: write(10) Error Level: Retryable
Dec 12 18:10:07 actsvr1 unix: Requested Block: 6029596 Error Block: 6029596
Dec 12 18:10:07 actsvr1 unix: Vendor: SEAGATE Serial Number: 9848X66445
Dec 12 18:10:07 actsvr1 unix: Sense Key: Aborted Command
Dec 12 18:10:07 actsvr1 unix: ASC: 0x47 (scsi parity error), ASCQ: 0x0, FRU: 0x3
Dec 12 18:11:21 actsvr1 unix: WARNING: /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037162e14,0 (ssd2):
Dec 12 18:11:21 actsvr1 unix: SCSI transport failed: reason 'timeout': retrying command
Dec 12 18:11:21 actsvr1 unix:
Dec 12 19:11:24 actsvr1 unix: WARNING: /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w21000020371661db,0 (ssd1):
Dec 12 19:11:24 actsvr1 unix: SCSI transport failed: reason 'timeout': retrying command
Dec 12 19:11:24 actsvr1 unix:
Dec 12 19:12:24 actsvr1 unix: WARNING: /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w21000020371661db,0 (ssd1):
Dec 12 19:12:24 actsvr1 unix: SCSI transport failed: reason 'timeout': retrying command
Dec 12 19:12:24 actsvr1 unix:
Dec 12 19:12:24 actsvr1 unix: WARNING: /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037162e14,0 (ssd2):
Dec 12 19:12:24 actsvr1 unix: Error for Command: write(10) Error Level: Retryable
Dec 12 19:12:24 actsvr1 unix: Requested Block: 410999 Error Block: 410999
Dec 12 19:12:24 actsvr1 unix: Vendor: SEAGATE Serial Number: 9844X51203
Dec 12 19:12:24 actsvr1 unix: Sense Key: Aborted Command
Dec 12 19:12:24 actsvr1 unix: ASC: 0x47 (scsi parity error), ASCQ: 0x0, FRU: 0x3
Dec 12 19:14:59 actsvr1 unix: WARNING: /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037162f1a,0 (ssd4):
Dec 12 19:14:59 actsvr1 unix: SCSI transport failed: reason 'timeout': retrying command
Dec 12 19:14:59 actsvr1 unix:
Dec 12 19:28:39 actsvr1 unix: WARNING: /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w21000020371661db,0 (ssd1):
Dec 12 19:28:39 actsvr1 unix: SCSI transport failed: reason 'timeout': retrying command
Dec 12 19:28:39 actsvr1 unix:
Dec 12 20:10:03 actsvr1 unix: WARNING: /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037162f1a,0 (ssd4):
Dec 12 20:10:04 actsvr1 unix: Error for Command: write(10) Error Level: Retryable
Dec 12 20:10:04 actsvr1 unix: Requested Block: 2154730 Error Block: 2154730
Dec 12 20:10:04 actsvr1 unix: Vendor: SEAGATE Serial Number: 9844X44338
Dec 12 20:10:04 actsvr1 unix: Sense Key: Aborted Command
Dec 12 20:10:04 actsvr1 unix: ASC: 0x47 (scsi parity error), ASCQ: 0x0, FRU: 0x3
Dec 12 20:10:04 actsvr1 unix: WARNING: /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037162e14,0 (ssd2):
Dec 12 20:10:04 actsvr1 unix: Error for Command: write(10) Error Level: Retryable
Dec 12 20:10:04 actsvr1 unix: Requested Block: 2154714 Error Block: 2154714
Dec 12 20:10:04 actsvr1 unix: Vendor: SEAGATE Serial Number: 9844X51203
Dec 12 20:10:04 actsvr1 unix: Sense Key: Aborted Command
Dec 12 20:10:04 actsvr1 unix: ASC: 0x47 (scsi parity error), ASCQ: 0x0, FRU: 0x3
Dec 12 20:10:05 actsvr1 unix: WARNING: /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037169bb2,0 (ssd5):
Dec 12 20:10:05 actsvr1 unix: Error for Command: write(10) Error Level: Retryable
Dec 12 20:10:05 actsvr1 unix: Requested Block: 13239 Error Block: 13191
Dec 12 20:10:05 actsvr1 unix: Vendor: SEAGATE Serial Number: 9848X81990
Dec 12 20:10:05 actsvr1 unix: Sense Key: Aborted Command
Dec 12 20:10:05 actsvr1 unix: ASC: 0x47 (scsi parity error), ASCQ: 0x0, FRU: 0x3
Dec 12 22:56:41 actsvr1 unix: WARNING: /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037162f1a,0 (ssd4):
Dec 12 22:56:41 actsvr1 unix: SCSI transport failed: reason 'timeout': retrying command
Dec 12 22:56:41 actsvr1 unix:
Dec 13 00:05:02 actsvr1 sm_configd[1066]: unidentified board type 1
Dec 13 00:10:03 actsvr1 unix: WARNING: /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037169bb2,0 (ssd5):
Dec 13 00:10:03 actsvr1 unix: Error for Command: write(10) Error Level: Retryable
Dec 13 00:10:03 actsvr1 unix: Requested Block: 7399 Error Block: 7399
Dec 13 00:10:03 actsvr1 unix: Vendor: SEAGATE Serial Number: 9848X81990
Dec 13 00:10:03 actsvr1 unix: Sense Key: Aborted Command
Dec 13 00:10:03 actsvr1 unix: ASC: 0x47 (scsi parity error), ASCQ: 0x0, FRU: 0x3
Dec 13 00:10:04 actsvr1 unix: WARNING: /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037162e14,0 (ssd2):
Dec 13 00:10:04 actsvr1 unix: Error for Command: write(10) Error Level: Retryable
Dec 13 00:10:04 actsvr1 unix: Requested Block: 2205162 Error Block: 2205162
Dec 13 00:10:04 actsvr1 unix: Vendor: SEAGATE Serial Number: 9844X51203
Dec 13 00:10:04 actsvr1 unix: Sense Key: Aborted Command
Dec 13 00:10:04 actsvr1 unix: ASC: 0x47 (scsi parity error), ASCQ: 0x0, FRU: 0x3
Dec 13 00:10:12 actsvr1 unix: WARNING: /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037162e14,0 (ssd2):
Dec 13 00:10:12 actsvr1 unix: Error for Command: write(10) Error Level: Retryable
Dec 13 00:10:12 actsvr1 unix: Requested Block: 2205466 Error Block: 2205466
Dec 13 00:10:12 actsvr1 unix: Vendor: SEAGATE Serial Number: 9844X51203
Dec 13 00:10:12 actsvr1 unix: Sense Key: Aborted Command
Dec 13 00:10:12 actsvr1 unix: ASC: 0x47 (scsi parity error), ASCQ: 0x0, FRU: 0x3
Dec 13 02:10:02 actsvr1 unix: WARNING: /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037162f1a,0 (ssd4):
Dec 13 02:10:02 actsvr1 unix: Error for Command: write(10) Error Level: Retryable
Dec 13 02:10:02 actsvr1 unix: Requested Block: 2151514 Error Block: 2151514
Dec 13 02:10:02 actsvr1 unix: Vendor: SEAGATE Serial Number: 9844X44338
Dec 13 02:10:02 actsvr1 unix: Sense Key: Aborted Command
Dec 13 02:10:02 actsvr1 unix: ASC: 0x47 (scsi parity error), ASCQ: 0x0, FRU: 0x3
Dec 13 02:10:03 actsvr1 unix: WARNING: /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037169bb2,0 (ssd5):
Dec 13 02:10:03 actsvr1 unix: Error for Command: write(10) Error Level: Retryable
Dec 13 02:10:03 actsvr1 unix: Requested Block: 7815 Error Block: 7815
Dec 13 02:10:03 actsvr1 unix: Vendor: SEAGATE Serial Number: 9848X81990
Dec 13 02:10:03 actsvr1 unix: Sense Key: Aborted Command
Dec 13 02:10:03 actsvr1 unix: ASC: 0x47 (scsi parity error), ASCQ: 0x0, FRU: 0x3
Dec 13 03:43:21 actsvr1 unix: WARNING: /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037162f1a,0 (ssd4):
Dec 13 03:43:21 actsvr1 unix: Error for Command: write(10) Error Level: Retryable
Dec 13 03:43:22 actsvr1 unix: Requested Block: 741395 Error Block: 741395
Dec 13 03:43:22 actsvr1 unix: Vendor: SEAGATE Serial Number: 9844X44338
Dec 13 03:43:22 actsvr1 unix: Sense Key: Aborted Command
Dec 13 03:43:22 actsvr1 unix: ASC: 0x47 (scsi parity error), ASCQ: 0x0, FRU: 0x3

From the vxdisk list it looks like actsvr103 is failing, but it is hard to tell from this whether the cause is the fibre loop or the disk itself. Given that it appears to be affecting multiple disks, I would lean towards suspecting the fibre loop SBus controller or the array controller (if you have an array configured as JBOD), or possibly just a failing port/transceiver, since all the affected disks appear to be on one fibre transceiver.
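To put numbers on the "multiple disks, one transceiver" observation, the kernel warnings can be tallied per disk instance and per sf (loop) instance. This is only a rough sketch: the regular expression assumes the exact WARNING line layout seen in the log above, and the function name and sample list are made up for illustration.

```python
import re
from collections import Counter

# Matches the device path in a kernel WARNING line such as
# ".../sf@0,0/ssd@w2100002037162f1a,0 (ssd4):" (layout assumed from the log above).
WARNING_RE = re.compile(
    r"WARNING: (\S*/(sf@\d+,\d+)/ssd@w([0-9a-f]+),\d+) \((ssd\d+)\)"
)

# A few lines copied from the log above, used as sample input.
SAMPLE = [
    "Dec 12 17:12:46 actsvr1 unix: WARNING: /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037162f1a,0 (ssd4):",
    "Dec 12 17:12:46 actsvr1 unix: SCSI transport failed: reason 'tran_err': retrying command",
    "Dec 12 17:12:46 actsvr1 unix: WARNING: /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037162e14,0 (ssd2):",
    "Dec 12 17:12:46 actsvr1 unix: WARNING: /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037162f1a,0 (ssd4):",
    "Dec 12 17:23:06 actsvr1 unix: WARNING: /sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w21000020371661db,0 (ssd1):",
]


def tally_warnings(lines):
    """Count WARNING lines per (instance, WWN) and per sf loop instance."""
    per_disk = Counter()
    per_loop = Counter()
    for line in lines:
        m = WARNING_RE.search(line)
        if m:
            _path, loop, wwn, inst = m.groups()
            per_disk[(inst, wwn)] += 1
            per_loop[loop] += 1
    return per_disk, per_loop


if __name__ == "__main__":
    per_disk, per_loop = tally_warnings(SAMPLE)
    for (inst, wwn), count in per_disk.most_common():
        print(f"{inst} (WWN {wwn}): {count} warning(s)")
    for loop, count in per_loop.most_common():
        print(f"{loop}: {count} warning(s)")
```

In the log excerpt shown above, every WARNING line is under sf@0,0, which is what points at the loop or its transceiver rather than at a single disk. Feeding the full messages file through the same function would show whether any disk on sf@1,0 is affected as well.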

How can we determine what the problem is?
I have the explorer output from the system, so I can provide whatever information you need. Kindly help me out. I'm stuck.

What can you do in terms of availability? Is it ok to bring the machine down or is it in production? I don't want to suggest anything that will cause you "manager problems".

Also, what is the exact setup, and which parts of the setup do you (currently) have replacement parts for? It looks like an old Enterprise machine with A1000/A5000 series arrays.

Looks like an A5200 array. There are 16 x 9 GB disks and two controllers, so it looks like a split-bus setup. It's also rather old, because the machine has socal and not fcal adapters... On the other hand, it could be an A1000 with mapped volumes... the target numbers may be an indicator of volumes from a hardware RAID system...

I agree. Your analysis is pretty much the same as mine.
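To cross-check this, the device paths in the log can be mapped back to the cXtYdZ names from format and then to the VxVM disk names from vxdisk list. A minimal sketch, assuming the two outputs look exactly like the excerpts posted above; the function names and sample lists are hypothetical.

```python
import re

# A format(1M) selection line: "4. c2t3d0 <SUN9.0G cyl ...>"
FORMAT_ENTRY_RE = re.compile(r"^\s*\d+\.\s+(c\d+t\d+d\d+)\s+<")

# Sample input copied from the outputs posted above.
FORMAT_SAMPLE = [
    "4. c2t3d0 <SUN9.0G cyl 4924 alt 2 hd 27 sec 133>",
    "/sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w21000020371661db,0",
    "5. c2t6d0 <SUN9.0G cyl 4924 alt 2 hd 27 sec 133>",
    "/sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@w2100002037169bb2,0",
]
VXDISK_SAMPLE = [
    "DEVICE TYPE DISK GROUP STATUS",
    "c2t3d0s2 sliced actsvr103 actsvr1dg online failing",
    "c2t6d0s2 sliced actsvr107 actsvr1dg online spare",
]


def parse_format(lines):
    """Map physical device path -> cXtYdZ name."""
    mapping = {}
    current = None
    for line in lines:
        m = FORMAT_ENTRY_RE.match(line)
        if m:
            current = m.group(1)
        elif current and line.strip().startswith("/"):
            # The path line follows directly after its selection line.
            mapping[line.strip()] = current
            current = None
    return mapping


def parse_vxdisk(lines):
    """Map cXtYdZ -> (disk name, disk group, status)."""
    disks = {}
    for line in lines:
        fields = line.split()
        if len(fields) >= 5 and re.fullmatch(r"c\d+t\d+d\d+s\d+", fields[0]):
            ctd = fields[0].rsplit("s", 1)[0]  # drop the slice suffix, e.g. "s2"
            disks[ctd] = (fields[2], fields[3], " ".join(fields[4:]))
    return disks


def lookup(path, format_map, vx_map):
    """Return (cXtYdZ, VxVM info) for a device path, or None where unknown."""
    ctd = format_map.get(path)
    return ctd, vx_map.get(ctd)


if __name__ == "__main__":
    fmt = parse_format(FORMAT_SAMPLE)
    vx = parse_vxdisk(VXDISK_SAMPLE)
    for path in fmt:
        print(path, "->", lookup(path, fmt, vx))
```

With the outputs posted above, the path logged as ssd1 resolves to c2t3d0, which is the disk VxVM lists as actsvr103 with status "online failing".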

It's a production system, connected to a D1000 array.
No worries about the availability of parts. We want a resolution so that we can follow up on it as soon as possible.
If downtime is required, let me know along with the action plan so that we can proceed.

Well, a non-diagnostic solution would be to replace the affected socal card and the controller in the D1000 to which it connects. The diagnostic option would be to replace one at a time and see which one actually causes the problem. If you have the capacity, you could add another array and see if you can get the data mirrored across the two arrays, but that will depend on how much data you can transfer before Veritas decides the disks are bad because of errors.

I should, however, point out that there is a less likely cause for this problem, which is Veritas software corruption. That scenario has happened to me only once in the past 10 years, and it also resulted in disks getting marked as failing. In every other case it has been hardware.

Hi reborg, what did you mean by socal card?

socal was before fcal hbas...

Ok thanks. Will make the necessary arrangements for that.

Well just to be pedantic, technically SOCAL is just a variant of FCAL as opposed to SOC which is not.

For reference should anyone be looking for the information later.
SOCAL - Serial Optical Controller for Fibre Channel Arbitrated Loop

I'm not sure if Sun still has that card available. I need to check.

I found this Sun developers link useful:

Device Mapping on Sun Servers: A Quick Guide