Failed MPIO path on AIX 5.3

I found a failed MPIO path on AIX and re-enabled it as shown below.

Failed path on AIX:
bash-3.2# lspath -l hdisk10
Enabled hdisk10 fscsi0
Enabled hdisk10 fscsi0
Failed  hdisk10 fscsi3
Enabled hdisk10 fscsi3

I enabled the failed path as follows:

chpath -l hdisk10 -p fscsi3 -s enable -w5005076802408847,8000000000000

The path goes to Enabled, but a few minutes later it shows Failed again. Could someone help me with this?
Please let me know if I missed anything in the steps above, and why the path has failed.

Hi,

It looks like that specific path is no longer available.
Maybe your storage admin has removed the mapping for this volume, or the zoning for this path has been removed.

The output of lspath and lscfg -vl hdisk10 would be helpful.

Regards

I ran the lscfg command and got the output below:


 hdisk10          U78AA.001.WZSHBCV-P1-C6-T2-W5005076802408847-L8000000000000  MPIO IBM 2076 FC Disk

        Manufacturer................IBM
        Machine Type and Model......2145
        ROS Level and ID............30303030
        Serial Number...............2076
        Device Specific.(Z0)........0000063268181002
        Device Specific.(Z1)........
        Device Specific.(Z2)........
        Device Specific.(Z3)........


Is there any way to get information about failed zoning at the AIX OS level?

Looking at the error, it seems to be a SAN problem (as Xray said).
Also, I see you have a physical card connected to the system, so NPIV and VSCSI are ruled out, and there is no need to look at the VIO server.

Run the command below:
lspath -l hdisk10 -s available -F"connection:parent:path_status:status"

This will show exactly which connection on the storage side (fibre adapter, FA) has failed.
Your fscsi3 port has one connection to the fabric, and the fabric has two connections to the storage, so one of the fabric-to-storage connections is down.
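
To pick the failed connection out of that output without reading it by eye, here is a minimal sketch. The function name failed_paths is my own (hypothetical), and it assumes the four colon-separated fields arrive in exactly the -F order used in the command above.

```shell
# Hypothetical helper: report every path whose status field is Failed.
# Expects lines of the form connection:parent:path_status:status on stdin,
# where connection is "<target WWPN>,<lun_id>".
failed_paths() {
    awk -F: '$4 == "Failed" {
        split($1, conn, ",")          # conn[1] = target WWPN, conn[2] = lun_id
        print $2 ": target WWPN " conn[1]
    }'
}
```

It would be used by piping the lspath output into it:
lspath -l hdisk10 -s available -F"connection:parent:path_status:status" | failed_paths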

Also, what storage do you have?

Send the output of
lsdev -Ccdisk

Hope this helps!

We are using IBM v7000 storage.

I am unable to find the host mappings below on the storage end:

5005076802308847,8000000000000:fscsi0:Available:Enabled
5005076802308846,8000000000000:fscsi0:Available:Enabled
5005076802408847,8000000000000:fscsi3:Available:Failed
5005076802408846,8000000000000:fscsi3:Available:Enabled

Please help me with any other way to find this.

---------- Post updated at 08:47 AM ---------- Previous update was at 08:26 AM ----------


The lsdev -Ccdisk output is below:

hdisk0  Available 00-08-00 SAS Disk Drive
hdisk1  Available 00-08-00 SAS Disk Drive
hdisk2  Available 05-01-01 MPIO IBM 2076 FC Disk
hdisk3  Available 05-01-01 MPIO IBM 2076 FC Disk
hdisk4  Available 05-01-01 MPIO IBM 2076 FC Disk
hdisk5  Available 05-01-01 MPIO IBM 2076 FC Disk
hdisk6  Available 05-01-01 MPIO IBM 2076 FC Disk
hdisk7  Available 05-01-01 MPIO IBM 2076 FC Disk
hdisk8  Available 05-01-01 MPIO IBM 2076 FC Disk
hdisk9  Available 05-01-01 MPIO IBM 2076 FC Disk
hdisk10 Available 05-01-01 MPIO IBM 2076 FC Disk

The mapping normally points to the WWPN of your FC adapter (the Fibre Channel counterpart of a MAC address).

Use the following command to get the WWPN:
lscfg -vl fcs3 | grep -i "Network Address"

Please find the Network Address below:
Network Address.............10000000C8D6A602

I'm not familiar with IBM v7000 storage.

Please ask your storage admin whether all mappings for this WWPN and for hdisk10 (LUN ID 8) are present.

LUN ID (in hex):
lsattr -El hdisk10 | grep -i "lun_id"
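
The lun_id value printed there is the full 8-byte LUN field, not the plain LUN number. As a rough sketch of how 0x8000000000000 corresponds to LUN ID 8: the function name lun_number is hypothetical, and it assumes single-level LUN addressing plus a shell with 64-bit arithmetic and 0x literals (such as the bash-3.2 seen in the prompts in this thread).

```shell
# Hypothetical helper: extract the LUN number from an AIX lun_id value.
# Assumes single-level addressing: the first two bytes of the 8-byte LUN
# field hold the LUN, with the top two bits being the address method.
lun_number() {
    # Shift the first two bytes down, then mask off the address-method bits.
    echo $(( ($1 >> 48) & 0x3FFF ))
}
```

For the disk in this thread, lun_number 0x8000000000000 gives the LUN number the storage admin would look for in the host mapping.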

Mapping is done at the host level (on the v7000). If one WWN were missing, the effect would be global and NOT limited to a particular LUN.

@Murali969, the failed path is on the storage side. Can you cross-verify that this path is up on all the other hdisks? Is it showing Failed for the target below on the other hdisks too?

5005076802408847

If that's the case, it could be that they zoned only three of the target ports and missed this one (50050XXX).

Ask the storage team to look into the fabric again.
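
For the cross-check across all hdisks, something like the sketch below could list every path that goes to one storage port. The function name paths_to_target is hypothetical, and it assumes the input was produced with the -F field order name:connection:parent:status.

```shell
# Hypothetical helper: list disk, parent adapter and status for every path
# whose connection starts with the given target WWPN.
# $1: target WWPN; stdin: lines of the form name:connection:parent:status
paths_to_target() {
    awk -F: -v wwpn="$1" 'index($2, wwpn ",") == 1 { print $1, $3, $4 }'
}
```

It would be fed with the path list for all disks:
lspath -F"name:connection:parent:status" | paths_to_target 5005076802408847

If only hdisk10 shows Failed for that target, the problem is more likely specific to that volume than to the zoning of the port.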

Can you paste the output of
fcstat fcs3 | grep -p "FC SCSI Traffic Statistics"

Also, send me the errpt -a output for that failed path if it was logged.

All other hdisks are fine.

The fcstat output is below:
FC SCSI Traffic Statistics
  Input Requests:   1182
  Output Requests:  276
  Control Requests: 2762594
  Input Bytes:  264308
  Output Bytes: 6624


The errpt output is below:

LABEL:          SC_DISK_PCM_ERR8
IDENTIFIER:     02A8BC99

Date/Time:       Fri Nov 15 23:55:06 2013
Sequence Number: 415724
Machine Id:      00F973B152C00
Node Id:         dr-asm-01
Class:           H
Type:            PERM
WPAR:            Global
Resource Name:   hdisk10
Resource Class:  disk
Resource Type:   mpioosdisk
Location:        U78AA.001.WZSHBCV-P1-C6-T2-W5005076802408847-L8000000000000

VPD:
        Manufacturer................IBM
        Machine Type and Model......2145
        ROS Level and ID............30303030
        Serial Number...............2076
        Device Specific.(Z0)........0000063268181002
        Device Specific.(Z1)........
        Device Specific.(Z2)........
        Device Specific.(Z3)........

Description
PATH HAS FAILED

Probable Causes
ADAPTER HARDWARE OR CABLE
DASD DEVICE

Failure Causes
UNDETERMINED

        Recommended Actions
        PERFORM PROBLEM DETERMINATION PROCEDURES
        CHECK PATH

Detail Data
PATH ID
           2
SENSE DATA
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0008


Hmmm,
So, we know the problem is not with your fcs3, and looking at the output of your lsdev -Ccdisk, I assume you don't have MPIO SDDPCM (pcmpath) installed on your system.

With v7000 storage, IBM's MPIO SDDPCM is free.

Can you paste the output of

lsattr -El hdisk10 -a "algorithm hcheck_interval reserve_policy"
lsattr -El fscsi3 -a "dyntrk fc_err_recov"

Do this ONLY if the disk is free.
Is the disk in use? Check with lspv | grep hdisk10. If it is not in use, try deleting the disk and reconfiguring it:

rmdev -Rdl hdisk10
cfgmgr
lspath -l hdisk10 -s available -F"connection:parent:path_status:status"
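
As a small safety check before the rmdev, here is a sketch that reads the volume group column from lspv output. The function name disk_vg is hypothetical, and it assumes the usual lspv column layout (disk name, PVID, volume group, state). Note that None only means the disk is not in a volume group; a disk used raw, for example by a database, also shows None and is still in use.

```shell
# Hypothetical helper: print the volume group column for one disk.
# $1: disk name; stdin: lspv output (name, PVID, volume group, state)
disk_vg() {
    # Exact match on the first column, so hdisk1 does not also match hdisk10.
    awk -v d="$1" '$1 == d { print $3 }'
}
```

Usage would be:
lspv | disk_vg hdisk10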


Please find the outputs below.

# lsattr -El hdisk10 -a "algorithm hcheck_interval reserve_policy"

algorithm       fail_over   Algorithm             True
hcheck_interval 60          Health Check Interval True
reserve_policy  single_path Reserve Policy        True

bash-3.2# lsattr -El fscsi3 -a "dyntrk fc_err_recov"
dyntrk       yes       Dynamic Tracking of FC Devices        True
fc_err_recov fast_fail FC Fabric Event Error RECOVERY Policy True



rmdev -Rdl hdisk10
cfgmgr
lspath -l hdisk10 -s available -F"connection:parent:path_status:status"

Have you tried this?

Thanks a lot for the above. I am in the process of doing it.