IO queue maxtime over 1 minute

We are experiencing I/O delays of around one minute on an Oracle DB, affecting both read and write operations, even when writing less than 1 kB or reading a single 8 kB block. Below is the output of iostat -D -T 60 1000 for the problematic time periods:

asmdata03       xfer:  %tm_act      bps      tps      bread      bwrtn      time
                         99.3    334.6K     1.4      331.1K       3.5K  20:17:10
                read:      rps  avgserv  minserv  maxserv   timeouts      fails
                          1.3      9.5      1.0     33.0           0          0
               write:      wps  avgserv  minserv  maxserv   timeouts      fails
                          0.2      0.9      0.7      1.3           0          0
               queue:  avgtime  mintime  maxtime  avgwqsz    avgsqsz     sqfull
                          8.9S     0.0      1.1M     0.0       11.0         0.0
asmdata08       xfer:  %tm_act      bps      tps      bread      bwrtn      time
                         99.5    518.4K     2.9      511.2K       7.2K  20:22:10
                read:      rps  avgserv  minserv  maxserv   timeouts      fails
                          1.9     13.7      0.9     49.2           0          0
               write:      wps  avgserv  minserv  maxserv   timeouts      fails
                          0.9      0.7      0.9      1.6           0          0
               queue:  avgtime  mintime  maxtime  avgwqsz    avgsqsz     sqfull
                          6.0S     0.0      1.1M     0.0       15.0         0.1
asmdata17       xfer:  %tm_act      bps      tps      bread      bwrtn      time
                         86.5      4.7M    18.7        4.7M       2.0K  22:25:10
                read:      rps  avgserv  minserv  maxserv   timeouts      fails
                         18.1     15.2      0.1     69.9           0          0
               write:      wps  avgserv  minserv  maxserv   timeouts      fails
                          0.6      1.3      0.7      8.4           0          0
               queue:  avgtime  mintime  maxtime  avgwqsz    avgsqsz     sqfull
                        696.0      0.0      1.1M     0.0       10.0         0.0
asmdata10       xfer:  %tm_act      bps      tps      bread      bwrtn      time
                         84.2     29.4M   113.1       29.4M     546.1   22:35:10
                read:      rps  avgserv  minserv  maxserv   timeouts      fails
                        113.0      1.5      0.1      7.7           0          0
               write:      wps  avgserv  minserv  maxserv   timeouts      fails
                          0.1      0.9      0.0      0.0           0          0
               queue:  avgtime  mintime  maxtime  avgwqsz    avgsqsz     sqfull
                         10.6      0.0      1.1M     0.0        1.0         0.0
asmdata15       xfer:  %tm_act      bps      tps      bread      bwrtn      time
                         92.2      6.4M    31.2        6.4M     136.5   22:40:10
                read:      rps  avgserv  minserv  maxserv   timeouts      fails
                         31.2     16.4      0.2     96.8           0          0
               write:      wps  avgserv  minserv  maxserv   timeouts      fails
                          0.0      1.5      1.5      1.5           0          0
               queue:  avgtime  mintime  maxtime  avgwqsz    avgsqsz     sqfull
                        530.0      0.0      1.1M     0.0       13.0         0.6
asmdata06       xfer:  %tm_act      bps      tps      bread      bwrtn      time
                         93.0      6.7M    26.4        6.7M       3.7K  22:45:10
                read:      rps  avgserv  minserv  maxserv   timeouts      fails
                         25.6     18.8      0.4     82.0           0          0
               write:      wps  avgserv  minserv  maxserv   timeouts      fails
                          0.8      1.0      0.6      6.6           0          0
               queue:  avgtime  mintime  maxtime  avgwqsz    avgsqsz     sqfull
                        632.2      0.0      1.1M     0.0       13.0         0.5
asmdata05       xfer:  %tm_act      bps      tps      bread      bwrtn      time
                         76.9     20.7M    79.0       20.7M     273.1   06:59:11
                read:      rps  avgserv  minserv  maxserv   timeouts      fails
                         79.0     14.9      0.5     65.9           0          0
               write:      wps  avgserv  minserv  maxserv   timeouts      fails
                          0.0      0.6      0.0      0.0           0          0
               queue:  avgtime  mintime  maxtime  avgwqsz    avgsqsz     sqfull
                        216.5      0.0      1.1M     0.0       11.0         1.5

Does anyone have an idea what could be wrong or what to check further?

thanks a lot

It depends.

What other activity is on the server at the time? Sadly, this could be many things.

A few possibles:-

  • Missing/invalid index causing a full table scan and/or locks
  • If NFS is part of the process, then that is a contender for contention
  • If the queries are remote, there could be DNS issues (i.e. the server cannot resolve the source IP address)
  • Disk access I/O issues (check paths for SCSI or SAN-provided disks; see the sketch after this list)
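
For the last point, a quick path health check along these lines may help (just a sketch; device names will differ, and the attribute check assumes standard AIX FC settings):

# any paths not Enabled to the MPIO disks?
lspath -F "name parent path_id status" | grep -vw Enabled

# FC error-recovery settings; fast_fail plus dyntrk are commonly recommended for SAN disk
for f in $(lsdev -C -F name | grep '^fscsi[0-9]'); do
    echo "== $f =="
    lsattr -El $f -a fc_err_recov -a dyntrk
done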

There will be plenty more, and I've probably not guessed right, but we might be lucky. If I think of any more, I will update the list or post again.

Have you ruled anything out yet?

I hope that these get you started,
Robin

errpt -a -s 0102000018

shows these kinds of errors when the 60-second I/O delays are occurring:

LABEL:          SC_DISK_ERR7
IDENTIFIER:     DE3B8540

Date/Time:       Tue Jan  2 18:11:57 CET 2018
Sequence Number: 18526
Machine Id:      00FB0E544C00
Node Id:         xicdbl01a
Class:           H
Type:            PERM
WPAR:            Global
Resource Name:   asmdata14       
Resource Class:  disk
Resource Type:   2145
Location:        U78C7.001.KIC4358-P1-C9-T1-W500507680C238F26-L17000000000000

VPD:             
        Manufacturer................IBM     
        Machine Type and Model......2145            
        ROS Level and ID............0000
        Device Specific.(Z0)........0000063268181002
        Device Specific.(Z1)........0203202
        Serial Number...............600507680C808470680000000000007D

Description
PATH HAS FAILED

Probable Causes
ADAPTER HARDWARE OR CABLE
DASD DEVICE

Failure Causes
UNDETERMINED

    Recommended Actions
    PERFORM PROBLEM DETERMINATION PROCEDURES
    CHECK PATH

Detail Data
PATH ID
           0
SENSE DATA
0A00 2800 0040 6000 0002 0004 0000 0000 0000 0000 0000 0000 0200 0300 0000 0000 
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 
---------------------------------------------------------------------------
LABEL:          SC_DISK_ERR4
IDENTIFIER:     DCB47997

Date/Time:       Tue Jan  2 18:11:57 CET 2018
Sequence Number: 18529
Machine Id:      00FB0E544C00
Node Id:         xicdbl01a
Class:           H
Type:            TEMP
WPAR:            Global
Resource Name:   asmdata14       
Resource Class:  disk
Resource Type:   2145
Location:        U78C7.001.KIC4358-P1-C9-T1-W500507680C238F26-L17000000000000

VPD:             
        Manufacturer................IBM     
        Machine Type and Model......2145            
        ROS Level and ID............0000
        Device Specific.(Z0)........0000063268181002
        Device Specific.(Z1)........0203202
        Serial Number...............600507680C808470680000000000007D

Description
DISK OPERATION ERROR

Probable Causes
MEDIA
DASD DEVICE

User Causes
MEDIA DEFECTIVE

    Recommended Actions
    FOR REMOVABLE MEDIA, CHANGE MEDIA AND RETRY
    PERFORM PROBLEM DETERMINATION PROCEDURES

Failure Causes
MEDIA
DISK DRIVE

    Recommended Actions
    FOR REMOVABLE MEDIA, CHANGE MEDIA AND RETRY
    PERFORM PROBLEM DETERMINATION PROCEDURES

Detail Data
PATH ID
           0
SENSE DATA
0A00 2800 0040 6000 0002 0004 0000 0000 0000 0000 0000 0000 0200 0300 0000 0000 
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 
0000 0000 0000 06E0 F73A 0008 7D80 0000 0000 0000 0000 0000 0000 0000 0007 0000 
0002 0035 001D 
---------------------------------------------------------------------------
LABEL:          SDDPCM_PATH_RECOVER
IDENTIFIER:     67150733

Date/Time:       Tue Jan  2 18:13:57 CET 2018
Sequence Number: 18530
Machine Id:      00FB0E544C00
Node Id:         xicdbl01a
Class:           O
Type:            INFO
WPAR:            Global
Resource Name:   asmdata14       

Description
PATH HAS RECOVERED

Probable Causes
ADAPTER HARDWARE OR CABLE
DASD DEVICE

Failure Causes
EF00

    Recommended Actions
    PERFORM PROBLEM DETERMINATION PROCEDURES

Detail Data
PATH ID
          18
SENSE DATA
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 8000 0014 0000 
001B 0000 0003 0D96 FB04 0000 001E 0000 0000 0000 0011 0000 1000 0000 0005 0000 
0003 0000 8000 0000 5C30 F100 0A00 278A 4A40 
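
For reference, a per-resource summary along these lines shows how often each hdisk logs these errors (a sketch; it keys on the two disk-error labels above, and the timestamp is the same example as in the errpt call):

errpt -J SC_DISK_ERR7,SC_DISK_ERR4 -s 0102000018 | awk 'NR>1 {print $5}' | sort | uniq -c | sort -rn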

It looks like there are occasional sector read errors (where the retry worked).
If the disk is old, then it is going bad and needs replacement.
If the disk is new, then there might be a bug in the disk firmware (or in the kernel's disk driver).

The disk array is brand new, not yet in production. Couldn't the problem be on the SAN? Any ideas on how to get more diagnostic info?

lspath -F "path_id status path_status parent connection"

output reports failed paths just after the delay has occurred.
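
The timestamped entries below were captured with a polling loop along these lines (a sketch; the 60-second interval is an assumption):

while true; do
    out=$(lspath -F "path_id status path_status parent connection" | grep -w Failed)
    [ -n "$out" ] && { date; echo "$out"; }
    sleep 60
done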

Fri Jan 5 10:32:14 CET 2018
1  Failed    Available fscsi0 500507680c138f26,19000000000000
3  Failed    Available fscsi1 500507680c148f26,19000000000000
Fri Jan 5 11:47:16 CET 2018
8  Failed    Available fscsi2 500507680c238f26,e000000000000
1  Failed    Available fscsi3 500507680c248f26,e000000000000

thx

Not an AIX expert, but you could try to...

  1. Fail over to the working path only (I don't know whether that is possible in AIX, but it probably is; see the chpath sketch after this list).
  2. Change the cable on the faulty path.
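
For point 1 on AIX, chpath can reportedly take a single path offline (a sketch; hdisk14 and fscsi0 are placeholders for your failing disk/adapter pair):

# take the suspect path out of rotation so I/O uses the healthy paths only
chpath -l hdisk14 -p fscsi0 -s disable
# after swapping/inspecting the cable, bring it back
chpath -l hdisk14 -p fscsi0 -s enable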

I've debugged HP-UX boxes with 'bad' cables experiencing exactly the same issue: long I/O waits.

Once I failed over to the healthy path via scsimgr, the machine came back to life.
Initially the boot time was 20-30 minutes; it dropped to normal after the path switch.
This was a vPar on FC disk.

In the end I went to the system room and pulled a 20 m fibre cable across the data center to replace/diagnose the faulty one.
Everything worked well, so the bad cable was properly replaced afterwards.

Probably a factory error or someone damaged the cable during installation.

Of course, also check the storage side and the FC switch port to which the server is connected.
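
On the server side, the link-level counters from fcstat give a quick hint whether a cable or SFP is flaky (a sketch; fcs0 is a placeholder):

# rising link-failure / loss-of-sync / invalid-word counters point at the cable, SFP or switch port
fcstat fcs0 | egrep -i 'link fail|loss of|invalid'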

Hope that helps
Regards
Peasant.

You don't show all the errors. The SC_DISK_ERRs you see are SCSI timeouts. They can be caused by SAN bottlenecks (such as a lack of buffer credits) as well as by AIX settings. The storage device can cause such problems too. The easiest (and, I think, fastest) way to solve the problem is to open a PerfPMR case with IBM. You have an IBM server, an IBM operating system and IBM storage; they're interested in solving the problem as quickly as possible.

If you'd like to analyze the problem further, you must first provide information about the operating system and the storage (a collection sketch follows the list):

  • Which AIX version do you have? oslevel -s
  • Which version of the storage drivers?
  • Settings of the FC adapters? for i in $(lsdev -l 'fcs*' -F name); do echo $i; lsattr -El $i; lscfg -l $i; done
  • Settings of the fscsi devices? lsattr -El fscsiX
  • Output of the fcstat command for each FC HBA
  • Full errpt output, because you provided just a short excerpt
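
Everything on that list can be collected in one go with something like this (a sketch; the sddpcm fileset pattern assumes SDDPCM is your multipath driver, which the SDDPCM_PATH_RECOVER entries suggest):

#!/bin/ksh
# gather the basics for an IBM support case
oslevel -s > os_info.txt                                   # AIX level
lslpp -l 'devices.sddpcm*' 'devices.fcp*' >> os_info.txt   # storage driver fileset versions
for i in $(lsdev -C -F name | grep '^fcs[0-9]'); do
    echo "== $i =="
    lsattr -El $i       # adapter settings
    lscfg -vl $i        # firmware / VPD
    fcstat $i           # link statistics
done > fc_info.txt
for i in $(lsdev -C -F name | grep '^fscsi[0-9]'); do
    echo "== $i =="
    lsattr -El $i       # fscsi settings
done > fscsi_info.txt
errpt -a > errpt_full.txt                                  # full detailed error log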

These are just a few of the questions I can remember after celebrating Russian Christmas. IBM will certainly ask more questions, and will ask you to run some scripts to gather the necessary information.