I have an IBM Power9 server attached to an NVMe-based Storwize V7000 Gen3 storage system. I'm running some benchmarks and noticing that single-thread I/O (80% read / 20% write, a common OLTP I/O profile) seems slow.
./xdisk -R0 -r80 -b 8k -M 1 -f /usr1/testing -t60 -OD -V
BS Proc AIO read% IO Flag IO/s MB/s rMin-ms rMax-ms rAvg-ms WrAvg wMin-ms wMax-ms wAvg-ms WwAvg
8K 1 0 80 R -D 7177 56.1 0.090 2.58 0.118 0.116 0.001 2.97 0.216 0.212
Are there parameters in AIX we can tune to push the IO/s and MB/s higher?
I made sure that the V7000, which presents as an IBMSVC device, is using the recommended AIX_AAPCM driver. I have a 1 TB volume (hdisk2) mapped to the host and mounted as a JFS2 file system.
# manage_disk_drivers -l
Device Present Driver Driver Options
2810XIV AIX_AAPCM AIX_AAPCM,AIX_non_MPIO
DS4100 AIX_APPCM AIX_APPCM
DS4200 AIX_APPCM AIX_APPCM
DS4300 AIX_APPCM AIX_APPCM
DS4500 AIX_APPCM AIX_APPCM
DS4700 AIX_APPCM AIX_APPCM
DS4800 AIX_APPCM AIX_APPCM
DS3950 AIX_APPCM AIX_APPCM
DS5020 AIX_APPCM AIX_APPCM
DCS3700 AIX_APPCM AIX_APPCM
DCS3860 AIX_APPCM AIX_APPCM
DS5100/DS5300 AIX_APPCM AIX_APPCM
DS3500 AIX_APPCM AIX_APPCM
XIVCTRL MPIO_XIVCTRL MPIO_XIVCTRL,nonMPIO_XIVCTRL
2107DS8K NO_OVERRIDE NO_OVERRIDE,AIX_AAPCM,AIX_non_MPIO
IBMFlash NO_OVERRIDE NO_OVERRIDE,AIX_AAPCM,AIX_non_MPIO
IBMSVC AIX_AAPCM NO_OVERRIDE,AIX_AAPCM,AIX_non_MPIO
# lsdev -Cc disk
hdisk0 Available 01-00 NVMe 4K Flash Disk
hdisk1 Available 02-00 NVMe 4K Flash Disk
hdisk2 Available 05-00-01 MPIO IBM 2076 FC Disk
# lsdev | grep "fw"
sfwcomm0 Available 05-00-01-FF Fibre Channel Storage Framework Comm
sfwcomm1 Available 05-01-01-FF Fibre Channel Storage Framework Comm
sfwcomm2 Available 07-00-01-FF Fibre Channel Storage Framework Comm
sfwcomm3 Available 07-01-01-FF Fibre Channel Storage Framework Comm
sfwcomm4 Available 0A-00-01-FF Fibre Channel Storage Framework Comm
sfwcomm5 Available 0A-01-01-FF Fibre Channel Storage Framework Comm
# lsdev | grep "fcs"
fcs0 Available 05-00 PCIe3 2-Port 16Gb FC Adapter (df1000e21410f103)
fcs1 Available 05-01 PCIe3 2-Port 16Gb FC Adapter (df1000e21410f103)
fcs2 Available 07-00 PCIe2 8Gb 2-Port FC Adapter (77103225141004f3) (not used)
fcs3 Available 07-01 PCIe2 8Gb 2-Port FC Adapter (77103225141004f3) (not used)
fcs4 Available 0A-00 PCIe3 2-Port 16Gb FC Adapter (df1000e21410f103)
fcs5 Available 0A-01 PCIe3 2-Port 16Gb FC Adapter (df1000e21410f103)
# lsattr -l fcs0 -E
DIF_enabled no DIF (T10 protection) enabled True
bus_mem_addr 0x80108000 Bus memory address False
init_link auto INIT Link flags False
intr_msi_1 46 Bus interrupt level False
intr_priority 3 Interrupt priority False
io_dma 256 IO_DMA True
lg_term_dma 0x800000 Long term DMA True
max_xfer_size 0x100000 Maximum Transfer Size True
msi_type msix MSI Interrupt type False
num_cmd_elems 1024 Maximum number of COMMANDS to queue to the adapter True
num_io_queues 8 Desired number of IO queues True
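On the adapter side, the tunables I'm looking at are num_cmd_elems and max_xfer_size on the fcs devices, plus fast-fail and dynamic tracking on the fscsi instances. A sketch of the changes I'm considering (the values are my own guesses to test, not a confirmed vendor recommendation; -P defers the change to the next boot):

```shell
# Queue more commands on the adapter and widen the max transfer size.
# 2048 / 0x200000 are values to experiment with, not confirmed best practice.
chdev -l fcs0 -a num_cmd_elems=2048 -a max_xfer_size=0x200000 -P

# Enable fast I/O failure and dynamic tracking on the protocol driver.
chdev -l fscsi0 -a fc_err_recov=fast_fail -a dyntrk=yes -P

# Repeat for fcs1/fcs4/fcs5 and fscsi1/fscsi4/fscsi5, then reboot
# (or rmdev/cfgmgr the adapters) so the -P changes take effect.
```

Since max_xfer_size also influences the adapter's DMA resources, I'd re-check `fcstat -D` afterwards for the effective max transfer value and any "No DMA Resource" counts.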
# lsattr -El hdisk2
PCM PCM/friend/fcpother Path Control Module False
PR_key_value none Persistant Reserve Key Value True+
algorithm fail_over Algorithm True+
clr_q no Device CLEARS its Queue on error True
dist_err_pcnt 0 Distributed Error Percentage True
dist_tw_width 50 Distributed Error Sample Time True
hcheck_cmd test_unit_rdy Health Check Command True+
hcheck_interval 60 Health Check Interval True+
hcheck_mode nonactive Health Check Mode True+
location Location Label True+
lun_id 0x0 Logical Unit Number ID False
lun_reset_spt yes LUN Reset Supported True
max_coalesce 0x40000 Maximum Coalesce Size True
max_retry_delay 60 Maximum Quiesce Time True
max_transfer 0x80000 Maximum TRANSFER Size True
node_name 0x5005076810000912 FC Node Name False
pvid 00c2f8708ab7845e0000000000000000 Physical volume identifier False
q_err yes Use QERR bit True
q_type simple Queuing TYPE True
queue_depth 20 Queue DEPTH True+
reassign_to 120 REASSIGN time out value True
reserve_policy single_path Reserve Policy True+
rw_timeout 30 READ/WRITE time out value True
scsi_id 0x20101 SCSI ID False
start_timeout 60 START unit time out value True
timeout_policy fail_path Timeout Policy True+
unique_id 332136005076810818048900000000000001A04214503IBMfcp Unique device identifier False
ww_name 0x5005076810180912 FC World Wide Name False
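The disk attributes I suspect first are algorithm=fail_over (so only one of the four paths carries I/O) and queue_depth=20. A sketch of what I plan to try (queue_depth=64 is a guess; as far as I know, a multi-path algorithm requires dropping the single_path reserve policy):

```shell
# Spread I/O across all enabled paths instead of a single active path.
# shortest_queue (or round_robin) needs reserve_policy=no_reserve.
chdev -l hdisk2 -a reserve_policy=no_reserve -a algorithm=shortest_queue -P

# Allow more concurrent commands per LUN; 64 is a test value, not a
# vendor recommendation.
chdev -l hdisk2 -a queue_depth=64 -P

# -P defers the change: unmount/varyoff the volume group or reboot, then confirm.
lsattr -El hdisk2 -a algorithm -a queue_depth -a reserve_policy
```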
# lspath -l hdisk2
Enabled hdisk2 fscsi0
Enabled hdisk2 fscsi1
Enabled hdisk2 fscsi4
Enabled hdisk2 fscsi5
# fcstat -D fcs1
FIBRE CHANNEL STATISTICS REPORT: fcs1
Device Type: PCIe3 2-Port 16Gb FC Adapter (df1000e21410f103) (adapter/pciex/df1000e21410f10)
Serial Number: 1A8270057B
ZA: 11.4.415.10
World Wide Node Name: 0x200000109B4CE35E
World Wide Port Name: 0x100000109B4CE35E
FC-4 TYPES:
Supported: 0x0000010000000000000000000000000000000000000000000000000000000000
Active: 0x0000010000000000000000000000000000000000000000000000000000000000
FC-4 TYPES (ULP mappings):
Supported ULPs:
Small Computer System Interface (SCSI) Fibre Channel Protocol (FCP)
Active ULPs:
Small Computer System Interface (SCSI) Fibre Channel Protocol (FCP)
Class of Service: 3
Port Speed (supported): 16 GBIT
Port Speed (running): 16 GBIT
Port FC ID: 0x020200
Port Type: Fabric
Attention Type: Link Up
Topology: Point to Point or Fabric
Seconds Since Last Reset: 446027
Transmit Statistics Receive Statistics
------------------- ------------------
Frames: 681823195 395468348
Words: 298416592384 152800398336
LIP Count: 0
NOS Count: 0
Error Frames: 0
Dumped Frames: 0
Link Failure Count: 1
Loss of Sync Count: 6
Loss of Signal: 3
Primitive Seq Protocol Error Count: 0
Invalid Tx Word Count: 118
Invalid CRC Count: 0
AL_PA Address Granted: 0
Loop Source Physical Address: 0
LIP Type: L_Port Initializing
Link Down N_Port State: Active AC
Link Down N_Port Transmitter State: Reset
Link Down N_Port Receiver State: Reset
Link Down Link Speed: 0 GBIT
Link Down Transmitter Fault: 0
Link Down Unusable: 0
Current N_Port State: Active AC
Current N_Port Transmitter State: Working
Current N_Port Receiver State: Synchronization Acquired
Current Link Speed: 0 GBIT
Current Link Transmitter Fault: 0
Current Link Unusable: 0
Elastic buffer overrun count: 0
Driver Statistics
Number of interrupts: 35576060
Number of spurious interrupts: 0
Long term DMA pool size: 0x800000
I/O DMA pool size: 0
FC SCSI Adapter Driver Queue Statistics
Number of active commands: 0
High water mark of active commands: 20
Number of pending commands: 0
High water mark of pending commands: 20
Number of commands in the Adapter Driver Held off queue: 0
High water mark of number of commands in the Adapter Driver Held off queue: 0
FC SCSI Protocol Driver Queue Statistics
Number of active commands: 0
High water mark of active commands: 20
Number of pending commands: 0
High water mark of pending commands: 1
FC SCSI Adapter Driver Information
No DMA Resource Count: 0
No Adapter Elements Count: 0
No Command Resource Count: 0
FC SCSI Traffic Statistics
Input Requests: 32627778
Output Requests: 20804443
Control Requests: 2490
Input Bytes: 605283225091
Output Bytes: 1191956455792
Adapter Effective max transfer value: 0x100000
I'm using XDISK 8.6 for AIX 7.2 (downloaded from here) with the -OD parameter, which opens the test file with O_DIRECT to bypass OS caching and benchmark the storage itself.
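While the tests run, I watch where the latency accrues in a second session; a sketch (interval and count are arbitrary):

```shell
# Per-disk service times, queue wait, and in-flight queue depth, 5 s samples.
iostat -D hdisk2 5 6

# Adapter-level counters: queue high-water marks, No Command Resource, etc.
fcstat -D fcs0
```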
Below are additional runs with different block-size and thread settings.
### 8K Block, 1 Thread, Random I/O Test
./xdisk -R0 -r80 -b 8k -M 1 -f /usr1/testing -t60 -OD -V
BS Proc AIO read% IO Flag IO/s MB/s rMin-ms rMax-ms rAvg-ms WrAvg wMin-ms wMax-ms wAvg-ms WwAvg
8K 1 0 80 R -D 7177 56.1 0.090 2.58 0.118 0.116 0.001 2.97 0.216 0.212
### 8K Block, 1 Thread, Sequential I/O Test
./xdisk -S0 -r80 -b 8k -M 1 -f /usr1/testing -t60 -OD -V
BS Proc AIO read% IO Flag IO/s MB/s rMin-ms rMax-ms rAvg-ms WrAvg wMin-ms wMax-ms wAvg-ms WwAvg
8K 1 0 80 S -D 6461 50.5 0.001 12.1 0.133 0.116 0.001 9.88 0.238 0.213
### 16K Block, 1 Thread, Random I/O Test
./xdisk -R0 -r80 -b 16k -M 1 -f /usr1/testing -t60 -OD -V
BS Proc AIO read% IO Flag IO/s MB/s rMin-ms rMax-ms rAvg-ms WrAvg wMin-ms wMax-ms wAvg-ms WwAvg
16K 1 0 80 R -D 6796 106.2 0.001 2.63 0.126 0.124 0.179 2.89 0.223 0.219
### 16M Block, 1 Thread, Random I/O Test
./xdisk -R0 -r80 -b 16M -M 1 -f /usr1/testing -t60 -OD -V
BS Proc AIO read% IO Flag IO/s MB/s rMin-ms rMax-ms rAvg-ms WrAvg wMin-ms wMax-ms wAvg-ms WwAvg
16M 1 0 80 R -D 70 1120 12.9 34.1 14.0 14.2 12.9 15.6 13.2 13.5
### 32M Block, 1 Thread, Random I/O Test
./xdisk -R0 -r80 -b 32M -M 1 -f /usr1/testing -t60 -OD -V
BS Proc AIO read% IO Flag IO/s MB/s rMin-ms rMax-ms rAvg-ms WrAvg wMin-ms wMax-ms wAvg-ms WwAvg
32M 1 0 80 R -D 39 1248 23.9 65.0 25.0 24.7 23.8 26.0 24.1 24.3
### 64M Block, 1 Thread, Random I/O Test
./xdisk -R0 -r80 -b 64M -M 1 -f /usr1/testing -t60 -OD -V
BS Proc AIO read% IO Flag IO/s MB/s rMin-ms rMax-ms rAvg-ms WrAvg wMin-ms wMax-ms wAvg-ms WwAvg
64M 1 0 80 R -D 20 1280 46.4 128 47.7 47.5 46.5 49.3 47.6 47.5
### 8K Block, 2 Threads, Random I/O Test
./xdisk -R0 -r80 -b 8k -M 2 -f /usr1/testing -t60 -OD -V
BS Proc AIO read% IO Flag IO/s MB/s rMin-ms rMax-ms rAvg-ms WrAvg wMin-ms wMax-ms wAvg-ms WwAvg
8K 2 0 80 R -D 10059 78.6 0.001 3.36 0.172 0.130 0.001 3.35 0.298 0.260
### 8K Block, 4 Threads, Random I/O Test
./xdisk -R0 -r80 -b 8k -M 4 -f /usr1/testing -t60 -OD -V
BS Proc AIO read% IO Flag IO/s MB/s rMin-ms rMax-ms rAvg-ms WrAvg wMin-ms wMax-ms wAvg-ms WwAvg
8K 4 0 80 R -D 11914 93.1 0.001 4.22 0.295 0.182 0.001 3.60 0.487 0.431
### 8K Block, 8 Threads, Random I/O Test
./xdisk -R0 -r80 -b 8k -M 8 -f /usr1/testing -t60 -OD -V
BS Proc AIO read% IO Flag IO/s MB/s rMin-ms rMax-ms rAvg-ms WrAvg wMin-ms wMax-ms wAvg-ms WwAvg
8K 8 0 80 R -D 13081 102.2 0.001 4.76 0.568 0.478 0.001 4.18 0.775 0.898
### 8K Block, 16 Threads, Random I/O Test
./xdisk -R0 -r80 -b 8k -M 16 -f /usr1/testing -t60 -OD -V
BS Proc AIO read% IO Flag IO/s MB/s rMin-ms rMax-ms rAvg-ms WrAvg wMin-ms wMax-ms wAvg-ms WwAvg
8K 16 0 80 R -D 13302 103.9 0.001 6.57 1.15 1.29 0.001 5.10 1.42 1.45