Need to know %age disk busy on AIX

Hi ,

The following alert is being raised for a %busy device on a server:

Disk Device hdisk5 is 100% busy

Please advise how to analyse this, and how to check the %age busy for hdisk5.

Best regards,
Vishal

You can use tools like nmon, topas, vmstat or iostat.

iostat -D hdisk5 2 10

This will show the activity of hdisk5, sampled every 2 seconds, 10 times.

As techy1 said, you will have no history of performance data if you do not record it.

Have a look at filemon

You can get a breakdown of the busiest logical volumes from that, which might give you a better clue for narrowing it down. Something like this might help when the errors are starting to be logged:-

# filemon -vo /tmp/filename
# sleep 300
# trcstop

You can then inspect /tmp/filename, look at what you have in the Logical Volume section and see if anything enlightens you.

There is little point in running it for a longer period, as the critical bit will get lost in the general IO of other operations, but you might want to adjust the sleep to get a variety of views. You could even call this in a loop and write to a succession of log files to get a longer picture, and it might be possible to translate the output into a CSV and import it into something that draws a graph.
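
A rough sketch of that loop idea, in case it is useful. The function name, the output file naming and the 5 second pause are my own assumptions and not anything AIX provides, so treat it as a starting point and test it on a quiet box first. It only wraps the same filemon / sleep / trcstop sequence shown above:

```shell
# capture_filemon_series COUNT SECONDS OUTDIR
# Hypothetical helper: runs COUNT filemon captures of SECONDS each and
# writes every report to its own file under OUTDIR.
capture_filemon_series() {
    count=$1
    seconds=$2
    outdir=$3

    mkdir -p "$outdir" || return 1

    i=1
    while [ "$i" -le "$count" ]; do
        # One report per capture; the index keeps the names unique.
        out="$outdir/filemon_${i}_$(date +%Y%m%d_%H%M%S).out"

        filemon -v -o "$out"   # start tracing, report goes to $out
        sleep "$seconds"       # length of this capture window
        trcstop                # stop tracing so the report gets written

        # Assumption: give filemon a moment to finish writing the
        # report before the next trace is started.
        sleep 5

        i=$((i + 1))
    done
}

# Example (run as root): six captures of 5 minutes each
# capture_filemon_series 6 300 /tmp/filemon_logs
```

Keeping each capture short and in its own file means the busy window does not get averaged away, and the individual reports can be compared side by side afterwards.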

I hope that this helps
Robin
Liverpool/Blackburn
UK

Note, though, that "%tm_act" does NOT mean the disk is taxed to this share of its bandwidth capacity. In particular, "100%" does NOT MEAN it is at its limit.

It is like this: the disk (in fact its driver, but never mind) maintains a queue where commands (like "fetch me some data", etc.) are stored until they are executed. At regular intervals the OS queries the disk as to whether this queue is empty or not. The "yes - empty" answers and the "no, not empty" answers are computed to form a percentage, and this percentage is "%tm_act".

While this value is indeed needed to assess the busyness of a disk, it is meaningless if it is not combined with other data like the average queue depth, the size of the average read transaction and similar values.
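
As a small illustration of one such derived value: the average transfer size is simply throughput divided by the number of transfers per second. The helper name and the figures below are hypothetical and only there to show the arithmetic:

```shell
# avg_transfer_kb KB_PER_SEC TRANSFERS_PER_SEC
# Hypothetical helper: prints the average size of one transfer in KB.
avg_transfer_kb() {
    awk -v kbps="$1" -v tps="$2" 'BEGIN {
        # No transfers means there is no meaningful average.
        if (tps <= 0) { print "n/a"; exit }
        printf "%.1f\n", kbps / tps
    }'
}

avg_transfer_kb 4000 1000   # prints: 4.0   (many small requests)
avg_transfer_kb 64000 250   # prints: 256.0 (fewer, much larger requests)
```

Two disks reporting the same busy percentage can look completely different once this number is put next to it.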

Picture a movie theatre's ticket counter: every 5 minutes you ask the clerk if there are people waiting in his queue or not. You do NOT ask if the queue is long or short, if the average customer buys one ticket or several, etc. From the "yes" (queue bigger than 0) and "no" (queue is exactly 0) answers you compile the "busy" value, but it will not tell you how many people are watching the movie. For this you would need the other numbers mentioned, too.

Back to the disk: if the disk is flooded with many very small requests it might be at 100%, but the queue depth will always be very short, and in fact a dramatic increase in requests will just make the average queue length a little bigger.
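
To make that concrete, here is a toy sketch. It is NOT how AIX computes the figure internally, just the ticket counter logic from above applied to made-up queue depth samples: every sample with a depth above 0 counts as "busy", and the average depth is reported next to it. The function name and the sample values are assumptions for illustration only:

```shell
# queue_stats: reads one queue-depth sample per line on stdin and prints
# "<busy percent> <average depth>". A sample is "busy" when depth > 0.
queue_stats() {
    awk '
        { n++; sum += $1; if ($1 > 0) busy++ }
        END {
            if (n == 0) { print "0 0.0"; exit }
            printf "%d %.1f\n", (100 * busy) / n, sum / n
        }'
}

# Lightly loaded disk: there is always one request queued, never more.
printf '1\n1\n1\n1\n' | queue_stats       # prints: 100 1.0

# Heavily loaded disk: dozens of requests queued at every sample.
printf '40\n60\n50\n50\n' | queue_stats   # prints: 100 50.0
```

Both cases report 100% busy, yet only the second one describes a disk that is struggling, which is why the busy figure needs the queue depth beside it.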

I hope this helps.

bakunin
