Hard disk usage is 100 Percent Busy for any command

hi,

AIX 5.3

For any command (say, tar) I am getting 100% busy for my hdisk.

But my CPU and memory are not busy and are mostly idle.

Please advise how to analyse the performance.

Thanks in Advance,

First off: if you tar something, the command will read something on your disk (a bunch of files) and then create an archive from it, writing that to the disk (most of the time). What else than seeing activity on your disk do you suppose will happen?

Additionally I would like to clarify what the "% tm_act" field in the output of iostat means:

The OS has a sensor, regularly asking the disk if it is busy or not. When the disk answers "I'm busy" half of the time, then "% tm_act" will be 50%. If the disk answers "I'm busy" every time, then tm_act will be 100%, etc. A disk answers with "busy" when there are requested operations, read or write, not yet fulfilled. If many very small requests go to the disk, the chance of the sensor asking exactly when one such operation is still open goes up - much more so than the real activity of the disk.
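To make the sampling idea concrete, here is a minimal sketch of the arithmetic as described above. The function name tm_act_pct and the sample lists are made up for illustration; this only mirrors the description in this post and is not how the kernel actually collects the figure.

```shell
# tm_act_pct (hypothetical helper): percentage of samples in which the
# disk answered "busy". Each argument is one sample: 1 = busy, 0 = idle.
tm_act_pct() {
    busy=0
    total=0
    for sample in "$@"; do
        total=$((total + 1))
        if [ "$sample" -eq 1 ]; then
            busy=$((busy + 1))
        fi
    done
    if [ "$total" -eq 0 ]; then
        echo 0          # no samples given, nothing to report
    else
        echo $((busy * 100 / total))
    fi
}

# Made-up sample data, not taken from a real system:
tm_act_pct 1 0 1 0 1 0 1 0   # prints 50
tm_act_pct 1 1 1 1 1 1 1 1   # prints 100
```

Note that the result says nothing about how much data was moved while the disk was busy.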

So "100% busy" does not necessarily mean the disk is at the limit of its transfer bandwidth. It could mean that, because the disk is getting relatively few but big requests (example: stream I/O). But it could also mean that the disk is getting a lot of relatively small requests, so that it is occupied most of the time without using its complete transfer bandwidth.

To find out which is the case, analyse the corresponding "Kb_read" and "Kb_wrtn" columns from iostat. A modern disk drive can physically handle approximately 17MB/second, bypassing any cache. Compare your data to this rule-of-thumb value and you will get a more detailed picture.
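The comparison can be sketched as a bit of arithmetic. The helper names disk_mb_per_sec and pct_of_ceiling are hypothetical, the input numbers are invented, and the 17 MB/s default is only the rule-of-thumb value from this post; substitute the figures from your own iostat interval.

```shell
# disk_mb_per_sec (hypothetical helper): average throughput in MB/s.
# $1 = KB read in the interval, $2 = KB written, $3 = interval in seconds
disk_mb_per_sec() {
    awk -v r="$1" -v w="$2" -v s="$3" \
        'BEGIN { printf "%.1f\n", (r + w) / 1024 / s }'
}

# pct_of_ceiling (hypothetical helper): measured throughput as a
# percentage of an assumed ceiling.
# $1 = measured MB/s, $2 = ceiling in MB/s (defaults to the 17 MB/s rule of thumb)
pct_of_ceiling() {
    awk -v m="$1" -v c="${2:-17}" \
        'BEGIN { printf "%.0f\n", m / c * 100 }'
}

# Invented example: 20480 KB read and 30720 KB written in a 5-second interval
disk_mb_per_sec 20480 30720 5   # prints 10.0
pct_of_ceiling 10.0             # prints 59
```

A disk showing 100% tm_act at only a small fraction of the ceiling is most likely being kept busy by many small requests rather than by streaming I/O.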

bakunin

Hi,

Understood the concept.

But please advise me how to improve the performance.

It is not only tar: even chmod takes a long time to complete.

Are there any parameters to be tuned?

Thanks in Advance.

Sorry, but I can't tell you "how to improve the performance" because I don't know why the performance is bad - I simply do not know your system!

I take your word that CPU and memory are not an issue (I wonder how you came to this conclusion, but anyway) and will concentrate on what else might be the culprit. Possible reasons include (but are in no way limited to):

Maybe your SAN subsystem has a problem. If it is an ESS, look into the error log of the system: the SSA adapters there have batteries supporting the fast-write cache. These batteries need to be changed from time to time, and empty batteries shut down the FW cache. You would notice this as dramatically low write performance together with normal read performance.

Maybe you have native SSA loops; then the problem arises directly with the cache of the adapter. Look in the error log, it should be mentioned there.

Maybe your filesystem has hotspots; get a trace of the filesystem. Use "vmstat -v" to get a first impression or "filemon"/"trcstop" to get a report. A typical trace would look like:

filemon -u -O all -o /tmp/filemon.out ; sleep 10 ; trcstop

If you see in the output that the trace buffers are too small, make them bigger by using the -T option:

filemon -u -O all -T 512000 .....

The output is pretty self-explanatory.

If it is an internal disk, look into your error log for disk failures. Usually this starts with hdisk3-type errors, which are temporary, and ends in hdisk4-type errors, which are permanent. The reason is that disks have some spare blocks and bad block relocation takes place first (temporary errors), but once the spare blocks are exhausted, damage to the PP can't be prevented (permanent error).

Maybe you are slowing down your filesystem with a bad layout - use LVM tools to get map files of all the filesystems and analyze them.

Maybe your system is slow because it is swapping all the time - have a look at the output of "svmon -G" and compare the memory pages "inuse" and "virtual". If "virtual" is much bigger than "inuse", that hints at the running applications needing more memory than there is. Multiply the difference by 4k (size of a memory page) to get a rough estimate of how much more memory you need.
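As a rough sketch of that estimate: the helper name mem_shortfall_mb and the page counts below are invented, and it assumes "the number" means the difference between "virtual" and "inuse", both counted in 4 KB pages as svmon -G prints them.

```shell
# mem_shortfall_mb (hypothetical helper): rough extra memory in MB
# suggested by the rule of thumb above.
# $1 = "inuse" pages, $2 = "virtual" pages (both in 4 KB pages)
mem_shortfall_mb() {
    awk -v inuse="$1" -v virt="$2" 'BEGIN {
        diff = virt - inuse
        if (diff < 0) diff = 0          # virtual fits into inuse: no shortfall
        printf "%d\n", diff * 4 / 1024  # pages * 4 KB, converted to MB
    }'
}

# Invented example: 500000 pages inuse, 756000 pages virtual
mem_shortfall_mb 500000 756000   # prints 1000
```

Treat the result as an order-of-magnitude hint only, not as a sizing recommendation.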

and, and, and .... I could go on for hours with similar considerations, all starting with "maybe". Unless you provide some data, nobody can tell you anything about your system.

bakunin