Help monitoring a performance problem in Linux

hello,

i'm having a performance problem on one of my linux machines and i hope someone will be able to help me analyze it.

machine info:
Linux (Fedora), 4 CPU cores @ 1.6 GHz, 8 GB memory, 8 GB swap.

i've enabled sar on my machine and created graphs for the last week using the ksar utility.

the sar -A output covers pretty much everything on the machine.
now i'm trying to understand what the problem was :)
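
for reference, i generated the text data that ksar reads roughly like this (the sa file name is just an example from my box, the path may differ on other distros):

LC_ALL=C sar -A -f /var/log/sa/sa10 > sa10.txt

and then loaded that file into ksar to draw the graphs.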

my problem is that on 2 days during that week my application hung and we had to restart the machine to recover.
now i'm trying to understand whether it is an i/o problem related to my disks, or perhaps a memory problem.

the problem occurred at 15:00 on one of the days i monitored. i will write down the results:

CPU monitor (0.0 - 30.0):
   cpu used: 17%
context switches (0 - 31,000):
   cswch/s: 17,000
i/o:
  transfers/s (0 - 5,000): 3,000
  blocks read/written per s (0 - 300,000): 100,000
  reads/writes per s (0 - 4,000): 1,500
memory:
  memused: 7.8 GB (97%)
memory misc:
   buffers (0 - 700 MB): 200 MB
   cached (0 - 7.5 GB): 7 GB
swap:
   swapfree: 8 GB
   swapused: 40 KB

load:
  runq-sz (0 - 3): 2.5
  plist-sz (0 - 400): 350
  load average (0 - 40): 30

page:
  frmpg/s (-1,000 - 1,000): -1,000
  bufpg/s (-150 - 150): 50
  campg/s (-1,000 - 1,000): 1,000

paging:
   pgpgin/pgpgout per s (0 - 80,000): 30,000
   fault/majflt per s (0 - 55,000): 1,000

processes:
  proc/s (0 - 60.00): 1

i hope that was clear :(
i can see that almost all the memory is used, so maybe there is not enough memory, and the runq-sz is high so there are a lot of processes running/waiting to run, which perhaps indicates the problem?

please help... :(

10x

Looking at what you are showing us, the CPU is not being overloaded and you are not drastically short of memory. As you say, the run queue size is not good. I think you need to look at the figures for CPU I/O wait and I/O service time (svctm) and see which devices are not responding very quickly (disk storage perhaps?).
Do you consider your application to be doing a lot of I/O?
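
For a quick look at the I/O wait figures you can usually run something like:

sar -u

for today, or sar -u -f <sa file> for a previous day, and check the %iowait column (the data file path and option details can vary a little between distros and sysstat versions).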

hi,

what did you mean by "my application hangs"?

if you mean the application crashes, then it's not a performance issue;
maybe it's a software issue (an OS bug or an application bug),
or a hardware failure (memory or CPU).

hi,
thanks a lot for your answer tony.
sorry for the late response, i didn't see that you had answered me.
anyway, how can i check the CPU I/O wait? i have all these statistics in the ksar output.
i thought these were all the statistics i could get.
how can i check if the disks are the problem?

thanks

This may be different for your distribution, but with Ubuntu's sar(1) you can run:

sar -d

for today's I/O stats or:

sar -d -f /var/log/sysstat/sa16

for the I/O stats from the 16th.
The simple thing to look for is whether the figures for one volume are worse than the others; that volume may benefit from being moved to a dedicated disk or even onto a striped volume.

Also take a look at:

sar -n ALL

for network device stats.

The other one to look at is:

sar -q

and see if the run queue size (runq-sz) gets larger when the I/O stats are at their worst.
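
For example, for the same day's data you could run:

sar -q -f /var/log/sysstat/sa16
sar -d -f /var/log/sysstat/sa16

and line up the timestamps to see whether the run queue grows at the same times the disk figures get worse (again, the sa file path may be different on your system).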

hi tony,
thanks a lot for your help!

i've tried to execute sar -d -f <sa file> and it gives me the error:
Requested activities not available in file

i thought this was due to my sysstat configuration. i added SA1_OPTIONS="-d" to my /etc/init.d/sysstat file so it would collect device statistics, restarted the sysstat service and ran the crontab entry "sa1 -d -I 1 1", but i'm still getting the same error from the sar command :(

perhaps i'm not doing something right.
i used ksar to generate a PDF with all the statistics and i am missing the i/o wait and per-disk graphs, so i added SA1_OPTIONS="-d" to the file i mentioned in order to enable this option. isn't that enough?

thanks

i will

If you've only just configured sar to collect device I/O stats then none of the previous days' sar files will contain that information; presumably tomorrow you will be able to run:

sar -d

and see something for that day at least?

Hi,

ok, i was finally able to enable the disk statistics in sar (i had to add -d to the /usr/lib64/sa/sa1 script).
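
just for reference, the sadc call inside that script now looks roughly like this (i'm quoting from memory and the wrapper differs between sysstat versions, so treat it as a sketch):

exec /usr/lib64/sa/sadc -F -L -d $* -

the important part is the extra -d, which tells sadc to record disk activity.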

now i've created a new PDF with ksar with all the statistics i can get.
the problem is that i don't know what to look for in order to understand the problem.

as you can see it's not a cpu problem, but i have a very high load average and a high runq-sz.

so i've written down all the statistics from a specific minute i chose. it's not the exact time our machine hung, but perhaps we can see the problem at a normal time.

CPU monitor (0.0 - 45.0):
   cpu used: 40% (i/o wait)
context switches (0 - 24,000):
   cswch/s: 18,000
i/o:
  transfers/s (0 - 5,000): 3,000
  blocks read/written per s (0 - 80,000): 50,000
  reads/writes per s (0 - 5,000): 3,000
memory:
  memused: 8 GB (100%)
  memfree (0 - 770 MB): 380 MB
memory misc:
   buffers (0 - 760 MB): 290 MB
   cached (0 - 7.5 GB): 6.5 GB
swap:
   swapfree: 8 GB
   swapused: 40 KB

load:
  runq-sz (0 - 4): 2
  plist-sz (0 - 350): 320
  load average (0 - 10): 1 min - 7.5, 5 min - 6, 15 min - 3

page:
  frmpg/s (-400 - 400): 200
  bufpg/s (-100 - 100): -25
  campg/s (-400 - 400): -150
paging:
   pgpgin/pgpgout per s (0 - 22,500): in - 2,500, out - 10,000
   fault/majflt per s (0 - 50,000): 2,500

processes:
  proc/s (0 - 57.00): 1

disks:
  sda:
  tps (0 - 500): 280
  reads/writes per s (0 - 35,000): write - 6,000, read - 27,500
  avgrq-sz (0 - 200): 100
  avgqu-sz (0 - 75): 25
  await (0 - 200): 100
  svctm (0 - 3): 2.5
  %util (0 - 100): 90%

sdb:
  tps (0 - 300): 200
  reads/writes per s (0 - 32,500): write - 15,000, read - 3,000
  avgrq-sz (0 - 400): 150
  avgqu-sz (0 - 7.5): 1.5
  await (0 - 150): 1
  svctm (0 - 10): 2
  %util (0 - 50): 20

md/0:
  tps (0 - 5,000): 2,000
  reads/writes per s (0 - 40,000): write - 15,000, read - 5,000
  avgrq-sz (0 - 150): 10
  avgqu-sz (0 - 75): 20
  await (0 - 300): 1
  svctm (0 - 20): 1
  %util (0 - 75): 75

tony, do you see any problem?
at first i thought i just needed to add memory, because it looks like all the memory is used, but that doesn't make sense because there is no paging, and i guess linux allocates all the memory for cache without really using it.
perhaps it's a disk problem?
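
(if it is really just the filesystem cache, then i guess something like:

free -m

should show most of the memory in the "cached" column, and the "-/+ buffers/cache" line should still show plenty free - i'm assuming that's the right way to read it.)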

any help will be appreciated!
Thanks a lot!

I am puzzled by 100% of memory in use but very little swap.

cpu used: 40% (i/o wait) is not good; try running iostat, for example:

iostat -x -n -p ALL

and see what devices are seeing the most I/O, look at queue sizes and await.

hi

this is the output for iostat -n -x ALL:

Linux 2.6.23.1-49.fc8 (bastille)        02/12/2010

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
ram0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
ram1              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
ram2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
ram3              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
ram4              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
ram5              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
ram6              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
ram7              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
ram8              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
ram9              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
ram10             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
ram11             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
ram12             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
ram13             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
ram14             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
ram15             0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sda              15.86   205.62   70.86   28.00  4845.48  1870.64    67.93     2.50   25.28   2.08  20.61
sdb               0.88   157.19   23.47   13.94  1826.80  1370.70    85.45     0.43   11.54   1.54   5.76
sr0               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
dm-0              0.00     0.00   43.23  263.15  3160.62  2105.20    17.19     3.25   10.58   0.39  11.91

Filesystem:              rBlk_nor/s   wBlk_nor/s   rBlk_dir/s   wBlk_dir/s   rBlk_svr/s   wBlk_svr/s
concorde:/export/home/lab         0.03         0.12         0.00         0.00         0.08         0.13
concorde:/export/home/build         0.03         0.12         0.00         0.00         0.08         0.13

sda and sdb both contain LVM partitions.
i have a RAID 5 in my machine with 5 disks.
perhaps my disks are slow? they are SATA, 500 GB, 7200 rpm.
how can it be that all the memory is used but there is no swap? and it is always used like that, not only when the machine is overloaded.

thanks,
i have no idea how to figure it out :(

The interesting lines are:

Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda              15.86   205.62   70.86   28.00  4845.48  1870.64    67.93     2.50   25.28   2.08  20.61
sdb               0.88   157.19   23.47   13.94  1826.80  1370.70    85.45     0.43   11.54   1.54   5.76
dm-0              0.00     0.00   43.23  263.15  3160.62  2105.20    17.19     3.25   10.58   0.39  11.91

Filesystem:              rBlk_nor/s   wBlk_nor/s   rBlk_dir/s   wBlk_dir/s   rBlk_svr/s   wBlk_svr/s
concorde:/export/home/lab      0.03         0.12         0.00         0.00         0.08         0.13
concorde:/export/home/build    0.03         0.12         0.00         0.00         0.08         0.13

The busiest device is dm-0 with 263.15 writes per second and the largest average queue size (avgqu-sz 3.25), while sda has the highest utilisation (20.61%) and by far the longest average wait time (await 25.28 ms).

Run:

# mount | grep sdb
# mount | grep sda

to find out what /dev/sdb and /dev/sda are mounted as.
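
If a disk only holds LVM physical volumes it won't show up directly in the mount output; assuming the LVM tools are installed, something like:

lvs -o +devices

(or pvs, or dmsetup deps /dev/dm-0) should show which logical volumes sit on which physical disks, so you can see what dm-0 maps back to.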

The NFS mounts are evidently very quiet, but running iostat after your machine has done a lot of real work may show something different.

RAID 5 is good for securing your data, but for best performance you need striping; the best compromise is a stripe mirrored against another stripe.

Hi,

first of all i would like to thank you a lot for all your help! i really appreciate it, so thanks a lot!

sda holds my root filesystem. sdb is a disk which holds an LVM partition, so i can't see it in the mount output.
i have a rootvg with 3 filesystems which reside on sdb plus an additional partition of sda.
i guess md-0 is the LVM volume as a whole (not sure).

anyway, so we can see the problem is in our disks; that's why i have high i/o wait and load.
in order to change the RAID, or even to add more drives to the RAID 5, i would need to break the RAID, and currently that's the worst option because it's a production machine.
what else can i do in order to increase performance?
adding more CPU is not an option, because i would still get the i/o wait; the processor would process the requests much faster and still wait (even longer) for the disks.
i have 8 GB RAM. perhaps i need to add more RAM for caching? i'm not sure how i would do that.

i'm not exactly sure i can say loud and clear that the disks are the problem.

thanks.

To be fair, RAID 5 is going to be better than a single disc or just a mirrored pair of discs.
Performance tuning is not an exact art. The figures can look terrible, but if the machine is performing the task you want it to do okay then they are not a problem. If the machine is not performing the required task, then you start looking at the figures to see where the bottleneck is; once you deal with that one, another bottleneck may also need dealing with.
Suggestions:
Copy the contents of your current disks onto another set of discs in a stripe then reconstruct your RAID 5 set of discs into a stripe and mirror them against the first stripe?
Copy your current RAID 5 configuration onto 10K RPM discs?
Ensure you are using the fastest interface possible for your platform, e.g. SCSI or SATA tend to be faster than IDE?

Possibly.

In this case, unless I am mistaken, there is a two-disk RAID 5, which would mean that:

  1. It is always running degraded.
  2. Every write has at least a double overhead.

Also there is a massive IO imbalance between the two disks, possibly due to different workloads hitting the same disk and causing a lot of head shuttle.

No matter how you look at this it appears that you don't have enough disks for what you are doing.

SATA drives are pretty good for sequential I/O so if the workloads are cleanly separated there is a chance that they could be fine. However with a random workload faster disks are definitely preferable.

If it is a 2-disk RAID 5 then it is an utter waste of time and explains the poor I/O performance; take the two disks and mirror one against the other instead.
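
Purely as an illustration, if it is Linux software RAID the mirror could be built with something like (device names are placeholders, and this obviously means rebuilding the array and restoring the data):

mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdX1 /dev/sdY1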

It's really hard to tell how frequently you're monitoring your statistics, and unless you're looking at them very frequently, say every 10 seconds or so, you don't get a lot of help from your data. I do know the default sampling rate for sar is one sample every 10 minutes, which might have made sense 20 years ago when computers were a lot slower but makes no sense today. When sar tells you your CPU is loaded at 20%, how do you know it wasn't at 100% for 2 minutes and 0% for 8? The same holds for everything else.

Personally, I don't find looking at a single number and saying "this is the CPU or network load" all that helpful either, since nothing is constant. I find that seeing numbers every second or so over a period when the problems exist is much more helpful, but that's me...
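
When the machine is actually misbehaving you can watch it live with something like:

sar -u 1 60
iostat -x 1

(one-second samples), or shorten the sa1 interval in the sysstat cron entry so it samples every minute or two instead of every 10 (the cron file location varies by distro).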

-mark