I've been trying to write a monitoring program that collects various values for the running processes and generates reports. One of the values I want to monitor is swap usage, so that we can make sure our swap space doesn't fill up, but I can't seem to get this number (on a per-process basis) from either 'top' or 'ps'.
I've tried everything I can think of. Can anyone tell me how to do this?
On my system, I ran top and ps, summed up some of the per-process numbers, and got these:
top vsize within 5% ps vsize: 1.0%
top rss within 5% ps rss: 98.3%
Resident set size: 6.422G
Top vsize: 11.980G
PS vsize: 144.780G
PS size: 88.384G
These are the totals I get from the 'top' header. This (roughly) agrees with the vmstat numbers.
Total Mem used, from top: 7.590G
Total Swap used, from top: 5.233G
Total Virtual Mem used: 12.823G
As you can see, the closest match to the total virtual memory above is the top vsize, so maybe I could get swap usage by subtracting RSS from top vsize. But the above was run on RH3. If I run the same script on RH5 (we have both RH3 and RH5 machines), I get the following:
top vsize within 5% ps vsize: 99.7%
top rss within 5% ps rss: 97.1%
Resident set size: 5.059G
Top vsize: 84.335G
PS vsize: 84.485G
PS size: 6.357G
Total Mem used, from top: 7.751G
Total Swap used, from top: 0.000G
Total Virtual Mem used: 7.751G
As you can see, the top vsize is now way off (84.335G against 7.751G of total virtual memory used).
I also tried subtracting out the shared memory (taken from top), but that only lowered the numbers by less than 5%.
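A minimal Python sketch of the kind of summing and subtraction described above. It assumes the per-process numbers come from something like `ps -eo pid,rss,vsz,size` and that all three columns are in KiB. The function names and the sample values are made up for illustration. Note that vsz minus rss is only a rough upper bound: as the RH5 numbers show, it also counts address space that was mapped but never touched, so it is not real swap usage.

```python
KIB_PER_GIB = 1024 * 1024


def parse_ps_table(text):
    """Parse `ps -eo pid,rss,vsz,size` style output into a list of dicts.

    Assumption: four whitespace-separated numeric columns, values in KiB.
    Lines that do not fit (such as the header) are skipped.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4 or not all(f.isdigit() for f in fields):
            continue
        pid, rss, vsz, size = (int(f) for f in fields)
        rows.append({"pid": pid, "rss": rss, "vsz": vsz, "size": size})
    return rows


def totals_gib(rows):
    """Sum each memory column over all processes and convert KiB to GiB."""
    return {
        key: sum(row[key] for row in rows) / KIB_PER_GIB
        for key in ("rss", "vsz", "size")
    }


def not_resident_kib(row):
    """Virtual size minus resident size: an upper bound, not true swap usage."""
    return max(row["vsz"] - row["rss"], 0)


# Hypothetical sample input, not taken from a real machine.
SAMPLE = """\
  PID   RSS    VSZ  SIZE
22511  3080  17896  1500
27192   728   8336   300
"""
```

With the sample above, `not_resident_kib(parse_ps_table(SAMPLE)[0])` is just 17896 minus 3080; whether any of that is actually in swap cannot be told from these columns alone.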
As an aside, the 'ps' manpage documents the 'v' option as giving a virtual memory format, but doesn't document the fields presented. Can anyone tell me what these things mean? MAJFL is the major page fault count, I believe, and RSS is the "resident set size" meaning physical memory used, but what is "DRS"?
$ ps v
  PID TTY      STAT   TIME  MAJFL   TRS   DRS   RSS %MEM COMMAND
22511 pts/2    Ss     0:00      0   319 17576  3080  0.0 -tcsh
27192 pts/2    R+     0:00      0    74  8261   728  0.0 ps v
In the meantime, I've tried using the "size" output from ps, since it seems the most accurate (see original post), but I've hit a snag: the number is many times too large for a couple of types of processes (wine-preloader and java).
Looking more closely, I found that (under Redhat3, Linux 2.4) /proc/$PID/statm shows the correct vmsize number, while /proc/$PID/statm shows the grossly inflated vmsize number.
Under RH5 (Linux 2.6) both files agree on the inflated number.
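For anyone who wants to compare the /proc numbers directly, here is a small Python sketch that reads /proc/$PID/statm. It assumes the seven-field layout documented in proc(5) for 2.6 kernels (size, resident, shared, text, lib, data, dt, all counted in pages); the layout on 2.4 may differ, so treat the field names there as an assumption. The function names are made up for illustration.

```python
import os

# Field order as documented in proc(5) for Linux 2.6; values are page counts.
STATM_FIELDS = ("size", "resident", "shared", "text", "lib", "data", "dt")


def parse_statm(text, page_size):
    """Turn the single line of a statm file into a dict of byte counts."""
    pages = [int(field) for field in text.split()]
    return {name: count * page_size for name, count in zip(STATM_FIELDS, pages)}


def read_statm(pid):
    """Read and parse /proc/<pid>/statm using the system page size."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    with open("/proc/{}/statm".format(pid)) as handle:
        return parse_statm(handle.read(), page_size)
```

Calling `read_statm(os.getpid())["size"]` gives the total program size in bytes for the current process, which can then be checked against the vsize that ps and top report for the same PID.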