per-process swap usage?

I've been trying to write a monitoring program that collects various values for the running processes and produces reports. One of the values I want to monitor is swap usage, so that we can make sure our swap space doesn't fill up, but I can't seem to get this number (on a per-process basis) from either 'top' or 'ps'.

I've tried everything I can think of. Can anyone tell me how to do this?

On my system, I run top and ps and sum up some of the numbers and I get these:

top vsize within 5% ps vsize: 1.0%
top rss within 5% ps rss: 98.3%
Resident set size: 6.422G
Top vsize: 11.980G
PS vsize: 144.780G
PS size: 88.384G

These are the totals I get from the 'top' header. This (roughly) agrees with the vmstat numbers.
Total Mem used, from top: 7.590G
Total Swap used, from top: 5.233G
Total Virtual Mem used: 12.823G
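
The summing step can be sketched roughly as below. This is a minimal Python sketch under stated assumptions, not the script that produced the numbers in this post: it assumes the input is the text output of something like `ps -eo pid,rss,vsz,size,comm` (procps, values in KiB), and the name `sum_ps_columns` is made up for illustration. The two sample rows reuse the wine-preloader figures quoted later in this thread.

```python
KIB_PER_GIB = 1024 * 1024

def sum_ps_columns(ps_text, columns=("RSS", "VSZ", "SIZE")):
    """Sum the named numeric columns of ps output and return totals in GiB.

    Assumes the first non-blank line is the header and that the summed
    columns come before COMMAND (which may itself contain spaces).
    """
    lines = [ln for ln in ps_text.splitlines() if ln.strip()]
    header = lines[0].split()
    index = {name: header.index(name) for name in columns}
    totals = {name: 0 for name in columns}
    for line in lines[1:]:
        fields = line.split()
        for name, i in index.items():
            totals[name] += int(fields[i])
    return {name: kib / KIB_PER_GIB for name, kib in totals.items()}

# Hypothetical ps output; the numbers are taken from this thread.
SAMPLE = """\
  PID   RSS     VSZ    SIZE COMMAND
21951   484 2367920 2345200 wine-preloader
 6990  1300 2367924 2345204 wine-preloader
"""

totals = sum_ps_columns(SAMPLE)
for name, gib in totals.items():
    print("%s: %.3fG" % (name, gib))
```

In a real monitor the text would come from running ps (for example via `subprocess`), but keeping the parser separate from the command makes it easy to check against saved output from both the RH3 and RH5 machines.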

Now you see the closest match to the total virtual memory above is the top vsize, so maybe I could get swap usage by subtracting RSS from top vsize. But the above was run on RH3. If I run the same script on RH5 (we have RH3 and RH5 machines), I get the following:

top vsize within 5% ps vsize: 99.7%
top rss within 5% ps rss: 97.1%
Resident set size: 5.059G
Top vsize: 84.335G
PS vsize: 84.485G
PS size: 6.357G

Total Mem used, from top: 7.751G
Total Swap used, from top: 0.000G
Total Virtual Mem used: 7.751G

As you can see, now the top vsize is way off.
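
To put numbers on that, here is a small Python sketch of the "vsize minus RSS" estimate applied to the totals quoted above. Only the figures come from this post; the names `hosts` and `swap_estimate` are made up for illustration.

```python
# Totals quoted in this post, in GiB.
hosts = {
    "RH3": {"top_vsize": 11.980, "rss": 6.422, "swap_used": 5.233},
    "RH5": {"top_vsize": 84.335, "rss": 5.059, "swap_used": 0.000},
}

def swap_estimate(totals):
    """Estimate swap as summed top vsize minus summed RSS."""
    return round(totals["top_vsize"] - totals["rss"], 3)

for name, totals in hosts.items():
    est = swap_estimate(totals)
    err = round(est - totals["swap_used"], 3)
    print("%s: estimate %.3fG, top header says %.3fG, off by %.3fG"
          % (name, est, totals["swap_used"], err))
```

On RH3 the estimate is 5.558G against 5.233G reported, which is close. On RH5 it is 79.276G against 0.000G reported, so the same subtraction is useless there.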

I also tried subtracting out the shared memory (taken from top), but that lowered the numbers by less than 5%.

There has to be a better way...

--Buck

sar, swapinfo: it would help A LOT if you gave us your OS and architecture.

RHEL3 and 5. They're 64-bit AMD machines.

Here's a little more info. There are two classes of machines I'm concerned with.

$uname -a
Linux fub 2.6.18-53.1.13.el5 #1 SMP Mon Feb 11 13:27:27 EST 2008 x86_64 x86_64 x86_64 GNU/Linux

$lsb_release -a
LSB Version: :core-3.1-amd64:core-3.1-ia32:core-3.1-noarch:graphics-3.1-amd64:graphics-3.1-ia32:graphics-3.1-noarch
Distributor ID: RedHatEnterpriseClient
Description: Red Hat Enterprise Linux Client release 5.1 (Tikanga)
Release: 5.1
Codename: Tikanga

>uname -a
Linux bar 2.4.21-37.ELsmp #1 SMP Wed Sep 7 13:32:18 EDT 2005 x86_64 x86_64 x86_64 GNU/Linux
>lsb_release -a
LSB Version: 1.3
Distributor ID: RedHatEnterpriseWS
Description: Red Hat Enterprise Linux WS release 3 (Taroon Update 6)
Release: 3
Codename: TaroonUpdate6

I looked into sar, and it looks like it gives statistics at the machine level, but not process-specific numbers.

'swapinfo' doesn't appear to be available.

I feel like this should be a simple problem that should be solved by an option to ps, but I can't seem to get it done.

--Buck

As an aside, the 'ps' manpage documents the 'v' option as giving a virtual memory format, but doesn't document the fields presented. Can anyone tell me what these things mean? MAJFL is the major page fault count, I believe, and RSS is the "resident set size" meaning physical memory used, but what is "DRS"?

$ps v
PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND
22511 pts/2 Ss 0:00 0 319 17576 3080 0.0 -tcsh
27192 pts/2 R+ 0:00 0 74 8261 728 0.0 ps v

In the meantime, I've tried using the "size" output from ps, since it seems most accurate (see original post), but I've hit a snag: the number is many times too large for a couple of types of processes (wine-preloader and java).

Examples:

wine-preloader(21951):
        PS:     rss:484.0       vsize:2367920.0 size:2345200.0  sz:591980.0
        TOP:    rss:480.0       vsize:872.0     share:476.0
wine-preloader(6990):
        PS:     rss:1300.0      vsize:2367924.0 size:2345204.0  sz:591981.0
        TOP:    rss:1292.0      vsize:1292.0    share:888.0
wine-preloader(24682):
        PS:     rss:452.0       vsize:2367912.0 size:2345192.0  sz:591978.0
        TOP:    rss:448.0       vsize:832.0     share:444.0

java(24885):
        PS:     rss:636.0       vsize:680912.0  size:634740.0   sz:170228.0
        TOP:    rss:620.0       vsize:10088.0   share:388.0
java(19015):
        PS:     rss:42336.0     vsize:1253548.0 size:1197556.0  sz:313387.0
        TOP:    rss:41984.0     vsize:42324.0   share:14836.0
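
One thing the examples above do show: in every case ps "sz" times 4 equals ps "vsize", which suggests sz is counted in 4 KiB pages and vsize in KiB. The Python sketch below checks that and prints how many times larger the ps vsize is than the top vsize. The numbers are copied from the examples; the 4 KiB page size and the names `samples` and `inflation` are assumptions made for illustration.

```python
PAGE_KIB = 4  # assumption: 4 KiB pages, consistent with the figures below

# (name, pid, ps vsize, ps sz, top vsize), copied from the examples above
samples = [
    ("wine-preloader", 21951, 2367920, 591980, 872),
    ("wine-preloader", 6990, 2367924, 591981, 1292),
    ("wine-preloader", 24682, 2367912, 591978, 832),
    ("java", 24885, 680912, 170228, 10088),
    ("java", 19015, 1253548, 313387, 42324),
]

def inflation(ps_vsize, top_vsize):
    """How many times larger the ps vsize is than the top vsize."""
    return ps_vsize / top_vsize

for name, pid, ps_vsize, ps_sz, top_vsize in samples:
    pages_match = ps_sz * PAGE_KIB == ps_vsize
    print("%s(%d): sz*4 == vsize: %s, ps/top vsize ratio: %.1f"
          % (name, pid, pages_match, inflation(ps_vsize, top_vsize)))
```

A ratio threshold like this could be used in the monitor to flag processes whose ps numbers should not be trusted, though what threshold makes sense is a judgment call.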

TRS is probably text resident set size.
DRS is probably data resident set size.

Looking more closely, I found that (under Redhat3, Linux 2.4) /proc/$PID/statm shows the correct vmsize number, while /proc/$PID/statm shows the grossly inflated vmsize number.

Under RH5 (Linux 2.6) both files agree on the inflated number.
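
For anyone who wants to repeat the check, statm can be read directly rather than through ps or top. Below is a minimal Python sketch, assuming only what proc(5) documents: the first two fields of /proc/$PID/statm are total program size and resident set size, both in pages. The function names are made up, and the sample line is partly invented: its first two values are the page counts matching the wine-preloader(21951) ps figures above, and the remaining fields are placeholders.

```python
import os

def parse_statm(text, page_kib):
    """Return vsize and rss in KiB from the contents of a statm file.

    Only the first two fields (size, resident) are used; both are in pages.
    """
    fields = text.split()
    return {
        "vsize_kib": int(fields[0]) * page_kib,
        "rss_kib": int(fields[1]) * page_kib,
    }

def read_statm(pid="self"):
    """Read /proc/<pid>/statm on a Linux host (not called in this example)."""
    page_kib = os.sysconf("SC_PAGE_SIZE") // 1024
    with open("/proc/%s/statm" % pid) as f:
        return parse_statm(f.read(), page_kib)

# Partly made-up statm line: 591980 and 121 pages correspond to the
# ps sz and rss values for wine-preloader(21951); the rest are placeholders.
example = parse_statm("591980 121 0 0 0 0 0", page_kib=4)
print("vsize: %d KiB, rss: %d KiB" % (example["vsize_kib"], example["rss_kib"]))
```

Note that "vsize minus rss" from this file is at best an upper bound on swapped pages, since address space that was mapped but never touched is counted in vsize without being either resident or in swap.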