Which Process is causing Paging?

Hello

On one of our systems (AIX 5) I am intermittently seeing paging in vmstat.
I want to know which process is causing the paging.

I understand that I would first need to find out which process is consuming the most memory.
1) Is that right?
2) How do I find that out?

3) By googling I found the following command, but I am not sure which field to look at in the output or how to get it sorted by memory usage (high to low):
svmon -P -O summary=basic,unit=MB

4) Can topas help in that regard?

Thanks and Regards
Chetanz

Hi Chetanz,

You are on the right track...

 man svmon
      options
            ( Continued description of the valid values for the options parameter).
              *    sortentity = [ inuse | pin | pgsp | virtual ]
<snip>
                     inuse
                          Sorts the reports in decreasing order of real memory consumption
                     pin
                          Sorts the reports in decreasing order of pinned memory consumption
                     pgsp
                          Sorts the reports in decreasing order of paging space consumption
                     virtual
                          Sorts the reports in decreasing order of virtual memory consumption
svmon -P -O summary=basic,unit=MB,sortentity=pgsp

If installed, I would use nmon to monitor the top processes (Option: t).

PS
In this context you should have a look at the AIX VMM page replacement algorithm.
http://www.ibm.com/developerworks/aix/library/au-vmm/

vmo -h lru_file_repage

Hello -=XrAy=-

Many Thanks for your reply & help

The attached file contains the actual output from the system where paging is still happening,
and I have the following questions about it:

1) In the output of the following command many processes show Pgsp>0. In that case, can we say that all processes with Pgsp>0 are 'causing' the paging on the system?

svmon -P -O summary=basic,unit=MB,sortentity=pgsp

If not, a) what could be causing the paging, and b) what does this paging space value indicate?

2) A few processes show Pgsp>0 in "svmon", but "ps aux" constantly shows their %MEM as 0.0, e.g. PID 5677206 below.
What could be the reason?

3) I understand that RSS denotes the memory used. Does it have a direct connection with paging?

4) Could you please advise on the relation between inuse, pgsp and virtual?
I am sorry that I am asking so many questions in my reply, but I have been stuck in a similar situation in the past and could not find a solution.

Thanks and Regards
Chetanz

Please find the actual output in the attached file.

For Paging Space related impacts, only the output of svmon is relevant, as -=Xray=- stated.
RSS is not related to what is or is not being paged out. ps is not useful for analysing this - stick to svmon.

The interesting entries are those that have many pages paged out, especially if they keep growing, i.e. page out even more. Use of Paging Space should be avoided, as heavy paging thrashes a system and makes it slow to the point of being unusable. Since Paging Space is usually located in rootvg, on the same disks/volumes as the rest of the operating system, it also slows down general performance, because disks are far slower than RAM.
Anyway, it seems that more memory has been allocated for Oracle (SGA?) than you have real memory available.
That might be the reason so many pages are allocated by Oracle.

In your vmstat output you sometimes see counts in the pi and po columns, which indicate that pages are being read in from Paging Space (pi) or written out to it (po); this is something you want to avoid.
High values there are usually the real problem: that is where you and your users "feel" the slowness of the system and of the applications on top of it, i.e. everything that depends on the Oracle DB running there.
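If you save the vmstat output to a file, a small script can pick out the samples with pi/po traffic for you. This is only a rough Python sketch, not an AIX tool: it assumes the header line names the columns "pi" and "po" and that every sample row has as many fields as the header. The third row of the sample text is made up so that there is something to find.

```python
# Sketch: list vmstat samples that show paging-space traffic (pi or po > 0).
# Assumption: the header row names the columns "pi" and "po".

SAMPLE = """\
  r   b   p        avm        fre    fi    fo    pi    po    fr     sr    in     sy    cs us sy id wa    pc    ec
  1   0   0     112474     104792     0     0     0     0     0      0   248 694475  3644 63 30  7  0  0.92 920.2
  2   0   0     113891       3486  1346  2707     0     0   547   3611   178 633671  1304 49 47  4  0  0.52 518.3
  2   0   0     113904       3392  1728  4021    12   340  5829  11159   267 212343  2064 56 37  7  0  0.51 507.3
"""  # the pi/po values in the last row are invented for illustration


def paging_samples(vmstat_text):
    """Return (sample_no, pi, po) for every sample with pi > 0 or po > 0."""
    header = None
    hits = []
    sample_no = 0
    for line in vmstat_text.splitlines():
        fields = line.split()
        if "pi" in fields and "po" in fields:
            header = fields  # remember the column layout
            continue
        if header is None or len(fields) != len(header):
            continue  # banner, separator or blank line
        if not all(f.replace(".", "", 1).isdigit() for f in fields):
            continue  # not a numeric sample row
        sample_no += 1
        pi = int(fields[header.index("pi")])
        po = int(fields[header.index("po")])
        if pi > 0 or po > 0:
            hits.append((sample_no, pi, po))
    return hits


for sample_no, pi, po in paging_samples(SAMPLE):
    print(f"sample {sample_no}: pi={pi} po={po}")
```

Looking the columns up by name rather than by position means the same function should cope with both plain vmstat and vmstat -I layouts, as long as the header is part of the captured text.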

I recommend checking the memory settings of the Oracle DB (SGA?) and adjusting them so that it uses no more than about 80% of your ~57GB RAM. Don't count the Paging Space in for that.
It could be that during the day there is not much pi/po traffic, but it seems the space was allocated at some point, which could just as well have been at night.
Maybe some RMAN backups or whatever. Best is to set up some long-term monitoring, with nmon for example, to find out what causes this.
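To make the 80% rule of thumb concrete, here is the arithmetic as a tiny Python sketch. The function name and the sga_gb argument are only illustrative; put in whatever SGA size your instance actually reports. The 0.80 fraction is the rough guideline from above, not a hard limit.

```python
# Sketch: compare an SGA size with ~80% of real memory (Paging Space excluded).

def sga_within_budget(ram_gb, sga_gb, fraction=0.80):
    """Return (limit_gb, fits) where limit_gb = ram_gb * fraction."""
    limit_gb = ram_gb * fraction
    return limit_gb, sga_gb <= limit_gb


# 57 GB RAM as on this system; 38 GB is used as an example SGA size.
limit, fits = sga_within_budget(57, 38)
print(f"limit {limit:.1f} GB, SGA fits: {fits}")
```

With 57 GB of RAM the limit works out to about 45.6 GB.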

Inuse: (how many memory pages are in use)
Pin: (how many memory pages are pinned)
Pgsp: (how many memory pages reside on paging space)
Virtual: (how many memory pages are in use without program text)

Please have a look at the following nice articles:
VMM concepts
Some notes regarding memory leak

Regards

Hello zaxxon

SGA check is a good pointer! Many thanks for that

I remember that in the past, allocating 80-85% of memory to the SGA caused paging, and reducing the SGA resolved it.
However, in this case I checked the SGA, which is 38GB. So that might not be the problem, right?

To clarify my understanding, I would like to ask some of the above questions again:

1) In the output of the following command many processes show Pgsp>0. In that case, can we say that all processes with Pgsp>0 are 'causing' the paging on the system?
I mean, are these processes 'the cause'?

svmon -P -O summary=basic,unit=MB,sortentity=pgsp

2) A few processes show Pgsp>0 in "svmon", but "ps aux" constantly shows their %MEM as 0.0, e.g. PID 5677206 below.
What could be the reason?

And since you have helpfully pointed out that RSS is not related to paging, I have a related question:
If there is no paging but a performance issue is reported on the server, can we "sort" processes by RSS to find the top memory consumers as one of the checks?

I mean, is it correct to link RSS and memory consumption?

Thanks and Regards
Chetanz

No. Processes which have not been active for a (long) time will be moved to the paging space device. At the time the VMM moves the memory allocated by such a process, you will see page-out operations. While the process stays inactive, you will see only paging space in use and no page-out or page-in operations. When the process becomes active again (e.g. I/O finished, no more lock on a requested resource, etc.), you will see page-in operations.

Maybe this process is currently not active. So have a look at the %CPU and TIME columns for this process.

Have you tried running nmon?

Another recommendation for nmon. Learn the hotkeys in it and you can get a wealth of info, including process memory usage. svmon, however, is more purpose-built for your desired use, and is endorsed as the proper solution on this IBM official page:
Help - AIX 7.1 Information Center

I tried very hard to get my system to do some paging, but failed.

System configuration: lcpu=2 mem=1037MB ent=0.10

   kthr            memory                         page                       faults                 cpu          
----------- --------------------- ------------------------------------ ------------------ -----------------------
  r   b   p        avm        fre    fi    fo    pi    po    fr     sr    in     sy    cs us sy id wa    pc    ec
  1   0   0     112474     104792     0     0     0     0     0      0   248 694475  3644 63 30  7  0  0.92 920.2
  2   0   0     112297     104969     0     0     0     0     0      0   127 1559231  1965 58 39  3  0  0.95 951.8
  1   1   0     112215      95572   733  1107     0     0     0      0   398 1348343  1639 39 51  9  1  0.83 825.8
  2   1   0     112343      73591  1022  3262     0     0     0      0   237 793101  1657 49 42  5  4  0.69 691.9
  3   1   0     113415      44813  1330  4687     0     0     0      0   213 338912  1621 59 37  4  0  0.47 466.9
  1   1   0     113538      37040   327  1818     0     0     0      0   301 1204270  1495 40 49  8  4  0.74 735.0
  2   0   0     113697      19961  1191  2200     0     0     0      0   314 1208027  1580 43 48  8  0  0.77 774.1
  2   0   0     113891       3486  1346  2707     0     0   547   3611   178 633671  1304 49 47  4  0  0.52 518.3
  2   0   0     113904       3392  1728  4021     0     0  5829  11159   267 212343  2064 56 37  7  0  0.51 507.3
  2   0   0     113952       3442  1023  3449     0     0  4770   6121   253 1088084  1550 46 48  5  0  0.73 727.4
  3   0   0     113952       3489  1737  4020     0     0  6452   8365   245 901676  1772 54 45  1  0  0.69 686.1
  2   0   0     114160       4092   715  4090     0     0  4127   4246   162 1132349  1163 43 49  8  0  0.74 736.6
  1   0   0     114823       4450   505   280     0     0  1592   1592   792  58405  2094 27 51 22  0  0.25 246.4
  1   0   0     115271       3206   204   172     0     0   506    506   464 1377444  1177 36 51 13  0  0.82 819.7
  1  16   0     115431       3793   608  1745     0     0  2573   2758   255 1336914  1006 40 50  8  1  0.81 810.7
  2   0   0     115496       3560  1433  3027     0     0  4732   7179   277 655654  1657 48 45  7  0  0.61 605.9
  1   0   0     115512       3561  2047  2584     0     0  4766   6114   259 608363  1519 47 45  8  0  0.59 589.4
  2   0   0     115614       3926  1700  1976     0     0  3379   5043   274 1219711  1444 41 50  9  0  0.79 792.1

I was trying to show changes in the Pgsp column, as that is what you are looking for.

root@x134:[/]svmon -S -t 10 -O filterprop=data,sortentity=pgsp -i 5
Unit: page

    Vsid      Esid Type Description              PSize  Inuse   Pin Pgsp Virtual
    2002         - work kernel heap                  m   2136  1098    0    2136
    8002         - work kernel segment               m    583   527    0     583
    e000         - work mbuf pool                    m    526   526    0     526
    a100         - work kernel heap                  m    402   205    0     402
    9000         - work other kernel segments        m    338     0    0     338
   7405d         - work other kernel segments        s   5120  5120    0    5120
  854015         - work                              m    206     0    0     206
   14005         - work other kernel segments        s   2152     0    0    2152
    d001         - work other kernel segments        m    114     0    0     114
  80c2a3         - work                              s   1388     0    0    1388
Unit: page

    Vsid      Esid Type Description              PSize  Inuse   Pin Pgsp Virtual
    2002         - work kernel heap                  m   2136  1098    0    2136
    8002         - work kernel segment               m    583   527    0     583
    e000         - work mbuf pool                    m    526   526    0     526
    a100         - work kernel heap                  m    402   213    0     402
    9000         - work other kernel segments        m    338     0    0     338
   7405d         - work other kernel segments        s   5120  5120    0    5120
  854015         - work                              m    206     0    0     206
   14005         - work other kernel segments        s   2152     0    0    2152
    d001         - work other kernel segments        m    114     0    0     114
  80c2a3         - work                              s   1388     0    0    1388
root@x134:[/]

Use a larger number than 5 (seconds) - in a real situation I would use at least 60.

So, once you see regular changes in the Pgsp column, you have the segment ID. Use svmon -P to capture everything to a file and then find the process(es) that are using that segment.

(Also try having vmstat -I -w NN running in the background to watch for pi/po activity.)
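If you redirect the svmon -S ... -i NN output to a file, you do not have to compare the intervals by eye. The following is a rough Python sketch, not part of svmon: it assumes each interval starts with a "Unit:" line and that the last four columns of every segment row are Inuse, Pin, Pgsp and Virtual, as in the output above. The Pgsp change in the sample text is invented, since my system never paged.

```python
# Sketch: find segments whose Pgsp value changes between svmon -S intervals.
# Assumption: rows end with "... PSize Inuse Pin Pgsp Virtual".

SAMPLE = """\
Unit: page

    Vsid      Esid Type Description              PSize  Inuse   Pin Pgsp Virtual
    2002         - work kernel heap                  m   2136  1098    0    2136
  854015         - work                              m    206     0    0     206
  80c2a3         - work                              s   1388     0    0    1388
Unit: page

    Vsid      Esid Type Description              PSize  Inuse   Pin Pgsp Virtual
    2002         - work kernel heap                  m   2136  1098    0    2136
  854015         - work                              m    206     0    0     206
  80c2a3         - work                              s   1300     0   88    1388
"""  # the 1300/88 values in the last row are invented for illustration


def parse_snapshots(svmon_text):
    """Return one {vsid: pgsp} dict per interval in the captured output."""
    snapshots = []
    for line in svmon_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Unit:":
            snapshots.append({})  # a new interval starts here
            continue
        if fields[0] == "Vsid" or not snapshots:
            continue  # column header, or text before the first interval
        # Description may be empty or contain spaces, so count from the right.
        if len(fields) < 8 or not all(f.isdigit() for f in fields[-4:]):
            continue
        snapshots[-1][fields[0]] = int(fields[-2])  # Pgsp is second to last
    return snapshots


def pgsp_changes(snapshots):
    """Return {vsid: [pgsp per interval]} for segments whose Pgsp varies."""
    history = {}
    for snap in snapshots:
        for vsid, pgsp in snap.items():
            history.setdefault(vsid, []).append(pgsp)
    return {vsid: h for vsid, h in history.items() if len(set(h)) > 1}


for vsid, values in pgsp_changes(parse_snapshots(SAMPLE)).items():
    print(f"segment {vsid}: Pgsp {values}")
```

The Vsid values it reports are the ones to look for in the svmon -P capture.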

Hope this helps!