On one of our systems (AIX 5) I am intermittently seeing paging activity in vmstat.
I want to know which process is causing the paging.
I understand that I would first need to find out which process is consuming the most memory.
1) Is that right?
2) How do I find that out?
3) By googling I found the following command, but I am not sure which field to look at in the output and how to get it sorted by memory usage (high to low):
svmon -P -O summary=basic,unit=MB
From the svmon documentation (continued description of the valid values for the options parameter):
* sortentity = [ inuse | pin | pgsp | virtual ]
<snip>
inuse
Sorts the reports in decreasing order of real memory consumption
pin
Sorts the reports in decreasing order of pinned memory consumption
pgsp
Sorts the reports in decreasing order of paging space consumption
virtual
Sorts the reports in decreasing order of virtual memory consumption
svmon -P -O summary=basic,unit=MB,sortentity=pgsp
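If the list is long, the -t flag can be combined with the sort to limit the report to the top consumers (a sketch; it is the same -t used in the -S example further down):

# Top 10 processes by paging space consumption, reported in MB
svmon -P -t 10 -O summary=basic,unit=MB,sortentity=pgsp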
If installed, I would use nmon to monitor the top processes (Option: t).
The attached file contains the actual output from the system where paging is still happening,
and I have the following queries on that:
1) In the output of the following command, many processes show Pgsp>0. In that case, can we say that all the processes with Pgsp>0 are 'causing' the paging on the system?
If not: a) what could be causing the paging? b) and what does this paging space usage indicate?
2) A few processes show Pgsp>0 with "svmon", but with "ps aux" those same processes constantly show %MEM as 0.0, e.g. PID=5677206 below.
What could be the reason?
3) I understand RSS denotes memory used; does it have a direct connection with paging?
4) Could you please advise on the relation between Inuse, Pgsp and Virtual?
I am sorry that I am asking so many questions in my reply, but I have been stuck in similar situations in the past too and could not find a solution.
For Paging Space related impacts, only the output of svmon is relevant, as -=Xray=- stated.
RSS is not related to what is or is not being paged out. ps is not useful for analysing this - stick to svmon.
The interesting entries are those that have lots of pages paged out, especially if they grow, i.e. page out even more. Use of Paging Space should be avoided, as it usually thrashes a system and makes it slow, up to the point of being unusable. Since Paging Space is usually located in the rootvg, on the same disks/volumes where the rest of the operating system resides, it slows down general performance, because disks are far slower than RAM.
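If you want to check where your paging spaces live and how full they are, lsps should show that (a quick sketch; the column layout may vary slightly by AIX level):

# One line per paging space: volume group, physical volume, size, %used
lsps -a
# Aggregate view of total paging space size and usage
lsps -s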
Anyway, it seems a bigger amount of memory has been allocated for Oracle (SGA?) than you have real memory available.
That might be the reason so many pages are allocated by Oracle.
In your vmstat output you sometimes see counts in the pi and po columns, which indicate that pages are being read from or written to Paging Space - something you want to avoid.
High values there are usually the real problematic impact, where you and your users might "feel" the slowness of the system and of the applications on top of it, i.e. anything depending on the Oracle DB running there.
I recommend checking the memory settings of your Oracle DB (SGA?) and adjusting them so that it uses no more than about 80% of your ~57 GB of RAM, i.e. roughly 45 GB. Don't count the Paging Space towards that.
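To sanity-check the numbers on the box itself, something like this should do (a sketch; lsattr reports realmem in KB):

# Real memory in KB as seen by the kernel
lsattr -El sys0 -a realmem
# ~57 GB is 59768832 KB; 80% of that is ~45.6 GB, so the SGA plus the
# rest of Oracle's memory should stay comfortably below that figure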
It could be that there is not that much pi/po traffic during the day, but the space seems to have been allocated at some point, which could well have been at night time.
Maybe some RMAN backups or whatever. The best thing is to set up some long-term monitoring, with nmon for example, to check out what causes this.
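For the long-term recording, something along these lines should work (a sketch; check the flags of your nmon version):

# Write snapshots to a file every 60 seconds, 1440 snapshots = 24 hours;
# -t includes top-process statistics in the recording
nmon -f -t -s 60 -c 1440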
Inuse: (how many memory pages are in use)
Pin: (how many memory pages are pinned)
Pgsp: (how many memory pages reside on paging space)
Virtual: (how many memory pages are in use without program text)
Roughly, for a process's working storage, Virtual is what has been allocated, Inuse is the part currently resident in real memory, and Pgsp is the part that has been pushed out to paging space.
I remember that in the past allocating 80-85% of memory for the SGA caused paging, and reducing the SGA resolved it.
However, in this case I checked the SGA, which is 38 GB - about 67% of the ~57 GB of RAM. So that might not be the problem, right?
To clear up my understanding, I would like to ask part of the above questions here again:
1) In the output of the following command, many processes show Pgsp>0. In that case, can we say that all the processes with Pgsp>0 are 'causing' the paging on the system?
I mean, are these processes 'the cause'?
svmon -P -O summary=basic,unit=MB,sortentity=pgsp
2) A few processes show Pgsp>0 with "svmon", but with "ps aux" those same processes constantly show %MEM as 0.0, e.g. PID=5677206 below.
What could be the reason?
And since, as you have thankfully pointed out, RSS is not related to paging, I have a related question:
If there is no paging but a performance issue is reported on the server, can we sort processes by RSS as one of the checks to find the memory consumers?
I mean, is it correct to link RSS with memory consumption?
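For example, something like this is what I have in mind (just a sketch; on our AIX box ps aux prints RSS in the sixth column, so the column number may need adjusting):

# Print the header line, then the processes sorted by RSS, largest first
ps aux | head -1
ps aux | sort -rn -k6 | head -10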
No. Processes which have not been active for a (long) time will be moved to the paging space device. At the time the VMM moves out the memory allocated by such a process, you will see page-out operations. If the process remains inactive, you will see only paging space being used, with no page-out or page-in operations. When the process becomes active again (e.g. an IO finished, no more lock on a requested resource, etc.), you will see page-in operations.
Maybe this process is currently not active. So have a look at the %CPU and TIME columns for this process.
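For instance (a sketch using standard ps -o format specifiers):

# CPU share, accumulated CPU time and command name for the suspect PID
ps -p 5677206 -o pid,pcpu,time,comm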
Another recommendation for nmon. Learn the hotkeys in it and you can get a wealth of info, including process memory usage. svmon, however, is more purpose-built for your desired use, and is endorsed as the proper solution on this IBM official page: Help - AIX 7.1 Information Center
I was trying to show changes in the Pgsp column - as that is what you are looking for.
root@x134:[/]svmon -S -t 10 -O filterprop=data,sortentity=pgsp -i 5
Unit: page
Vsid Esid Type Description PSize Inuse Pin Pgsp Virtual
2002 - work kernel heap m 2136 1098 0 2136
8002 - work kernel segment m 583 527 0 583
e000 - work mbuf pool m 526 526 0 526
a100 - work kernel heap m 402 205 0 402
9000 - work other kernel segments m 338 0 0 338
7405d - work other kernel segments s 5120 5120 0 5120
854015 - work m 206 0 0 206
14005 - work other kernel segments s 2152 0 0 2152
d001 - work other kernel segments m 114 0 0 114
80c2a3 - work s 1388 0 0 1388
Unit: page
Vsid Esid Type Description PSize Inuse Pin Pgsp Virtual
2002 - work kernel heap m 2136 1098 0 2136
8002 - work kernel segment m 583 527 0 583
e000 - work mbuf pool m 526 526 0 526
a100 - work kernel heap m 402 213 0 402
9000 - work other kernel segments m 338 0 0 338
7405d - work other kernel segments s 5120 5120 0 5120
854015 - work m 206 0 0 206
14005 - work other kernel segments s 2152 0 0 2152
d001 - work other kernel segments m 114 0 0 114
80c2a3 - work s 1388 0 0 1388
root@x134:[/]
Use a larger number than 5 (seconds) - in a real situation I would use at least 60.
So, once you see some regular changes in the Pgsp column, you have the segment ID (Vsid). Use svmon -P to capture everything to a file and then find the process(es) that are using that segment - see the sketch below.
(Also try having vmstat -I -w NN running in the background to watch for pi/po activity as well.)
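Something along these lines (a sketch; the Vsid 80c2a3 is just taken from the sample output above, and on AIX grep -p prints the whole paragraph around a match, which conveniently includes the owning process header):

# Capture the full per-process report once the segment shows activity
svmon -P > /tmp/svmon_P.out
# Find the process section(s) that reference the suspect segment ID
grep -p 80c2a3 /tmp/svmon_P.out
# Meanwhile, log paging activity every 60 seconds in the background
nohup vmstat -I -w 60 > /tmp/vmstat.out 2>&1 &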