I see that our system is paging and paging space usage is growing; at the same time, usage of Comp memory grows and Client memory decreases.
# lsps -s
Total Paging Space Percent Used
65792MB 33%
MEMORY
Real,MB 81920
% Comp 69
% Noncomp 30
% Client 30
Can I assume that when the balance between Comp and Noncomp memory usage shifts and paging space usage increases as well, Client (Noncomp) pages were paged out to paging space?
The main question is: how can I identify which type of pages (Computational or Client) were paged out or paged in? Where can I see such statistics in AIX?
"Overview of AIX page replacement" covers monitoring. If there is no ready-built tool, perhaps you can compare the totals in real time to see what sort of activity is happening.
There are values for computational and non-computational memory in the svmon output, but it shows only what is in real memory, not what is in paging space...
I'm more interested in the page-out rate of computational and non-computational pages.
If I see that non-comp memory decreased, then it was paged out to a file system and not to paging space itself, if I'm not mistaken, but this is still page-out activity. Then why is paging space usage growing? Is it filled with computational pages only?
Can we assume this is an untuned box running AIX 5.3? If so, set lru_file_repage to 0 and the paging will decrease dramatically. There are a bunch of threads here on what else you can do ... if your box is properly tuned, then most likely you are paging non-comp, but to me it doesn't look like it.
Can you show us your vmstat -Iwt 2 10 output from a really busy time on your system?
Your box should not page at all as far as I can see - it should rather scan and free pages when your free list is very small. I had a similar problem just a few weeks ago with one of my AIX 6.1 boxes (and IBM could not see any problem at first). Like you, I had about 25 GB of non-comp memory, which is more than sufficient for most systems - but my box was scanning and freeing itself to death. After I insisted there was a problem, the case was escalated to 3rd level and they provided an efix that solved it immediately and permanently, as this strange 'paging even though there is sufficient memory + excessive scanning / freeing' is a known bug. So my question would be: how high is your scan to free ratio right now?
Maybe it is similar on your box? They gave me PTF U837435
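For anyone unsure what "scan to free ratio" means in practice: it is the `sr` column divided by the `fr` column of a vmstat interval. A minimal sketch follows; the sample numbers are hypothetical and only illustrate the arithmetic, they are not taken from this system.

```python
def scan_to_free_ratio(sr, fr):
    """Pages scanned per page freed; None when nothing was freed."""
    if fr == 0:
        return None
    return sr / fr


# Hypothetical (sr, fr) pairs as they might appear in successive
# vmstat intervals. These values are made up for illustration.
samples = [(12000, 3000), (40000, 2500), (0, 0)]

ratios = [scan_to_free_ratio(sr, fr) for sr, fr in samples]

for (sr, fr), ratio in zip(samples, ratios):
    label = "n/a" if ratio is None else f"{ratio:.1f}"
    print(f"sr={sr:6d} fr={fr:6d} scan:free={label}")
```

The higher the ratio, the more pages the page stealer has to inspect for each page it can actually free.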
From this output, I would only recommend adding 20 GB of memory, and your paging will stop. From your stats you are using close to 100% of memory as computational - no wonder that your box is paging - and yes, this will for sure be DB content as well ... and if your DBAs are doing rman backups on top, it becomes really bad.
What you could try is to mount your Oracle filesystems with the noatime option and switch Oracle to setall - this will put more free memory into your free list and may reduce your memory footprint.
I found in one IBM book that memory needs are virtual + pers + clnt from the svmon output.
So the deficit was 19496715 + 5 + 6447378 - 20971520 = 4972578 pages, and 4972578 * 4 KB = 19890312 KB, which is about 19 GB.
Are these calculations correct?
# svmon -G
               size       inuse        free         pin     virtual   mmode
memory     20971520    20895377       10607     4564324    19496715     Ded
pg space   16842752     5667782

               work        pers        clnt       other
pin         3952074           0           0      612250
in use     14447994           5     6447378

PageSize   PoolSize       inuse        pgsp         pin     virtual
s    4 KB         -    10584481     5480166      811284     9154139
m   64 KB         -      644431       11726      234565      646411
I don't yet understand what "virtual" represents in svmon. The value 19496715 is greater than the paging space and lower than real memory. Can you explain this?
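The deficit arithmetic above can be re-checked with a few lines of Python. This only reproduces the poster's numbers from the svmon -G output; whether "virtual + pers + clnt" is the right sizing formula is the poster's reading of an IBM book and is not verified here.

```python
PAGE_KB = 4  # svmon -G reports these counters in 4 KB pages

# Values copied from the svmon -G output above
virtual = 19496715   # "virtual" column of the memory line
pers = 5             # "pers" in use
clnt = 6447378       # "clnt" in use
real = 20971520      # "size" of real memory

deficit_pages = virtual + pers + clnt - real
deficit_kb = deficit_pages * PAGE_KB
deficit_gib = deficit_kb / (1024 * 1024)

print(f"deficit: {deficit_pages} pages = {deficit_kb} KB = {deficit_gib:.2f} GB")
```

So the page count and the KB figure match; in binary gigabytes the result is just under 19 GB.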
I can show you some graphs captured by nmon for the last 3 days. We are interested in the time from 08:00 to 17:00.
This is from last Tuesday, when the system was paging heavily.
This is from Thursday, when the system almost did not page from 08:00 till 17:00, and as you can see, computational memory was at 75%-80% the whole time.
It seems to me that the system is comfortable when computational memory consumes 80% of the total memory, which is 80 GB. 20% is used by the system, so the application needs 60% of 80 GB, which is 48 GB.
I have a few hundred Oracle boxes - in my experience the systems are most comfortable when comp (the avm value in vmstat x 4k) doesn't exceed 80%, as this leaves enough memory for all the Oracle forked processes, IO buffering, batch processing and so on.
When my memory utilization exceeds this 80%, my system starts scanning / freeing memory, which uses cpu and slows down the DB as the system waits for enough memory to be freed up to continue processing - which obviously is bad. The higher the scan to free ratio - that is, the more pages need to be scanned to free up the memory I actually need for the given workload - the slower the system gets and the more cpu is used. So I make sure I always have plenty of memory - particularly for Oracle the need for non-comp memory is very valid, as it is usually a filesystem-based DB - and not finding file cache when needed slows down the DB too, as no IO can happen ...
Please note - during rman backups you still will see some scan and free as this puts - at least in my environments - a large amount of additional load onto the systems. So my 80% are during busy times but not when rman runs. Nmon is pretty helpful to find out what is good for your system and when you do have your busy times.
Virtual memory, by the way, is physical memory + paging space, counted in 4k pages. Virtual memory in use is how much of this you are actively using - ideally visibly less than you physically have.
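To put a number on the 80% rule of thumb for this box, here is a small sketch. It assumes that the svmon "virtual" figure can stand in for the vmstat avm value (both counted in 4 KB pages); the 80% threshold is the rule of thumb from this post, not an AIX limit.

```python
PAGE_KB = 4

# From the svmon -G output earlier in the thread
real_pages = 20971520      # memory size
virtual_pages = 19496715   # assumed here to be comparable to vmstat avm


def comp_percent(avm_pages, real_pages):
    """Computational memory as a percentage of real memory."""
    return 100.0 * avm_pages / real_pages


pct = comp_percent(virtual_pages, real_pages)

# 80% rule of thumb, kept in integer arithmetic to avoid rounding
threshold_pages = real_pages * 80 // 100
over_pages = virtual_pages - threshold_pages
over_gib = over_pages * PAGE_KB / (1024 * 1024)

print(f"comp = {pct:.1f}% of real memory")
print(f"over the 80% mark by {over_pages} pages (~{over_gib:.1f} GB)")
```

With these numbers the box sits at roughly 93% computational, about 10 GB above the 80% mark.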
Isn't this almost 25 GB of FS data in physical memory while having >20 GB in paging space?
Reducing (forcing down) FS caching would be one way to go in my opinion; why does an Oracle box need to cache 25 GB of FS data?
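The "almost 25 GB" and ">20 GB" figures can be read straight off the svmon -G output posted earlier. A quick sketch of the conversion, assuming all counters are 4 KB pages:

```python
PAGE_KB = 4


def pages_to_gib(pages):
    """Convert a count of 4 KB pages to binary gigabytes."""
    return pages * PAGE_KB / (1024 * 1024)


# Values copied from the svmon -G output in this thread
clnt_in_use = 6447378    # client (file cache) pages in real memory
pgsp_in_use = 5667782    # paging space pages in use
pgsp_size = 16842752     # total paging space pages

fs_cache_gib = pages_to_gib(clnt_in_use)
paged_gib = pages_to_gib(pgsp_in_use)
pgsp_pct = 100.0 * pgsp_in_use / pgsp_size
pgsp_size_mb = pgsp_size * PAGE_KB / 1024

print(f"file cache in RAM : {fs_cache_gib:.1f} GB")
print(f"paging space used : {paged_gib:.1f} GB ({pgsp_pct:.1f}% of {pgsp_size_mb:.0f} MB)")
```

The paging space size works out to 65792 MB, the same total that lsps -s reported at the start of the thread.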
One thing that can run paging up unnecessarily is static linking of apps, something you can change for apps developed locally. All apps using dynamically linked libs share the same pages, not copies, and since those pages are referenced more frequently, they stay in RAM.
We used to get a lot of RDBMS out of a small platform by designing batch processes to process N records at a time and then commit. We also found that over-use of updatable cursors increased processing. The way Oracle works, a long select can end up owning many pages as other processes update or delete those rows. So, it helps the whole system to do things in small batches, and even in select programs, a commit may release pages tied up by update-capable cursors. If you think about it, even when processing 128 records per commit, you get 99+% of any economy of scale over one at a time. Any locks are released sooner, so interactive work can get access. As batches get smaller, working set pages are in RAM or CPU cache more often, and finished pages can roll out and not soon return. Smaller batches are also less likely to overwhelm caching and buffering in disk subsystems, which would slow I/O to media speed. Also, the system tuning does not change on more active days, just the batch run time.
Interactive row sets tend to be small, but batch can bring a lot of pages, an unpredictable number, into play at once.
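As an illustration of the "process N records, then commit" pattern, here is a minimal sketch. It uses Python's built-in sqlite3 as a stand-in for Oracle so that it runs anywhere; the `items` table, the `process_in_batches` helper and the batch size of 128 are hypothetical and only there for the example.

```python
import sqlite3


def process_in_batches(conn, rows, batch_size=128):
    """Insert rows, committing after every batch_size records.

    Returns the number of commits issued.
    """
    commits = 0
    pending = 0
    for row in rows:
        conn.execute("INSERT INTO items (id, payload) VALUES (?, ?)", row)
        pending += 1
        if pending == batch_size:
            conn.commit()   # release locks and let finished pages go
            commits += 1
            pending = 0
    if pending:             # commit the final partial batch
        conn.commit()
        commits += 1
    return commits


# In-memory database, purely for demonstration
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, payload TEXT)")
conn.commit()

rows = ((i, f"row-{i}") for i in range(1000))
n_commits = process_in_batches(conn, rows)
n_rows = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
print(f"{n_rows} rows inserted in {n_commits} commits")
```

With a real Oracle driver the loop would look much the same; only the connection and the SQL dialect change.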
And the number of "external pager filesystem I/Os blocked with no fsbuf" keeps growing, by 20-30 blocked operations every minute.
# while true; do date; vmstat -v | grep external; sleep 10; done
Wed 9 Mar 11:19:24 2011
257815036 external pager filesystem I/Os blocked with no fsbuf
Wed 9 Mar 11:19:34 2011
257815036 external pager filesystem I/Os blocked with no fsbuf
Wed 9 Mar 11:19:44 2011
257815036 external pager filesystem I/Os blocked with no fsbuf
Wed 9 Mar 11:19:54 2011
257815036 external pager filesystem I/Os blocked with no fsbuf
Wed 9 Mar 11:20:04 2011
257815036 external pager filesystem I/Os blocked with no fsbuf
Wed 9 Mar 11:20:14 2011
257815045 external pager filesystem I/Os blocked with no fsbuf
Wed 9 Mar 11:20:24 2011
257815058 external pager filesystem I/Os blocked with no fsbuf
Wed 9 Mar 11:20:34 2011
257815058 external pager filesystem I/Os blocked with no fsbuf
Wed 9 Mar 11:20:44 2011
257815087 external pager filesystem I/Os blocked with no fsbuf
Wed 9 Mar 11:20:54 2011
257815087 external pager filesystem I/Os blocked with no fsbuf
Wed 9 Mar 11:21:04 2011
257815087 external pager filesystem I/Os blocked with no fsbuf
Wed 9 Mar 11:21:14 2011
257815087 external pager filesystem I/Os blocked with no fsbuf
Wed 9 Mar 11:21:24 2011
257815087 external pager filesystem I/Os blocked with no fsbuf
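To turn the counter samples above into a rate, a short sketch like the following can be used. The timestamps and counter values are copied from the output above; the script itself is only an illustration and assumes all samples fall on the same day.

```python
from datetime import datetime

# (time, counter) pairs copied from the vmstat -v loop output above
samples = [
    ("11:19:24", 257815036), ("11:19:34", 257815036),
    ("11:19:44", 257815036), ("11:19:54", 257815036),
    ("11:20:04", 257815036), ("11:20:14", 257815045),
    ("11:20:24", 257815058), ("11:20:34", 257815058),
    ("11:20:44", 257815087), ("11:20:54", 257815087),
    ("11:21:04", 257815087), ("11:21:14", 257815087),
    ("11:21:24", 257815087),
]


def parse(ts):
    """Parse an HH:MM:SS timestamp (same day assumed)."""
    return datetime.strptime(ts, "%H:%M:%S")


elapsed = (parse(samples[-1][0]) - parse(samples[0][0])).total_seconds()
blocked = samples[-1][1] - samples[0][1]
per_minute = blocked * 60 / elapsed

# Growth between consecutive 10-second samples
deltas = [b[1] - a[1] for a, b in zip(samples, samples[1:])]

print(f"{blocked} blocked I/Os in {elapsed:.0f} s = {per_minute:.1f} per minute")
print("per-interval growth:", deltas)
```

The growth comes in bursts rather than as a steady trickle, which is worth keeping in mind when comparing before and after a tuning change.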
I'm going to increase the value of j2_dynamicBufferPreallocation, which is currently 16.
Can you suggest how to determine what I should set j2_dynamicBufferPreallocation to?
That largely depends on your workload - on our systems it is usually 128 or 256.
You should try to find out which volume group needs all the filesystem buffers that you don't have - you can then add buffers via the lvmo command to just this volume group ...