Memory usage in AIX server

Hi All,

I have some questions regarding the performance, MEMORY/ Virtual Memory (paging /swap space)

Please see the nmon-MEMORY stats from my AIX LPAR.

24 GB --> RAM   
3456 MB --> Paging Space   
│ Memory ─────────────────────────────────────────────────────────────────────── 
│          Physical  PageSpace |        pages/sec  In     Out | FileSystemCache   
│% Used       97.5%      0.8%  | to Paging Space   0.0    0.0 | (numperm) 64.1%    
│% Free        2.5%     99.2%  | to File System    0.0   24.0 | Process   23.8%    
│MB Used   23973.7MB    26.9MB | Page Scans        0.0        | System     9.7%  
│MB Free     602.3MB  3429.1MB | Page Cycles       0.0        | Free       2.5%  
│Total(MB) 24576.0MB  3456.0MB | Page Steals       0.0        |           ------   
│                              | Page Faults       0.0        | Total    100.0%    
│------------------------------------------------------------ | numclient 64.1%      
│Min/Maxperm     705MB(  3%)  21160MB( 90%) <--% of RAM       | maxclient 90.0%    
│Min/Maxfree     960   1088       Total Virtual   27.4GB      | User      83.7%    
│Min/Maxpgahead    2      8    Accessed Virtual    8.0GB 29.3%| Pinned    14.1%     
│                                                             | lruable pages   6018720.0 
│────────────────────────────────────────────────────────────────────────────────────── 

Question 1)

I see that memory is around 98% utilized all the time, with no paging activity. Is this normal? Do I need to react?
Note: this LPAR runs a very light application and the system is idle most of the time, but RAM is still 98% used.

Question 2)

USED
----
RAM --> 98%  
paging space --> 1%  
file system caching --> 64%  
process (user/app)--> 24%
system --> 9.7% 
 
FREE
----
free --> 2.5%

As per my understanding,
the LPAR will consume its entire memory, even if we allocate 50 GB RAM (in my case) to this LPAR, I believe mostly for filesystem caching. And the system will release memory from the filesystem cache whenever a process needs it.

Is this correct? Please correct me if I'm wrong.

Question 3) What is the RAM to paging space ratio on prod systems?

In my case,   RAM --> 24 GB 
              Paging Space --> 3456 MB

I think I read somewhere that paging space should be at least 3/4 of RAM. Please advise.

Thanks,

Don't panic!

AIX is very good at keeping memory full of anything it believes will be useful. If something is not needed right now, there is no reason to actually scrub that memory until another process is allocated the address to work in. I used to worry about this too. As you suggest, the memory can be used for all sorts of caching. Reading from the cache is faster than reading from disk, but there is no point in paging cached file data out, so it is simply dropped when the memory is needed for something else. When a process then needs the data, it is read from disk as normal, which is no different from paging it back in (unless your disks have wildly different performance). Paging it out would also mean keeping a record that it existed, so it would be more effort for no gain.

Basically, if the server is not paging, then it is fine. :cool:

There used to be a rule of thumb that paging space should be double your RAM. Then that changed (even less formally, if that's possible) to double RAM for the first 2 GB, then 1 GB for the rest of your RAM allocation.
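As a worked example of those two rules of thumb, here is a small Python sketch. It assumes the second rule means "2x for the first 2 GB of RAM, plus 1 GB of paging space per GB of RAM beyond that", which is only one reading of the wording, and the function names are made up for illustration.

```python
def paging_old_rule_gb(ram_gb):
    """Oldest rule of thumb: paging space is double the RAM."""
    return 2 * ram_gb


def paging_revised_rule_gb(ram_gb):
    """Revised rule, as interpreted here (an assumption):
    2x for the first 2 GB of RAM, then 1 GB per GB for the rest."""
    first = min(ram_gb, 2.0)
    rest = max(ram_gb - 2.0, 0.0)
    return 2 * first + rest


ram_gb = 24  # the LPAR from the question
print("old rule:    ", paging_old_rule_gb(ram_gb), "GB")
print("revised rule:", paging_revised_rule_gb(ram_gb), "GB")
```

For the 24 GB LPAR in the question this gives 48 GB and 26 GB, which shows how quickly these old rules get out of hand on machines with a lot of RAM.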

Of course, what you really need to estimate is the total space you will need, and balance the cost against the speed of real memory. Beyond what you can justify paying for, allocate paging space to cover it, and a bit more to be sure. Very inexact though. Have some sort of monitor that checks paging space, or take a regular look with lsps -a, vmstat, sar or other tools, just to keep a check on how much you are really using.

If you are paranoid about crashing the server, allocate lots of extra paging space, but don't try to grow /dev/hd6. If you succeed, you will never be able to reclaim the space without considerable effort. :eek:

If you allocate 2 GB chunks as paging00, paging01, paging02 ....etc., then when your service is settled, you can work out how many to remove.

If you can spare a disk, create a dedicated VG for them in the short term, rather than filling up rootvg and making DR recovery a problem.

I hope that this helps,

Robin
Liverpool/Blackburn
UK

  1. Yes, that's totally normal. Most RAM is being used to cache files, so that access to the slower disks can hopefully be avoided.
  2. Correct, see 1. When and how much memory is freed depends on your tuning parameters. Since AIX 5.3 TL... (forgot it), the default tuning parameters for VMM are good and usually don't need adjustment.
  3. If your system actually does page out to paging space or page in from there, it will be very slow. So either by tuning or by having enough memory, you will try to avoid paging at all costs. Therefore these days you can use a fixed amount of paging space, which may be rather small compared to RAM. I would use a fixed size of 5 GB.
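
To put rough numbers on points 1 and 2, here is a small Python sketch using the nmon figures from the question (24576 MB RAM, 64.1% file cache, 23.8% process, 9.7% system). The variable names are made up for illustration, and treating "process + system" as the memory the workload really needs is a simplification.

```python
# Figures copied from the nmon screen in the question.
total_mb = 24576.0
numperm_pct = 64.1   # FileSystemCache (numperm)
process_pct = 23.8
system_pct = 9.7

# Memory holding cached file pages, which AIX can give back on demand.
cache_mb = total_mb * numperm_pct / 100

# Simplifying assumption: process + system is what the workload needs.
working_mb = total_mb * (process_pct + system_pct) / 100

print(f"file cache:       {cache_mb:.0f} MB")
print(f"process + system: {working_mb:.0f} MB")
```

That works out to about 15750 MB of file cache against about 8230 MB for processes and the system, which is in the same ballpark as the "Accessed Virtual 8.0GB" that nmon reports.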

The first thing I look at, to get a feel for how the system has been behaving since boot, is vmstat -s:
michael@x054:[/home/michael]vmstat -s

            158133157 total address trans. faults
              1263956 page ins
              3733162 page outs
                    0 paging space page ins
                    0 paging space page outs
                    0 total reclaims
             68833738 zero filled pages faults
                92593 executable filled pages faults
                    0 pages examined by clock
                    0 revolutions of the clock hand
                    0 pages freed by the clock
              3246161 backtracks
                    0 free frame waits
                    0 extend XPT waits
              1009416 pending I/O waits
              3108184 start I/Os
              1608563 iodones
            184504298 cpu context switches
             23924465 device interrupts
               431099 software interrupts
             97864862 decrementer interrupts
                    0 mpc-sent interrupts
                    0 mpc-receive interrupts
                    0 phantom interrupts
                    0 traps
           3552750682 syscalls

Further, I prefer the command lsps -s for an accurate view of how much paging space is being used.

michael@x054:[/home/michael]lsps -s
Total Paging Space   Percent Used
      512MB               3%

My rule of thumb is that paging space percent used should be less than 20%. I start very small (512MB for 9G of memory on my current system) because I do not want to see paging activity to/from paging space. Everything above what is needed is just being wasted.
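
A minimal Python sketch of checking that rule of thumb against lsps -s output. It parses the sample shown above as pasted text (on a real system you would feed it the command's output); the function name and the THRESHOLD_PCT constant are made up for illustration.

```python
import re

# Sample `lsps -s` output, copied from the post above.
SAMPLE_LSPS = """\
Total Paging Space   Percent Used
      512MB               3%
"""

THRESHOLD_PCT = 20  # the rule of thumb from the post


def parse_lsps_s(text):
    """Return (total_mb, percent_used) from `lsps -s` style output."""
    for line in text.splitlines():
        match = re.match(r"\s*(\d+)MB\s+(\d+)%", line)
        if match:
            return int(match.group(1)), int(match.group(2))
    raise ValueError("no paging space summary line found")


total_mb, pct_used = parse_lsps_s(SAMPLE_LSPS)
status = "OK" if pct_used < THRESHOLD_PCT else "investigate"
print(f"{total_mb} MB paging space, {pct_used}% used: {status}")
```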
And I definitely disagree with multiple paging spaces to "tune" paging space. If your paging space is active - you have an application "condition", generally a configuration issue. If not, add more memory. Tuning paging space was acceptable back when a large system had 128MB-512MB of memory, and less than 4GB disk space. This is today - forget best practice anno 1994 - at least for AIX.

Further, as said above, AIX caches stuff in memory. Generally there are two categories to worry about: file and computational.

Computational is best compared with legacy *nix memory model while file memory is everything else that AIX caches via virtual memory manager. The premise is i/o to/from memory is faster than i/o to/from disk.

So, on an AIX system it is quite common to see the total writes (page outs) be larger than the reads (page ins) - see above - because the data is written to disk but also stays in memory (cached), so a physical i/o (page in) is not needed when the data is needed later.

Rather than the nmon view, try the topas view

Topas Monitor for host:    x054                 EVENTS/QUEUES    FILE/TTY
Tue Aug 13 19:23:32 2013   Interval: 10         Cswitch     766  Readch  1818.9K
                                                Syscall    9865  Writech  289.8K
CPU  User%  Kern%  Wait%  Idle%                 Reads      4190  Rawin         0
ALL   27.3    5.9    0.1   66.7                 Writes      177  Ttyout       69
                                                Forks         0  Igets         0
Network  KBPS   I-Pack  O-Pack   KB-In  KB-Out  Execs         1  Namei       309
Total   499.7    175.6   395.3    13.1   486.6  Runqueue    5.2  Dirblk        0
                                                Waitqueue   0.0
Disk    Busy%     KBPS     TPS KB-Read KB-Writ                   MEMORY
Total     0.3     50.4     3.0     0.0    50.4  PAGING           Real,MB    9216
                                                Faults      384  % Comp     36
FileSystem        KBPS     TPS KB-Read KB-Writ  Steals        0  % Noncomp  12
Total              1.6K    3.8K   1.5K  42.1    PgspIn        0  % Client   12
                                                PgspOut       0
Name            PID  CPU%  PgSp Owner           PageIn        0  PAGING SPACE
httpd       7929976  12.1  15.5 httpd           PageOut       9  Size,MB     512
named9      2556034   9.8  28.0 root            Sios          9  % Used      2
mysqld      6226088   8.2  22.3 mysql                            % Free     98
db2sysc     7078126   0.2  25.8 ldapdb2         NFS (calls/sec)
topas_nm    6160612   0.1   2.6 root            SerV2         0  WPAR Activ
topas      11862064   0.1   1.5 michael         CliV2         0  WPAR Total
nfsd        4128958   0.1   0.3 root            SerV3         0  Press: "h"-help
ksh         5636284   0.0   0.5 root            CliV3         0         "q"-quit
gil          720922   0.0   0.1 root
db2fmcd     5505196   0.0   1.1 root

On the right side, under MEMORY , you can see the Computational ( Comp ) and other/File memory ( noncomp ) and below that the Paging Space . In the middle is the column Paging .

Depending on how quick I want to see results I set the interval to 5, 15 or 60 seconds ( topas -i ## )

Hope this helps with understanding what to look for!

Thanks much for your response....@ Robin, @ Zaxxon...and @Michael
I really appreciate your ideas.......

As per your comments/ideas, I understand that
it is not a problem/bottleneck if RAM (memory) utilization in AIX goes beyond 99%.
The only issue is when there is paging activity, let's say paging space goes above 20% (page in/out).

In my case, i see lot of Memory utilization 99% all the time
but 1% paging space.
no paging activity.

REAL MEMORY --> 24 GB
% comp --> 29
% non comp --> 64
% client --> 64

paging space
1 % used

Please correct me if i am wrong.

Yes, you are correct. Your system has no problem with memory as long as the following have no activity/are zero:

vmstat : Columns pi and po
topas : Values for PgspIn and PgspOut
nmon : Interactive press m and see fields for to Paging Space
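
A rough Python sketch of the vmstat check. The sample below only imitates the column layout of AIX vmstat interval output, and its numbers are invented for illustration, so treat the whole thing as an assumption to adapt. It finds the pi and po columns from the header line and sums them over the sample rows.

```python
# Hypothetical `vmstat 5 2` style output; all values are invented.
SAMPLE_VMSTAT = """\
kthr    memory              page              faults        cpu
----- ----------- ------------------------ ------------ -----------
 r  b   avm   fre  re  pi  po  fr   sr  cy  in   sy  cs us sy id wa
 1  0 1523456 154200   0   0   0   0    0   0  12  345 210  2  1 97  0
 2  0 1523460 154190   0   0   0   0    0   0  15  402 233  3  1 96  0
"""


def paging_space_activity(text):
    """Sum the pi and po columns over all data rows."""
    pi_idx = po_idx = None
    pi_total = po_total = 0
    for line in text.splitlines():
        fields = line.split()
        if "pi" in fields and "po" in fields:
            # Column header line: remember where pi and po sit.
            pi_idx, po_idx = fields.index("pi"), fields.index("po")
        elif (pi_idx is not None
              and len(fields) > max(pi_idx, po_idx)
              and fields[pi_idx].isdigit()
              and fields[po_idx].isdigit()):
            pi_total += int(fields[pi_idx])
            po_total += int(fields[po_idx])
    return pi_total, po_total


pi, po = paging_space_activity(SAMPLE_VMSTAT)
print("paging space activity" if pi or po else "no paging space activity")
```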

For long-term monitoring, it makes sense to set up nmon in non-interactive mode via cron and have it collect performance data permanently. If problems come up, this data will be very valuable. You can process the data graphically with nmon2rrd.

@Zaxxon,

I did verify the system activity (pi/po and other items, etc.). It is not even paging, so no issues.
And I have a cron job (script) which collects nmon performance data at 30-second intervals and saves it to a .nmon file on a daily basis. I usually run nmon analyzer to view it.
I've never used nmon2rrd. I just found the link online and am going to use the nmon2rrd tool.
thanks for all your help.

I don't have it but google does:
https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/Power%20Systems/page/nmon2rrd

The easy way to know if there has been any paging space activity is to use vmstat -s and look for these lines:

...
                    0 paging space page ins
                    0 paging space page outs
...

Mine are zero because I have rebooted recently, after applying a patch.

FYI: If you see values here that are not zero, it is "normal" for the out values to be (much) larger than the in values. When the in values are larger, this implies that you really need more memory, because pages are being read in and freed again before they can be modified. (Paging space page outs happen only for modified pages; unmodified pages just get freed, i.e., no paging space page out.)

The tools mentioned above are for watching behavior in real time. The vmstat -s counters are historical data: compare the delta daily, e.g., for 24-hour stats.
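
A minimal Python sketch of that daily delta, assuming you save the vmstat -s output once a day. The two snapshots below are invented for illustration, and the function name is hypothetical.

```python
def parse_vmstat_s(text):
    """Map each `vmstat -s` counter description to its integer value."""
    counters = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].isdigit():
            counters[parts[1].strip()] = int(parts[0])
    return counters


# Hypothetical snapshots taken 24 hours apart (numbers are invented).
YESTERDAY = """\
                 1200 paging space page ins
                 5400 paging space page outs
"""
TODAY = """\
                 1250 paging space page ins
                 5900 paging space page outs
"""

before = parse_vmstat_s(YESTERDAY)
after = parse_vmstat_s(TODAY)
delta = {name: after[name] - before[name] for name in after if name in before}

for name, value in delta.items():
    print(f"{value:>10} {name} in the last 24 hours")
```

Because the counters start again from zero at boot, a negative delta simply means the system was rebooted between the two snapshots.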

re: lrud (least recently used daemon, aka page stealer) - the lrud watches memory limits per memory pool - so you can have a lot of page stealer activity even with 20% of memory free (according to nmon/vmstat/topas) if only one pool is consumed.

How many memory pools do you have? Use vmstat -v

 $ vmstat -v
              2359296 memory pages
              2166980 lruable pages
              1203442 free pages
                    1 memory pools
               557005 pinned pages
                 80.0 maxpin percentage
                  3.0 minperm percentage
                 90.0 maxperm percentage
                 13.7 numperm percentage
               297545 file pages
...
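
A small Python sketch that turns those vmstat -v lines into something easier to read. It assumes the page counts are in 4 KB pages; the function and constant names are made up for illustration.

```python
# The `vmstat -v` lines shown above, pasted as text.
SAMPLE_VMSTAT_V = """\
              2359296 memory pages
              2166980 lruable pages
              1203442 free pages
                    1 memory pools
               557005 pinned pages
                 80.0 maxpin percentage
                  3.0 minperm percentage
                 90.0 maxperm percentage
                 13.7 numperm percentage
               297545 file pages
"""

PAGE_KB = 4  # assumption: counts are in 4 KB pages


def parse_vmstat_v(text):
    """Map each description to its numeric value (as float)."""
    stats = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        try:
            stats[parts[1].strip()] = float(parts[0])
        except ValueError:
            continue  # skip lines that do not start with a number
    return stats


stats = parse_vmstat_v(SAMPLE_VMSTAT_V)
total_mb = stats["memory pages"] * PAGE_KB / 1024
free_pct = 100 * stats["free pages"] / stats["memory pages"]
file_pct = 100 * stats["file pages"] / stats["lruable pages"]

print(f"real memory:  {total_mb:.0f} MB")
print(f"free:         {free_pct:.1f}% of memory pages")
print(f"file pages:   {file_pct:.1f}% of lruable pages")
print(f"memory pools: {int(stats['memory pools'])}")
```

With the sample above, the file pages come out at roughly the same figure as the reported numperm percentage (13.7), which is a handy sanity check on the parsing.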