AIX 6.1 memory tuning

Greetings,

I'm wondering if there is a way to determine the minimum memory requirement for the AIX kernel and OS functions. We use the memdetails script from the perfpmr package to see the actual memory allocation, for example:

===========================================================================
Memory accounting summary                           | 4K Pages |  Megabytes 
----------------------------------------------------|----------|-----------
Total memory in system                              | 16842752 |   65792.00 
  Total memory in use                               | 14972379 |   58485.85 
     Kernel identified memory (segids,wlm_hw_pages) |  3498632 |   13666.53 
     Kernel un-identified memory                    |    20694 |      80.83 
     Fork tree pages                                |        0 |       0.00 
     Large Page Pool free pages                     |        0 |       0.00 
     Huge Page Pool free pages                      |        0 |       0.00 
     User private memory                            |  3195171 |   12481.13 
     User shared memory                             |  3918957 |   15308.42 
     User shared library text memory                |   143312 |     559.81 
     Text/Executable code memory in use             |     8841 |      34.53 
     Text/Executable code memory not in use         |   266575 |    1041.30 
     File memory                                    |  3171274 |   12387.78 
     User un-identifed memory                       |   748923 |    2925.48 
     ----------------------                         |          |            
     Total accounted in-use                         | 14972379 |   58485.85 
  Free memory                                       |  1870373 |    7306.14 
  ----------------------                            |          |            
  Total identified (total ident.+free)              | 16073135 |   62785.68 
  Total unidentified (kernel+user w/ segids)        |   769617 |    3006.31 
  ----------------------                            |          |            
  Total accounted                                   | 16842752 |   65792.00 
  Total unaccounted                                 |  1213075 |    4738.57 

You can see there is still some free memory, but also that the AIX kernel uses 13 GB on a 64 GB system. How can I tell what the maximum kernel memory will be? We measured this after a reboot and after starting the applications: the kernel took only 2 GB, but over 4 months it gradually grew to 13 GB. What would happen if I configured the applications to take, let's say, 60 GB of memory? Will AIX handle that and live with 4 GB for the kernel, or will it start thrashing until a reboot is necessary? I can't seem to find any minimum OS requirements, so I don't really know how much memory is available to applications. There are no exact figures; everyone only mentions that it depends on running services, devices, network etc., but there is no hint of how to calculate the maximum potential usage altogether.
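
As a rough sanity check on the figures above, here is a minimal Python sketch that converts the 4K page counts to megabytes and works out the kernel's share of RAM. The numbers are copied from the memdetails summary; the dictionary keys and the helper name `pages_to_mb` are made up for illustration, and rows that show zero pages are left out.

```python
PAGE_SIZE = 4096  # bytes; memdetails counts 4K pages


def pages_to_mb(pages: int) -> float:
    """Convert a count of 4K pages to megabytes (1 MB = 1024 * 1024 bytes)."""
    return pages * PAGE_SIZE / (1024 * 1024)


# Page counts copied from the memdetails summary above (zero rows omitted).
TOTAL_PAGES = 16842752
FREE_PAGES = 1870373
IN_USE = {
    "kernel identified": 3498632,
    "kernel un-identified": 20694,
    "user private": 3195171,
    "user shared": 3918957,
    "shared library text": 143312,
    "text in use": 8841,
    "text not in use": 266575,
    "file": 3171274,
    "user un-identified": 748923,
}

in_use_total = sum(IN_USE.values())
kernel_pages = IN_USE["kernel identified"] + IN_USE["kernel un-identified"]
kernel_share = kernel_pages / TOTAL_PAGES

# In-use categories plus free pages should add up to the total memory.
assert in_use_total + FREE_PAGES == TOTAL_PAGES

print(f"kernel: {pages_to_mb(kernel_pages):.2f} MB ({kernel_share:.1%} of RAM)")
print(f"file:   {pages_to_mb(IN_USE['file']):.2f} MB")
```

The categories do add up to the reported "Total accounted in-use", so the summary is at least internally consistent.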

Has anyone dealt with this before who could help? I have been trying to sign up for AIX performance tuning training for the last 3 years so I could ask these questions, but since I was always the only one who applied, the course was never opened :slight_smile:

I do not know a formula to calculate the minimum for kernel memory.
As you said, the memory usage will grow with time and usage (mbufs, inode cache, jfs bufstructs, etc.), and the increase depends on the maximum installed memory. The kernel uses pinned memory (see vmstat -v), and if your application with 60 GB also uses pinned memory, your server will crash/panic once no more memory can be pinned. That happened to us after two weeks with a wrong Informix memory configuration. Otherwise the server starts to swap out to paging space and performance slows down.

Regards

vmstat -v shows that about 20% of memory pages are pinned (that would roughly correspond to those 13 GB for the kernel). Does that mean the application doesn't use memory pinning (the server is running Oracle+SAP)? svmon tells me that Oracle and the work processes use about 33 MB of pinned memory; perhaps that's the way they are designed. I'll have to check on a test system what happens if you continually increase memory for the application, and how that affects OS behaviour.

             16842752 memory pages
             16281136 lruable pages
              2796643 free pages
                    5 memory pools
              3415001 pinned pages
                 80.0 maxpin percentage
                  3.0 minperm percentage
                 90.0 maxperm percentage
                 13.8 numperm percentage
              2258042 file pages
                  0.0 compressed percentage
                    0 compressed pages
                 13.8 numclient percentage
                 90.0 maxclient percentage
              2258042 client pages
                    0 remote pageouts scheduled
                  237 pending disk I/Os blocked with no pbuf
                    0 paging space I/Os blocked with no psbuf
                 2228 filesystem I/Os blocked with no fsbuf
                    0 client filesystem I/Os blocked with no fsbuf
                22230 external pager filesystem I/Os blocked with no fsbuf
                 70.0 percentage of memory used for computational pages
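
For what it's worth, the pinned share and the remaining headroom below maxpin can be pulled out of that output with a few lines of Python. This is only a sketch: the sample text is an excerpt of the output above, and the function name `parse_vmstat_v` is made up for illustration.

```python
import re

# Excerpt of the `vmstat -v` output shown above.
VMSTAT_V = """\
             16842752 memory pages
             16281136 lruable pages
              2796643 free pages
              3415001 pinned pages
                 80.0 maxpin percentage
                  3.0 minperm percentage
                 90.0 maxperm percentage
                 13.8 numperm percentage
"""


def parse_vmstat_v(text: str) -> dict:
    """Map each label (e.g. 'pinned pages') to its numeric value."""
    stats = {}
    for line in text.splitlines():
        match = re.match(r"\s*(\d+(?:\.\d+)?)\s+(.*\S)\s*$", line)
        if match:
            stats[match.group(2)] = float(match.group(1))
    return stats


stats = parse_vmstat_v(VMSTAT_V)
pinned_pct = 100 * stats["pinned pages"] / stats["memory pages"]
# Pages that could still be pinned before the maxpin limit is reached.
maxpin_pages = stats["maxpin percentage"] / 100 * stats["memory pages"]
headroom_pages = maxpin_pages - stats["pinned pages"]

print(f"pinned: {pinned_pct:.1f}% of real memory")
print(f"headroom below maxpin: {headroom_pages * 4096 / 2**30:.1f} GB")
```

On a live system the same function could be fed the captured output of `vmstat -v` instead of the pasted text.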

hmmm....

First off: Oracle is indeed using "pinned memory", because "pinned memory" is normal memory, but not allowed to be swapped out. Oracle uses it for its "SGA" (system global area) on one hand and for shared memory on the other. If you are interested in the details of allocated shared memory, I suggest you use the ipcs command to analyze which process owns which shared memory segment. I can warmly recommend the man page of ipcs; it is a fantastic read.
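
To illustrate the ipcs suggestion, here is a small Python sketch that sums shared memory segment sizes per owner. Everything in the sample is made up: it assumes the column layout (T, ID, KEY, MODE, OWNER, GROUP, SEGSZ) that `ipcs -m -b` is expected to print, so check it against the real output on your system. The function name `shm_bytes_by_owner` is hypothetical.

```python
from collections import defaultdict

# Made-up sample in the layout `ipcs -m -b` is assumed to print;
# IDs, keys, owners and sizes are illustrative only.
SAMPLE = """\
IPC status from /dev/mem as of Mon Jan  1 00:00:00 2024
T        ID     KEY        MODE       OWNER    GROUP     SEGSZ
Shared Memory:
m   1048576 0x0d05014f --rw-rw-rw-     root   system       1440
m   5242881 0x7a1b2c3d --rw-r-----   oracle      dba 4294967296
m   5242882 0x7a1b2c3e --rw-r-----   oracle      dba 1073741824
m   7340035 0x00002744 --rw-rw----   sapadm   sapsys  536870912
"""


def shm_bytes_by_owner(ipcs_text: str) -> dict:
    """Sum the SEGSZ column per OWNER for shared memory ('m') rows."""
    totals = defaultdict(int)
    owner_col = size_col = None
    for line in ipcs_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "T" and "SEGSZ" in fields:
            # Header row: remember where the columns of interest are.
            owner_col = fields.index("OWNER")
            size_col = fields.index("SEGSZ")
        elif fields[0] == "m" and size_col is not None and len(fields) > size_col:
            totals[fields[owner_col]] += int(fields[size_col])
    return dict(totals)


for owner, size in sorted(shm_bytes_by_owner(SAMPLE).items()):
    print(f"{owner:8s} {size / 2**20:10.1f} MB")
```

Adding `-p` to ipcs should also list the creator and last-attach process IDs, which helps to tie a segment to a specific process.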

Second: yes, the kernel accumulates memory over time, but for a different reason: "file memory" is part of the memory accounted to the kernel too, because the kernel "owns" it, so to speak. When the system starts and hasn't done anything yet, it has no idea what to put into the file cache, so the cache is initially empty. Over time it is filled, and less important things get thrown out in favor of more important ones. The vmo parameters "lru_file_repage", "maxperm" and "minperm" steer the process, and I suggest you read up on the vmo command (which sets these options) to understand the process better.

By the way, as Oracle has its own file caching mechanism built into the SGA, it might be a wise idea to make the SGA bigger and shrink the AIX file cache accordingly. You might also consider changing the maxperm parameter to 97% instead of its current 90%, but this will probably not have a big effect if the shown values are typical for your machine's load.

I hope this helps.

bakunin

In AIX 6.1 lru_file_repage is set to 0 by default. Wouldn't setting this option to 1 cause heavier paging? As you can see, almost all pages are lruable, so I guess this would have a performance impact. Also, restricted options like lru_file_repage are not mentioned in the manual pages.

About the kernel using file memory: file memory isn't only used by the kernel, is that correct? Currently there is more file memory than kernel memory in use:

===========================================================================
Memory accounting summary                           | 4K Pages |  Megabytes 
----------------------------------------------------|----------|-----------
Total memory in system                              | 16842752 |   65792.00 
  Total memory in use                               | 13098163 |   51164.69 
     Kernel identified memory (segids,wlm_hw_pages) |  4051924 |   15827.82 
     Kernel un-identified memory                    |    18006 |      70.33 
     Fork tree pages                                |        0 |       0.00 
     Large Page Pool free pages                     |        0 |       0.00 
     Huge Page Pool free pages                      |        0 |       0.00 
     User private memory                            |  1379905 |    5390.25 
     User shared memory                             |  1127620 |    4404.76 
     User shared library text memory                |   141072 |     551.06 
     Text/Executable code memory in use             |     8880 |      34.68 
     Text/Executable code memory not in use         |   279924 |    1093.45 
     File memory                                    |  5696733 |   22252.86 
     User un-identifed memory                       |   394099 |    1539.44 
     ----------------------                         |          |            
     Total accounted in-use                         | 13098163 |   51164.69 
  Free memory                                       |  3744589 |   14627.30 
  ----------------------                            |          |            
  Total identified (total ident.+free)              | 16430647 |   64182.21 
  Total unidentified (kernel+user w/ segids)        |   412105 |    1609.78 
  ----------------------                            |          |            
  Total accounted                                   | 16842752 |   65792.00 
  Total unaccounted                                 |  1277893 |    4991.76 
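
To put the two memdetails summaries side by side, here is a short Python sketch that shows how kernel, file and free memory changed between them. It assumes the summaries were taken in the order they were posted; the dictionary keys and the function name `growth_mb` are made up for illustration.

```python
PAGES_PER_MB = 256  # 256 pages of 4K make one megabyte

# Page counts copied from the two memdetails summaries in this thread.
first = {"kernel identified": 3498632, "file": 3171274, "free": 1870373}
second = {"kernel identified": 4051924, "file": 5696733, "free": 3744589}


def growth_mb(before: dict, after: dict) -> dict:
    """Change in megabytes for every category present in `before`."""
    return {name: (after[name] - before[name]) / PAGES_PER_MB for name in before}


growth = growth_mb(first, second)
for name, delta in growth.items():
    print(f"{name:18s} {delta:+10.2f} MB")
```

Between these two snapshots file memory grew by far more than kernel-identified memory did.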

It's being purged once in a while (I can't post the URL because I don't have 5 posts yet, so I might add it in a future post :slight_smile: )

I will explore ipcs to see if I find something useful, thanks for the steer.

Maybe, but this was a slight misunderstanding: I was NOT suggesting to set it to 1, I was just mentioning it as an influence on how the kernel uses and allocates memory.

You might want to read more about how this works by searching for "least recently used [daemon]". This is what "lru" and "lrud" respectively stand for.

"lruable" and "pinned" are two different things: "pinned" means the memory cannot get swapped out. AIX was, for the longest time, using an "early swap allocation" approach to swap space. As soon as a process started, the amount of swap it would need if it were swapped out completely was calculated, and that much swap was allocated immediately. This is why some software manufacturers still insist on swap space being "two (or even three!) times the size of the memory plus 512MB" or similar. A 4.3.3 system with 70% swap used was not necessarily choking at all, but could well be perfectly sized and tuned.

Beginning with 5.0 and 5.1 this was changed to a "late swap allocation" strategy. Swap is now only allocated if swapping really takes place (as had been the case under SunOS before).

"lru" on the other hand, means the following: The kernel knows "computational memory" (memory given to programs) and "file memory" (=cache). The "lrud" scans all these pages and - if necessary - tries to "steal" from one to give to the other. What exactly constitutes "necessity" in this case is parameterized by the said values of maxperm , minperm , maxclient , minclient and numperm . How the lrud does this is set by the "lru_file_repage" parameter. You might want to read this article by Jaqui Lynch for more info. Stealing from computational pages may well cause paging operations, so there is some connection between the two areas, but they are not the same at all.

See above: file memory is (simplistically put) otherwise unused memory used to cache I/O operations (aka "disk cache"), nothing else. I have not used the tool you use, so it is difficult for me to interpret its output, but what I told you is in accordance with IBM literature, so I suppose it is as correct as it can be.
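
The lrud decision described above can be sketched roughly like this in Python. It is a heavily simplified model and not the actual kernel logic: the function name, the return strings and the two repage-rate arguments are made up for illustration, and maxclient and strict limits are ignored.

```python
def lrud_steal_target(numperm: float, minperm: float, maxperm: float,
                      lru_file_repage: int,
                      file_repage_rate: float = 0.0,
                      comp_repage_rate: float = 0.0) -> str:
    """Simplified guess at which kind of pages lrud steals.

    numperm, minperm and maxperm are percentages as shown by vmstat -v.
    """
    if numperm > maxperm:
        # File cache is above its ceiling: take file pages only.
        return "file"
    if numperm < minperm:
        # File cache is below its floor: both kinds are candidates.
        return "file+computational"
    if lru_file_repage == 0:
        # Between the limits with lru_file_repage=0: favour
        # computational pages by stealing from the file cache only.
        return "file"
    # Between the limits with lru_file_repage=1: repage rates decide.
    if file_repage_rate > comp_repage_rate:
        return "file+computational"
    return "file"


# Values from the vmstat -v output in this thread, with the AIX 6.1
# default lru_file_repage=0 mentioned above.
print(lrud_steal_target(numperm=13.8, minperm=3.0, maxperm=90.0,
                        lru_file_repage=0))
```

With the values posted in this thread the model lands in the middle branch, so only file pages would be candidates for stealing.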

I hope this helps.

bakunin
