Strange memory behavior

Hello everyone,

I am seeing strange memory behavior on an AIX 7.1 system which I cannot explain.
The filesystem cache does not grow and often drops after a few minutes. I know that when a file is deleted, its segment in the FS cache is cleared as well, but I am not sure that this is the correct explanation for "my" behavior.

In one test I read a file (dd if=<some file> of=/dev/null), and after the dd finished, the FS cache (file pages) immediately dropped out of memory, but I could not reproduce this in later tests.
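For watching what the cache does over time, a simple loop like this does the job (just a sketch; the interval is arbitrary):

while true
do
    date
    vmstat -v | grep -E "numperm|numclient|file pages|client pages"
    sleep 60
done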

Does anyone have an idea?
Thanks

P.S.

  • tunables like vmo or schedo have not been changed (see the check below)
  • the system has a lot of NFS v3 exports
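To double-check that nothing deviates from the defaults, the current, default and reboot values can be listed like this (vmo, schedo and ioo all accept -L):

vmo -L
schedo -L
ioo -L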
# vmstat -v
              6553600 memory pages
              6274304 lruable pages
              3778457 free pages
                    2 memory pools
              1204754 pinned pages
                 90.0 maxpin percentage
                  3.0 minperm percentage
                 90.0 maxperm percentage
                  5.4 numperm percentage
               342616 file pages
                  0.0 compressed percentage
                    0 compressed pages
                  5.4 numclient percentage
                 90.0 maxclient percentage
               342558 client pages
                    0 remote pageouts scheduled
                34333 pending disk I/Os blocked with no pbuf
                    0 paging space I/Os blocked with no psbuf
                 2228 filesystem I/Os blocked with no fsbuf
                73667 client filesystem I/Os blocked with no fsbuf
               284352 external pager filesystem I/Os blocked with no fsbuf
                 37.1 percentage of memory used for computational pages
# lparstat -i
Type                                       : Shared-SMT-4
Mode                                       : Uncapped
Entitled Capacity                          : 2.00
Partition Group-ID                         : 32820
Shared Pool ID                             : 3
Online Virtual CPUs                        : 4
Maximum Virtual CPUs                       : 8
Minimum Virtual CPUs                       : 1
Online Memory                              : 25600 MB
Maximum Memory                             : 61440 MB
Minimum Memory                             : 2048 MB
Variable Capacity Weight                   : 4
Minimum Capacity                           : 0.10
Maximum Capacity                           : 4.00
Capacity Increment                         : 0.01
Maximum Physical CPUs in system            : 64
Active Physical CPUs in system             : 32
Active CPUs in Pool                        : 12
Shared Physical CPUs in system             : 12
Maximum Capacity of Pool                   : 3200
Entitled Capacity of Pool                  : 420
Unallocated Capacity                       : 0.00
Physical CPU Percentage                    : 50.00%
Unallocated Weight                         : 0
Memory Mode                                : Dedicated
Total I/O Memory Entitlement               : -
Variable Memory Capacity Weight            : -
Memory Pool ID                             : -
Physical Memory in the Pool                : -
Hypervisor Page Size                       : -
Unallocated Variable Memory Capacity Weight: -
Unallocated I/O Memory entitlement         : -
Memory Group ID of LPAR                    : -
Desired Virtual CPUs                       : 4
Desired Memory                             : 25600 MB
Desired Variable Capacity Weight           : 4
Desired Capacity                           : 2.00
Target Memory Expansion Factor             : -
Target Memory Expansion Size               : -
Power Saving Mode                          : Disabled

Tuning the AIX file caches - Wikistix

Maybe it ignores pages from closed files, which might support a reopened file.

Very interesting. Alas, I have no immediate answer, only some observations:

                34333 pending disk I/Os blocked with no pbuf
                    0 paging space I/Os blocked with no psbuf
                 2228 filesystem I/Os blocked with no fsbuf
                73667 client filesystem I/Os blocked with no fsbuf
               284352 external pager filesystem I/Os blocked with no fsbuf

These numbers look relatively high. If they remain constant, the problem was probably somewhere in the past, as the counters have been accumulating since reboot. You might want to watch them closely, though: if you notice a sharp increase, chances are your system is I/O-bound somehow.
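If you want a quick answer to whether they are still climbing, compare two snapshots over a fixed interval, roughly like this (file names and interval are arbitrary):

vmstat -v | grep -i blocked > /tmp/blocked.1
sleep 600
vmstat -v | grep -i blocked > /tmp/blocked.2
diff /tmp/blocked.1 /tmp/blocked.2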

              1204754 pinned pages

This is roughly 4.6 GB of pinned memory. Do you have a database running on the system? The Oracle SGA, for instance, is mostly pinned memory. "Pinned" means "not to be swapped out in case swapping is necessary".
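You can cross-check that figure with svmon, which reports a "pin" line (in 4 KB frames, if I remember correctly):

# svmon -G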

Thank you for your reply.

@DGPickett

Maybe it ignores pages from closed files, which might support a reopened file.     

Is there a way to check this? Something like filemon?
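A trace run along these lines might at least show which files and segments see the I/O (the output file name is arbitrary; filemon writes its report when trcstop is issued), though I am not sure it would show anything about reopened files:

filemon -o /tmp/filemon.out -O all
sleep 60
trcstop
more /tmp/filemon.out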

@bakunin
I have already begun to tune the system, but our storage isn't the fastest. :wink:

ioo -p -o j2_dynamicBufferPreallocation=256
ioo -p -o numfsbufs=4096
lvmo -v <VG> -o pv_pbuf_count=2048
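The new values can be checked afterwards with something like:

ioo -a | grep -E "j2_dynamicBufferPreallocation|numfsbufs"
lvmo -v <VG> -a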
              1204754 pinned pages

There is no database or anything like that, but AFAIK with AIX 7.1 the kernel is pinned.

According to the AIX 7.1 Differences Guide the memory is not pinned but locked (see "5.9 Kernel memory pinning", p. 199). It might be that this "not-pinned-but-locked" kernel memory is counted as pinned for the purposes of vmstat, so you are probably right.
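If memory serves, the tunable governing this behaviour is vmm_klock_mode; it is a restricted tunable, so vmo has to be told to display those as well, something like:

vmo -a -F | grep vmm_klock_mode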

I hope this helps.

bakunin

Used to call it wired. In MULTICS, temp files could be created with no backing store, so they were de facto wired. You could bring a system to its knees with too much of that, but it was great for stress testing the apps.

Closed files with no bufs sounds like pipes and sockets in a would-block state, possibly with a blocked thread but perhaps just not firing bits into select() or poll(). If they are accumulating, there may be something undermining keepalive for detecting broken connections, or some privileged app leaving sockets open. lsof can tell you about all open file descriptors.
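lsof is not part of base AIX but is available from the usual toolbox repositories; the first of these lists network endpoints without name lookups, the second gives a rough count of open file descriptors system-wide:

lsof -n -P -i
lsof -n | wc -l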

UPDATE

I found the following article:

https://www.ibm.com/developerworks/wikis/display/WikiPtype/AIXJ2inode

Ralf Schmidt-Dannert comments that the defaults for the j2_inodeCacheSize and j2_metadataCacheSize parameters in AIX V7.1 have been changed to 200. An implication is that setting both parameters to 200 is likely to prove satisfactory on most AIX V5.3 and V6.1 LPARs.

I changed them back to 400 (the old default) and after that increased them to 500.
Now the whole memory is being used. I am not sure what exactly is going on.
This system has a lot of filesystems with a lot of small files (COBOL sources, etc.).
Maybe the insufficient inode cache prevented the system from using the whole FS cache?
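For reference, checking and setting them looks roughly like this (on recent levels these are restricted tunables, so -F may be needed to display them and ioo will warn before changing them):

ioo -a -F | grep -E "j2_inodeCacheSize|j2_metadataCacheSize"
ioo -p -o j2_inodeCacheSize=400 -o j2_metadataCacheSize=400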

Thanks for posting the update.

I noticed in another project, under different circumstances, that how fast file metadata (basically the contents of inodes) can be acquired dramatically affects the speed of file operations:

A network relied heavily on NFS shares and was redesigned to operate from one huge GPFS data pool (~500 TB). The first noticeable thing was that KDE had to be removed from all the clients, because the damn thing tries to create a hidden file DB at startup. This is fine when you have a local disk with 20k files, but not when you are looking at several million of them. (It might be possible to tweak KDE somehow to stop that, but nobody bothered to do so. Desktops are a waste of resources anyway.)

The second observable phenomenon was that backup/restore times could be dramatically improved by moving the metadata onto an SSD. It did not even have to be big: 200-300 GB sufficed.

Now this fits in well with what you say about cache sizes and metadata caching. Probably AIX file I/O can be improved by tweaking the resources set aside specifically for dealing with file metadata.

Thanks again for sharing.

bakunin

I hear that some file systems never run out of inodes. :smiley:

Too much RAM set aside can make the VM side thrash noticeably. Some systems keep their metadata cache in swappable VM or mmap()'d file space so the less-reused parts can be rolled out dynamically. Can AIX be set up for VM-based caching? With most of the RAM in one pool, all the users can share it.