AIX Memory and Performance Questions

Hi All,
I have 5 servers (3 DataStage servers and 2 database servers running HACMP and DPF).
One of the main core DataStage servers has a few unresolved issues, which I'll describe below:

1. Why does the file cache in memory seem to be constantly at 100% most of the time? And why does this not happen on DataStage servers 2 and 3? Also, is the file cache part of memory, or is it separate from memory?
2. We are experiencing a long-running performance issue: some ETL jobs take progressively longer to run compared to how they run right after a server restart. That's the reason we need to restart the servers twice a month, which does not seem practical for an enterprise machine. If anyone has had this experience, I'd appreciate hearing what the resolution was.
3. Also, what is the rule-of-thumb memory allocation for a bare-minimum server running 8 CPU cores, and for OLTP and DW workloads? 1 CPU core : 4 GB RAM? 1:6? 1:8?

Thanks in advance.



Hope someone can share some insight on this.

It is memory which the system has borrowed to store file contents temporarily -- if you ask for the same file twice, reading it the second time is as fast as reading from memory, because that's exactly what happens. It's just as good as free memory; the system will give it up whenever it's needed.
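For reference, a quick way to see that split on AIX (just a sketch using stock commands; exact output varies by AIX level):

    vmstat -v | grep -i perm    # "numperm percentage" = share of real
                                # memory currently holding file pages
    svmon -G                    # "in use" is split into work (computational)
                                # and pers/clnt (file cache) pages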

There are at least two reasons why cache would be so high -- either this server is fairly idle, letting all memory gradually trickle into cache, or there are high levels of disk activity which push free memory into cache much faster.
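If it helps, a quick way to tell which of the two it is (illustrative):

    iostat 5 6    # sustained %tm_act / Kb_read on the busy volumes points
                  # at heavy file I/O feeding the cache; near-zero disk
                  # activity points at a mostly idle box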

They are experiencing different loads. Maybe they have less disk activity, or maybe they're hosting RAM-hungry programs that leave very little room for cache (which is a bad thing, by the way -- cache is good for you).

Cache is part of memory. The system has borrowed it to store file contents, but gives it back just as easily.

AIX file cache is in memory, and it stores portions of the file system for faster access. When you report a performance issue, it's best to have specific data to support what you are saying. In this case you say that the file cache in memory seems constantly at 100%. What commands are you running to get this value?
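For example, these are the usual places people read that number from (assuming a stock AIX toolset):

    topas         # "% Noncomp" in the MEMORY panel is the file cache
    vmstat -v     # the "numperm percentage" line
    nmon          # press 'm' for the memory view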

File system caches can be tuned; however, you should look for simple explanations and the underlying root cause before tuning.
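If tuning does turn out to be warranted, the knobs live in vmo. A sketch only -- the values below are the classic AIX 5.3-era settings, and on AIX 6.1+ most of these are restricted tunables already at sensible defaults:

    # show the current VMM page-replacement tunables
    vmo -a | egrep 'minperm|maxperm|maxclient|lru_file_repage'

    # example only: favor computational pages over file cache
    vmo -p -o minperm%=3 -o maxperm%=90 -o maxclient%=90 -o lru_file_repage=0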

In addition, you mention a number of machine types (DataStage servers, database servers), and you also mention HACMP and DPF technologies. Database servers may have tuning parameters as well, and HACMP has its own tuning considerations. While your environment may be complex, the logic used to solve performance problems should be kept simple and specific.

Take, for example, "ETL jobs getting longer to run compared to server initial restart". This seems to be where you should start in order to understand your issue. You need to take statistics from these runs.

What resources are consumed (CPU, memory) by these runs when they are done right after a server restart? How does this compare to when you think you need to reboot the system? You should take a look at the system as a whole prior to rebooting it, to see why these runs take so long.
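A simple way to do that comparison (a sketch, not a prescription): capture the same snapshots right after a restart and again just before your usual reboot window, then diff them.

    vmstat -s > vmstat_s.$(date +%Y%m%d)   # cumulative VMM counters
    vmstat 5 12                            # live CPU/memory/paging sample
    lsps -a                                # paging space utilisation
    svmon -P -t 10                         # top 10 memory consumers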

Don't worry about "general rules of thumb"...keep this simple and gather the necessary data.

For comparison, I would also want to see the output of vmstat -s from the systems you are comparing.

It is too simplistic to assume that your memory is full because of file caching, unless you have statistics to show that memory is actually being used for caching (vmstat -s can also say something here).
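The lines to watch in that output are the paging-space counters (the numbers below are made up, for illustration):

    vmstat -s | grep 'paging space'
    #   123456 paging space page ins
    #    78901 paging space page outs

Non-trivial, growing numbers here mean real memory pressure, not just file caching.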

Let's assume that you have had some paging to/from paging space over the course of time. Your responses may be taking longer because data that should be in (computational) memory is sitting in paging space (which is where computational pages go when AIX runs out of real memory).
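A quick check for that (illustrative):

    lsps -a     # %Used per paging space; well above single digits after
                # weeks of uptime suggests computational pages were pushed out
    svmon -G    # the "pg space" line shows pages resident in paging
                # space right now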

Frequently databases do not benefit from file caching; some do, but not all. Rather than have needless page-stealing activity, there are different mount options you could choose (e.g., cio, rbr, rbw, rbrw -- the rb* options are all "release-behind" variants, one behavior type).
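For illustration, the options go on the mount; the logical volume and mount point names below are made up:

    # concurrent I/O for JFS2 database filesystems (bypasses the file cache)
    mount -o cio /dev/db2datalv /db2/data

    # release-behind for sequential-only filesystems: pages are freed as
    # soon as they're read/written, so they never pile up in the cache
    mount -o rbrw /dev/stagelv /etl/staging

(You can make these permanent via the options line in /etc/filesystems.)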

Hope this helps!