High Page-Ins and Executable Page-Ins

Hi,

Currently I'm experiencing very high page-ins on my system running Solaris 10.

From vmstat, the page-in figure is very high; further drill-down shows the page-ins are from the file system, with occasional spikes in executable page-ins.

Details as follows:

oracle@perch:/files>> vmstat 5
 kthr      memory            page            disk          faults      cpu
 r b w   swap     free   re   mf     pi      po fr de sr m1 m1 m2 m2   in   sy   cs us sy id
 3 0 0 7268376 5110432 1041 2719 729590752414 753 753 0 0 6 0 6 0 7766 100603 12925 56 19 25
 5 0 0 11818608 9292016 451 918 13573290886586 3 3 0 0 2 0 1 0 10952 161811 14018 80 20 0
 2 0 0 11818432 9292960 385 792 13459633370179 2 2 0 0 1 0 2 0 10053 165300 13082 81 19 1
 5 0 0 11819720 9294472 322 623 12844506325972 0 0 0 0 1 0 1 0 11025 162384 14302 81 19 0
 3 0 0 11819400 9294664 468 1132 12864149119550 0 0 0 0 5 0 5 0 10555 163452 13193 78 21 1
 5 0 0 11818912 9295784 430 926 11488048016240 3 3 0 0 1 0 1 0 10935 158379 14086 80 20 0
 2 0 0 11819560 9296824 381 906 12037450017951 0 0 0 0 1 0 1 0 10671 165815 13422 81 19 0
 5 0 0 11819968 9298064 440 863 13209752927129 0 0 0 0 1 0 1 0 11219 158202 14557 80 20 0
 6 0 0 11819776 9297168 500 953 12490715053962 2 2 0 0 2 0 1 0 11777 157736 15521 80 20 0
oracle@perch:/files>> vmstat -p 5
     memory           page          executable      anonymous      filesystem
   swap  free  re  mf  fr  de  sr  epi  epo  epf  api  apo  apf  fpi  fpo  fpf
 7268296 5110360 1041 2719 753 0 0 11811407681 0 0  0    0    0 717270832025 753 753
 11820496 9292224 2055 2415 0 0 0    0    0    0    0    0    0 52087519922399 0 0
 11822032 9293648 2120 2163 5 0 0    0    0    0    0    0    0 55982485259240 5 5
 11824408 9296288 2414 2562 2 0 0    0    0    0    0    0    0 54414466418190 2 2
 11824328 9296576 2698 2719 0 0 0    0    0    0    0    0    0 52983811694559 0 0
 11822288 9293912 2332 1951 3 0 0    0    0    0    0    0    0 49271214261636 3 3
 11820448 9291312 2509 1115 0 0 0   18    0    0    0    0    0 48591187881736 0 0
 11817528 9294472 2480 871 2 0  0    0    0    0    0    0    0 54421309283685 2 2
 11820304 9304032 2429 860 8837 0 0 164187898623 0 0 0   0    0 47628172925439 8837 8837
 11814208 9413840 2067 860 29007 0 0 0    0    0    0    0    0 43272384005459 29009 29007
 11815744 9486424 2333 860 835 0 0   0    0    0    0    0    0 38585601084343 837 835

What area should we be looking into? Our swap space is being consumed steadily, shrinking by the day. Not sure if it's because of a runaway process.
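
In case it helps, this is roughly how I've been tracking the swap numbers with the stock swap(1M) command (the log path is just an example):

  # Summary of allocated/reserved/available swap
  swap -s

  # Per-device swap allocation
  swap -l

  # Append a timestamped sample so the day-to-day trend is visible
  echo "`date`: `swap -s`" >> /var/tmp/swap_trend.log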

Any opinion is appreciated.

Thanks in advance
ET.

So, for example, you have 13,209,752,927,129 page-ins in a 5-second period? That's a little more powerful than the systems I work with. Please tell us about your system. It's gotta be ccNUMA, but how many CPUs?
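
If you're not sure of the configuration offhand, stock Solaris will tell you (prtdiag's path varies by platform):

  # Processor count and details
  psrinfo -v

  # Number of physical processors
  psrinfo -p

  # Platform hardware summary
  /usr/platform/`uname -i`/sbin/prtdiag | head -30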

Well, since you have enough disks to supply some 2.6 quadrillion bytes a second, you can afford some more swap space. So just add a few TB more swap.

OS: Solaris 10

I'm trying to work out why the swap space is being consumed consistently and never released back to the system. At this rate, it's just a matter of time before swap runs out again. The bomb already went off just last week: it brought our Oracle instance down because /tmp ran out of space, forcing a restart of the instance, which released the memory.
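
For now I'm keeping an eye on tmpfs with something like this (nothing fancy, just stock df/du):

  # /tmp is tmpfs, so whatever lands there is carved out of the same pool as swap
  df -k /tmp

  # Biggest offenders under /tmp
  du -ak /tmp | sort -n | tail -20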

Are you using ZFS?

Actually I was trying to be facetious. Numbers like that make me think that vmstat must be broken. If I ignore the pi and assume that the rest of the numbers are valid, I don't really see any problem. You have swap and free physical memory. Page-outs are low and so is the scan rate. So pi is impossible and everything else looks good. I guess I would look for a vmstat patch though.

But if swap is disappearing, do a "df -k". /tmp uses swap for sure and I think maybe /var/run or something like that does as well. Could /tmp be eating your swap area? If it's a program, the size as reported by ps would be growing over time. Also I often see people over-allocate shared memory to Oracle. So when Oracle is running I do a "ipcs -mb" to look at that.
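
To make those checks concrete, a rough sketch (column numbers per Solaris 10's ps -efly layout; adjust the sort key if yours differs):

  # Is swap-backed tmpfs filling up?
  df -k /tmp /var/run

  # Shared memory segments and their sizes -- compare with Oracle's configured SGA
  ipcs -mb

  # Biggest processes by SZ (in KB thanks to -y); SZ is the 9th column here
  ps -efly | sort -n -k 9 | tail -20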

Yes, these pi and fpi numbers can't be anything but bogus. 20,000 terabytes per second is simply unrealistic.

Is your system up to date with patches?

Especially the one that fixes a Veritas bug:

http://www.unix.com/sun-solaris/76511-huge-pi-vmstat.html#post302224843
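
If VxFS is in the picture, the installed version and patch level are quick to check. A minimal sketch, assuming the stock Veritas package name:

  # VxFS package version (package name can differ by install)
  pkginfo -l VRTSvxfs | grep -i version

  # Version of the loaded vxfs kernel module
  modinfo | grep -i vxfs

  # Any VxFS-related patches applied
  showrev -p | grep -i vxfs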

Yeah, that's what I'm looking at currently too. The VxFS 4.1 bug.

@reborg, it's running on VxFS, which may explain the bogus high pi numbers.

Another concern I have is the disappearing swap space. I can see clearly that free swap is dropping by the day, but I'm unable to nail down the culprit. Any suggestions on how I should approach it?

Thanks for the views, all.

What are "df -k", "ps -efly" and "ipcs -mb" output ?

Output as follows. As for the ps -efly output, the listing is very long; is there a particular process, like Oracle's, to grep for?

You're looking for large processes. And you want to save the output. Run it each day, compare all of the outputs, and see if anything is growing over time. You probably have a classic memory leak.
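
A minimal sketch of that routine, assuming a writable /var/tmp (the paths are just examples; run it daily from cron or by hand):

  #!/bin/sh
  # Snapshot process sizes, shared memory and filesystem usage once a day
  DIR=/var/tmp/ps_snaps
  TODAY=`date +%Y%m%d`
  mkdir -p $DIR
  ps -efly > $DIR/ps.$TODAY
  ipcs -mb > $DIR/ipcs.$TODAY
  df -k    > $DIR/df.$TODAY

  # Diff against the most recent earlier snapshot, if there is one
  PREV=`ls $DIR/ps.* 2>/dev/null | grep -v $TODAY | tail -1`
  [ -n "$PREV" ] && diff $PREV $DIR/ps.$TODAY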