Monitoring Paging and Swapping

javanoob · October 15, 2018, 3:10pm

Hi all,

This might sound silly, but I am trying to determine whether I have sufficient memory or not.
My definition of sufficient memory = no swapping + no paging to physical swap file.

I know I can use vmstat to monitor swapping and paging, including the sr column.

But wouldn't it be more direct if I just used

swap -l

and make sure my "free" = "blocks"

I am sorry if I am over-simplifying things...

P.S. I am on Solaris 11.3 SRU 32.4

Regards
Noob

hicksd8 · October 15, 2018, 3:32pm

Take a look at this thread paying particular attention to jlliagre's post.

MadeInGermany · October 16, 2018, 1:54am

Swapping unused stuff is okay. So swap -l is no good.
Frequent/continued swapping is not good. So vmstat measurement with a large interval makes sense.
--
The ZFS ARC cache has been found to be too aggressive; it should be limited in /etc/system.
See this article

bakunin · October 16, 2018, 4:40am

Notice that, depending on which OS and which set of tuning parameters you use, there are two possible swap strategies: early swap allocation and late swap allocation. Late swap allocation means that swap is only used when the physical memory runs out. Early swap allocation means that as soon as a program is started as much swap is allocated as it might use once it is indeed swapped out. In such a case you would see paging activity and swap allocation immediately even if there is no swap really being used yet.

Late swap allocation is used predominantly these days but, for instance, in AIX prior to version 5.1 early swap allocation was the default. One regularly saw swap usage of 70%-80% even if the system had sufficient RAM installed. Only using vmstat would then tell you if you are in trouble or not.

For long-term monitoring you can use vmstat with a long interval, but you can also configure sar and tailor it to your needs (you may want to get some other usage statistics from it too). See the man page of sar for details.

I hope this helps.

bakunin

jlliagre · October 16, 2018, 5:55am

Solaris never swaps (i.e. swaps out a whole process's memory) unless there is a severe shortage of RAM.

I agree that significant paging can strongly degrade performance, but paging out pages that are no longer used (e.g. from unused files written in /tmp or any tmpfs-based file system) improves performance compared to keeping them in RAM.

The fact that some of the swap area is used is not necessarily a symptom of RAM shortage.

javanoob · October 16, 2018, 9:02am

Hi all,

Thank you for all the feedback.
Please pardon me for my ignorance.

1) I am seeing recommendations to run vmstat with a large interval - why do we need to use a large interval? How is that different from doing multiple counts with short intervals?
e.g. vmstat 10 2 vs vmstat 1 10?

2) When there is a physical memory shortage and memory needs to be paged out, will the memory that is paged out reside on the physical swap volume?
If so, why isn't swap -l an accurate way of telling whether there is a memory shortage?

Is it because paging unused memory out is sometimes healthy and necessary -> so we cannot conclude that memory is insufficient just based on physical swap space being used - is my understanding correct?

3) However, can we also say directly that if my swap -l always displays 'free' = 'blocks', it definitely means I have sufficient memory, because nothing is paged or swapped to physical swap - right?

Regards,
Noob

jlliagre · October 16, 2018, 9:57am

Some statistics are absolute; in that case a short interval will reveal transient events that would otherwise be missed. Other statistics are counters, and vmstat will show their average rate of change. In that case, whatever the interval, the average will be correct, but of course you would still miss variations in that rate with a long sampling interval.
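The point about counters and sampling intervals can be illustrated with a small sketch. It is not tied to any real vmstat output: the per-second increments below are made-up numbers, and rates() is a hypothetical helper that mimics how a rate is derived from a cumulative counter over one sampling interval.

```python
# Hypothetical per-second increments of a cumulative kernel counter
# (for example pages scanned); the values are made up for illustration.
per_second = [0, 0, 0, 0, 500, 0, 0, 0, 0, 0]


def rates(increments, interval):
    """Average per-second rate reported for each sampling interval."""
    return [
        sum(increments[i:i + interval]) / interval
        for i in range(0, len(increments), interval)
    ]


short = rates(per_second, 1)    # comparable to sampling every second
long_ = rates(per_second, 10)   # comparable to one 10-second sample

print("1 s samples :", short)
print("10 s sample :", long_)
# The overall average is the same either way ...
print("mean of 1 s samples:", sum(short) / len(short))
# ... but only the short interval exposes the burst.
print("peak at 1 s:", max(short), "peak at 10 s:", max(long_))
```

Both views agree on the average rate; the long interval simply smooths the one-second burst into a low steady value.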

This is true for anonymous pages, but pages that are backed by files would be paged out elsewhere, or just dropped if unchanged.

There is no clear definition of what a RAM shortage is. Be careful not to confuse RAM and (virtual) memory. You can have a memory shortage with the swap area untouched and plenty of RAM reported to be free. Conversely, your RAM might be undersized even while the swap area is untouched.

It is never strictly necessary, but it is a good idea for the system to free resources that would otherwise be wasted.

Yes, although as I previously wrote, performance might be better with more RAM in that case.

bakunin · October 16, 2018, 10:16am

vmstat 10 2 will monitor an interval of 20 seconds, whereas vmstat 1 10 will only cover 10 seconds. Neither is a sensible invocation for long-term monitoring. Try vmstat 600 (every 10 minutes) or something like that. The reason you want a long interval is that you are not interested in any "spike" but in the average usage of the system. Further, every polling of the stats also places a (slight but non-zero) load on the system. You don't want your monitoring system to create the bottleneck it is intended to prevent.

Yes, correct

You don't want to avoid swap usage; you want to avoid swapping activity. Something being in swap doesn't hurt - putting it there and getting it back into memory again is what hurts. Therefore a number which only tells you how much swap is used doesn't tell you what you really want to know.

Sometimes this is the case - it might be better overall if the damage you take by swapping something out is outweighed by the gain you may get from the memory freed that way which can be used otherwise. In general (but notice: "in general" doesn't mean "always") it makes sense to size a system in a way that swapping doesn't take place in normal operation. Swap is generally a "plan B", not the "plan A".

Yes. But in a professional environment it makes sense to size a system sufficiently. Memory does not come free of charge and it pays (or rather - saves) to give a system as much as it needs - but not more. It is the job of a systems administrator to walk that fine line between oversized (=wasting money) and undersized (=hurting the purpose).

You might want to read this little introduction to performance tuning and monitoring for more details about swapping.

I hope that helps.

bakunin

javanoob · October 16, 2018, 2:20pm

Hi both (Jilliagre and Bakunin) and all,

Thank you so much for your advice and guidance. I can't see where I could possibly have understood more if not here.
Thanks for the link to the introduction to performance tuning and monitoring as well.

It seems like understanding the OS kernel and its workings is slowly becoming a lost trade; people no longer care how it works, people just want things to work..

Thanks for staying around to keep the forum alive, we need you guys around..

=====================================================================

On a side note about the ZFS ARC, I do realize it is sucking up RAM and that this is not reflected as freemem in either vmstat or sar.

As mentioned by MadeInGermany, I do set a reserve hint -> this kind of sets an upper limit on how much the ZFS ARC can grow.

But how do I know whether the amount of memory available and allocated to the ZFS ARC is sufficient or not?

Regards,
Noob

jlliagre · October 16, 2018, 4:49pm

Check the cache hit ratio to see how efficient it is. The closer to 100%, the better.

If you can fit all the files you use in RAM (hit rate 100%) and still have enough room for the RAM usage of the applications and other consumers, go for it and set the ZFS ARC size so it can host everything.

In the opposite and more likely situation, it's a trade-off: you should measure usage, run tests that last a few days or more, see how your RAM is used, see how performance is affected by giving more memory to the cache or more memory to the applications, and act accordingly.

javanoob · October 17, 2018, 1:31pm

Hi Jilliagre,

Sorry for asking.. where/how do I see the cache hit ratio for the ZFS ARC?

Regards,
Noob

jlliagre · October 17, 2018, 7:31pm

That's a good question. You can compute it from kstat values, e.g.:

kstat -p "::arcstats:demand_*data*" 10 10

The hit ratio is equal to 100.*xxx_hits/(xxx_hits+xxx_misses) with xxx being demand_data or demand_metadata
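As a minimal sketch of that formula, the snippet below parses text in the `module:instance:name:statistic<TAB>value` layout that kstat -p prints and computes the ratio. The sample values are invented, and parse_kstat(), hit_ratio() and interval_hit_ratio() are hypothetical helper names, not part of any Solaris tool. Since these counters are cumulative, the second helper works on the difference between two samples, assuming that is the figure of interest for a given period.

```python
def parse_kstat(text):
    """Parse `kstat -p` style lines: 'module:instance:name:stat<TAB>value'."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank lines between samples
        key, value = parts
        stats[key.rsplit(":", 1)[-1]] = int(value)
    return stats


def hit_ratio(stats, prefix):
    """100 * hits / (hits + misses) for 'demand_data' or 'demand_metadata'."""
    hits = stats[prefix + "_hits"]
    misses = stats[prefix + "_misses"]
    total = hits + misses
    return 100.0 * hits / total if total else 0.0


def interval_hit_ratio(before, after, prefix):
    """Hit ratio over the period between two cumulative samples."""
    delta = {
        name: after[name] - before[name]
        for name in (prefix + "_hits", prefix + "_misses")
    }
    return hit_ratio(delta, prefix)


# Invented sample values, laid out like kstat -p output.
sample = """\
zfs:0:arcstats:demand_data_hits\t9500
zfs:0:arcstats:demand_data_misses\t500
zfs:0:arcstats:demand_metadata_hits\t19800
zfs:0:arcstats:demand_metadata_misses\t200
"""

stats = parse_kstat(sample)
for prefix in ("demand_data", "demand_metadata"):
    print(prefix, "hit ratio:", hit_ratio(stats, prefix))
```

On a live system the text would come from the kstat command shown above rather than from a hard-coded string.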

javanoob · December 13, 2018, 1:45am

Hi all,

I am sorry to bring up this thread again.
Recently, I have had some heavy swap usage against the physical swap space despite having free physical RAM.

I read this old thread, "Out of swap but RAM available", and my understanding is that memory reservation will eat up swap space if there isn't enough virtual swap for reservation?

But then, reading this directly from the Solaris note (Doc ID 1010585.1) on Oracle Metalink, it seems to indicate that a memory reservation against physical memory or swap doesn't actually take up physical space:

When a process calls malloc()/sbrk() only virtual swap is reserved.
Reservation is done against the physical disk swap first.
If that is exhausted or not configured then reservation is done against physical memory. If both are exhausted then malloc() fails

As discussed, when measuring virtual swap available for reservation, consider monitoring vmstat ("swap" column) or swap -s (the "available" value).
Free memory as reported by vmstat ("free" column) or swap usage as reported by swap -l ("free" column) is unrelated with virtual swap available for reservation.

When allocating swap reservation from memory, there is no memory deducted and vmstat continues to show same amount of "free" memory.
Similarly, when a swap reservation is made from physical disk-based swap, swap -l will continue to show the same amount of "free" swap.

Free memory is only deducted due to a page fault and free swap is deducted during a memory shortage, when data needs to be migrated from the physical memory to physical disk swap to maintain a sufficient supply of free memory.

So does memory reservation actually take up swap space or not ?

Regards,
Noob

jlliagre · December 13, 2018, 5:27am

Both the linked thread and MOS Document are correct.

vmstat is reporting that there is unused RAM and it is true. There is no data stored on that RAM.

malloc fails because there is no more swap or RAM available. All of it is either used or unused but reserved.

Let me try a metaphor: if you enter a restaurant and see many empty tables, that doesn't mean you can sit on any of them. They might all be reserved by customers that aren't there yet (and maybe some will never show up).
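The accounting described in the quoted MOS note (reserve against disk swap first, then against RAM, fail when both are exhausted, while "free" figures stay unchanged) can be sketched as a toy model. This is only an illustration under simplifying assumptions: SwapModel and its sizes are hypothetical, and a real kernel keeps part of RAM out of the reservable pool.

```python
class SwapModel:
    """Toy accounting of virtual swap reservation (sizes in GB, hypothetical)."""

    def __init__(self, ram_gb, disk_swap_gb):
        # Space still available for new reservations.
        self.disk_unreserved = disk_swap_gb
        self.ram_unreserved = ram_gb
        # What vmstat "free" and swap -l "free" would keep showing.
        self.free_ram = ram_gb
        self.free_disk_swap = disk_swap_gb

    @property
    def available(self):
        """Virtual swap left for reservation (disk part plus RAM part)."""
        return self.disk_unreserved + self.ram_unreserved

    def reserve(self, size_gb):
        """Reserve against disk swap first, then RAM; fail if both run out."""
        if size_gb > self.available:
            return False  # this is where malloc() would fail
        from_disk = min(size_gb, self.disk_unreserved)
        self.disk_unreserved -= from_disk
        self.ram_unreserved -= size_gb - from_disk
        return True  # free_ram and free_disk_swap are deliberately untouched


model = SwapModel(ram_gb=8, disk_swap_gb=4)
print("reserve 10 GB:", model.reserve(10))
print("available for reservation:", model.available)
print("free RAM:", model.free_ram, "free disk swap:", model.free_disk_swap)
print("reserve 3 GB more:", model.reserve(3))
```

In restaurant terms, the tables still look empty (free RAM and free swap are unchanged) even though almost none can be booked any more.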

javanoob · December 13, 2018, 12:36pm

Hi Jilliagre,

Thanks for your reply; I appreciate the fitting metaphor.

I think the points I am trying to confirm are:

a) When a memory reservation is made on virtual swap, the "physical space" that is reserved on the RAM/swap device is not physically allocated, right? (It still shows as free in vmstat and swap -l.)

b) Can reserved but unused memory/swap (aka virtual swap) still be used for actual physical swapping? (Or is it reserved and no longer available for other usage?)

c) If reserved swap isn't reflected as used space in swap -l, does this mean I really am having actual swapping if I see 2GB of physical swap space being used in swap -l?
But then again, I still have free memory - when running echo ::memstat | mdb -k, will "reserved" but unused RAM show up in "Free"?

Regards,
Noob

MadeInGermany · December 13, 2018, 4:05pm

a) yes, reserved is only reserved. A reservation only has an impact on further reservations. The OS can overcommit reservations, i.e. allow more than it could ever use.
b) the actual usage of swap is not limited by reservations.
c) yes, if "swap -l" reports 2GB of swap in use then 2GB of data was moved from RAM to disk. Using that again would require a copy to RAM first. The OS uses an LRU scheme, i.e. it picks the Least Recently Used RAM for "archiving" in swap (on disk).
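The used amount is not printed directly by swap -l; it is the difference between the "blocks" and "free" columns, which are counted in 512-byte blocks. A minimal sketch of that arithmetic follows, using an invented sample line (device name and numbers are made up) and a hypothetical helper swap_used_bytes().

```python
# Invented sample in the column layout of `swap -l`; values are made up.
SWAP_L_SAMPLE = """\
swapfile                  dev    swaplo   blocks     free
/dev/zvol/dsk/rpool/swap 285,2        8  8388600  4194304
"""

BLOCK_SIZE = 512  # swap -l counts in 512-byte blocks


def swap_used_bytes(swap_l_output):
    """Sum (blocks - free) over all swap areas and convert to bytes."""
    lines = swap_l_output.strip().splitlines()
    header = lines[0].split()
    blocks_col = header.index("blocks")
    free_col = header.index("free")
    used_blocks = 0
    for line in lines[1:]:
        fields = line.split()
        used_blocks += int(fields[blocks_col]) - int(fields[free_col])
    return used_blocks * BLOCK_SIZE


used = swap_used_bytes(SWAP_L_SAMPLE)
print("swap in use: %.2f GB" % (used / 1024 ** 3))
```

A result of zero corresponds to the "free" = "blocks" situation discussed earlier in the thread; a non-zero result only says data is stored there, not that paging is happening right now.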

jlliagre · December 13, 2018, 4:54pm

Reservation is a "logical" operation, there is no such thing as a "physical allocation". The only physical operations with memory are read and write.

The physical space that is reserved (thus unused) on RAM or swap area shows as free RAM and free swap.

Hmm, yes. Unused memory can be used (hopefully), but then it is no longer reported as reserved but as used. (That sounds logical, doesn't it?)

If you see 2GB used, that means there are 2GB of data stored there. There has been some swapping (actually paging) for these 2GB to be written.

It might indeed. memstat is showing physical memory usage, it has no idea about logical, virtual one.

--- Post updated at 22:54 ---

Overcommitting reservations is not the case with Solaris, where malloc (or similar) will fail if no swap or RAM is available to back it.
Unlike with Linux, memory overcommitting can only happen on Solaris when explicitly requested by the (rare) applications that use mmap with MAP_NORESERVE.

javanoob · December 14, 2018, 9:03am

Hi Jlliagre , MadeInGermany,

Thank you both for your reply and insight.

q1) Do you mean that if Process A has made a reservation on the virtual swap (RAM & disk), another Process B can actually use this reserved space for actual physical paging into RAM and out to swap/disk?

q2) If I still have free memory (around 10G reported in memstat's Free) but have physical swap being used (around 3G in swap -l), what does this mean then?
Since reserved but free memory can be used by all, why is physical swap still being used?

Running vmstat for 30 minutes, I did not see any counters jump for si/so, pi/po, and sr (all 0s).
Could I have had some kind of spike at a certain time (using up all the free RAM) that caused the physical swap to be used? Could this be one possibility?

Regards,
Noob

jlliagre · December 14, 2018, 9:28am

There's an extra "i", my pseudo is jlliagre, not jilliagre...

Unless shared memory comes into play, no it can't.
Virtual space allocated by a process is for its exclusive use. Note again that virtual memory is not the same as physical memory. No particular space is reserved, just an amount of space. The OS is free to map this virtual space to whatever physical backend it likes.

That means the OS stored 3G of seldom-used virtual memory on disk, improving overall performance.

The si/so counters are unlikely to move under normal circumstances, only during a severe RAM shortage.
The pi/po counters move when paging occurs. Most of that paging might be unrelated to a VM shortage; it might simply be a file being read by some process.
The sr counter only moves if there is a RAM shortage.

A static 3G doesn't cause any activity by itself.

The most important counter to monitor is "sr". As long as it stays equal to 0, there is no RAM shortage.
You might have a VM shortage though, and the vmstat column to monitor is then "swap" (available swap space).

Note also that your system might use zones configured with virtual memory capping. In such a case you might have swap usage without a (global) VM shortage.
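A minimal sketch of watching those two columns: the snippet below looks up "swap" and "sr" by name in the vmstat header and flags samples where the scan rate is non-zero. The sample text is invented and only imitates the Solaris column layout; scan_samples() is a hypothetical helper, and on a live system the first data line is a since-boot summary that one would probably want to skip.

```python
# Invented sample imitating Solaris vmstat output; all values are made up.
VMSTAT_SAMPLE = """\
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr s0 s1 s2 s3   in   sy   cs us sy id
 0 0 0 9834520 10485760 1 41 0 0 0 0 0 0 4 0 0 348 1112 530 4 2 94
 0 0 0 9834100 10480000 0 12 0 0 0 0 37 0 1 0 0 402 980 512 3 1 96
"""


def scan_samples(vmstat_output):
    """Return (swap_kb, sr) for every data line, locating columns by name."""
    swap_col = sr_col = None
    samples = []
    for line in vmstat_output.splitlines():
        fields = line.split()
        if "sr" in fields and "swap" in fields:
            # Header line (it is repeated periodically in long runs).
            swap_col = fields.index("swap")
            sr_col = fields.index("sr")
            continue
        if sr_col is None or len(fields) <= sr_col or not fields[0].isdigit():
            continue  # group header ("kthr memory page ...") or noise
        samples.append((int(fields[swap_col]), int(fields[sr_col])))
    return samples


samples = scan_samples(VMSTAT_SAMPLE)
for swap_kb, sr in samples:
    status = "RAM pressure (page scanner active)" if sr > 0 else "ok"
    print("swap available: %d KB, sr: %d -> %s" % (swap_kb, sr, status))
```

Only "swap" and "sr" are looked up by name, because "sy" appears twice in the header and cannot be resolved that way.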

javanoob · December 14, 2018, 12:32pm

Hi Jlliagre,

I am so sorry for that extra "i". Eyes playing tricks on me.. and thank you for your reply.

Sorry if I am getting confused again..

Virtual swap = RAM + disk (shown in swap -s).

I understand that a memory reservation on virtual swap does not actually consume physical space.
From the previous thread, I understood that reserved virtual swap can be used for actual swapping/paging.

But you mentioned that virtual space allocated by a process is for its exclusive use.. --> is reserved = allocated?
What I am actually asking is whether, when virtual swap is reserved by Process A, the underlying unused physical RAM/disk swap can still be used by another Process B?

E.g.

T1) Process A reserves 5G of virtual swap using malloc ((i) 3G from swap disk and (ii) 2G from RAM) -> again, at this point nothing is physically allocated or used (RAM and swap still show as free in vmstat and swap -l).
T2) Existing Process B needs to read data into physical RAM (page fault) -> it is actually able to put this data into the (ii) 2G of physical RAM reserved by Process A, since the reservation is only made on virtual swap.
T3) The OS also sees the need to page some data out of physical RAM -> it is also able to page this data out onto the (i) 3G of disk swap, right?
However, if physical RAM is already used by/allocated to Process A, no other process can use it.

Is my understanding correct?

Regards,
Noob