Hi
We have two identical T4-1s running Solaris 10 8/11, patched to 07/2012.
Both have 8G of swap allocated on the ZFS root pool; however, swap -s on one server shows 8G of swap available, while on the other it shows between 60G and 115G available.
Both servers have the same amount of memory, 128G. We are having some Oracle-related performance issues on the server showing 60-115G of available swap, and we are concerned that some type of corruption has occurred.
Thanks
Greg
8 GB of swap space for a server with 128 GB of RAM looks undersized. What makes you suspect that corruption has occurred?
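If you do decide to grow it, on a ZFS root the usual sequence is roughly the following (32G is only an example size, and the device must not be in use while you resize it):
# swap -d /dev/zvol/dsk/rpool/swap
# zfs set volsize=32G rpool/swap
# swap -a /dev/zvol/dsk/rpool/swap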
The issue only seems to have occurred in the last few days. Until that time both servers showed the same swap figures.
It looks as if someone has taken your point and allocated all available memory to swap. I'm just wondering:
a) whether that is possible
b) how it was done
If the system has not been altered, then why do two identical servers show such different swap figures, unless some corruption has occurred?
Please post the output of these commands on both servers:
swap -s
swap -l
vmstat 2 2
prstat -Z -n 1,10 1 1
echo ::memstat | mdb -k
df -n swap -h
Which of the two sets of swap figures were both servers showing before the event?
There are plenty of reasons why identical hardware could show different swap figures. In particular, swap -s on Solaris reports virtual swap, which counts unreserved physical memory as well as the disk swap devices, so the available figure rises and falls with free RAM.
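As to a) and b): yes, swap can be added to a live system without a reboot. A minimal sketch on a ZFS root (rpool/swap2 is just an illustrative name):
# zfs create -V 16g rpool/swap2
# swap -a /dev/zvol/dsk/rpool/swap2
Any device added that way would show up in the swap -l listing requested above, so the outputs should tell us.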
From the server showing the issue:
# swap -s
total: 10964072k bytes allocated + 185592k reserved = 11149664k used, 38735512k available
# swap -l
swapfile dev swaplo blocks free
/dev/zvol/dsk/rpool/swap 256,1 16 16777200 16777200
# vmstat 2 2
kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr s4 s5 s6 s7 in sy cs us sy id
0 0 0 81782944 96140160 21 82 0 0 0 0 3 -0 1 -0 1 1588 664 1387 0 0 100
0 0 0 38734832 53113984 1 11 0 0 0 0 0 0 0 0 0 1936 1629 2075 0 0 100
# prstat -Z -n 1,10 1 1
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
1822 oracle 6087M 6079M sleep 59 0 0:13:22 0.0% oracle/1
ZONEID NPROC SWAP RSS MEMORY TIME CPU ZONE
0 128 11G 11G 8.4% 0:34:03 0.0% global
Total: 128 processes, 3333 lwps, load averages: 0.04, 0.05, 0.04
# echo "::memstat" | mdb -k
Page Summary Pages MB %Tot
------------ ---------------- ---------------- ----
Kernel 319203 2493 2%
ZFS File Data 8056420 62940 49%
Anon 1372651 10723 8%
Exec and libs 26078 203 0%
Page cache 13682 106 0%
Free (cachelist) 9417 73 0%
Free (freelist) 6627062 51773 40%
Total 16424513 128316
Physical 16410049 128203
# df -h -n swap
Filesystem size used avail capacity Mounted on
swap 37G 40K 37G 1% /var/run
From the second server:
# swap -s
total: 3501648k bytes allocated + 283192k reserved = 3784840k used, 9495632k available
# swap -l
swapfile dev swaplo blocks free
/dev/zvol/dsk/rpool/swap 256,1 16 16777200 16777200
# vmstat 2 2
kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr s4 s5 s6 s7 in sy cs us sy id
0 0 0 16223304 24564312 17 77 0 0 0 0 0 -2 3 -4 3 2322 1816 2135 0 0 100
0 0 0 9494752 17840096 99 204 0 0 0 0 0 0 0 0 0 2356 2744 2699 0 0 100
# prstat -Z -n 1,10 1 1
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
9613 oracle 891M 876M sleep 59 0 0:50:30 0.2% oracle/1
ZONEID NPROC SWAP RSS MEMORY TIME CPU ZONE
0 148 3652M 3684M 2.8% 7:01:36 0.2% global
Total: 148 processes, 3433 lwps, load averages: 0.18, 0.18, 0.18
# echo "::memstat" | mdb -k
Page Summary Pages MB %Tot
------------ ---------------- ---------------- ----
Kernel 743484 5808 5%
ZFS File Data 12976767 101380 79%
Anon 440277 3439 3%
Exec and libs 22564 176 0%
Page cache 14489 113 0%
Free (cachelist) 14968 116 0%
Free (freelist) 2211964 17280 13%
Total 16424513 128316
Physical 16410051 128203
# df -hn swap
Filesystem size used avail capacity Mounted on
swap 9.1G 56K 9.1G 1% /var/run
Both servers were showing the second set of output prior to Sunday last week.
Hope this helps
At first sight, the main difference is the memory used by the most active process (oracle), which is around seven times higher on server #1 than on server #2 (~6G vs ~900M). The ZFS file data usage in the two memstat outputs also differs markedly:
ZFS File Data 8056420 62940 49% (server #1)
ZFS File Data 12976767 101380 79% (server #2)
Unless your DB is stored on a ZFS pool, you should limit the amount of RAM used by ZFS to something more reasonable. On our production servers with Oracle 10 installed on the local zpool, I use a 4 GB ZFS ARC limit, because Oracle's SGA can make better use of that memory than the filesystem cache can.
If you are storing your DB on a ZFS pool, this advice will not apply, as long as your DB's SGA is set up correctly.
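For reference, a minimal sketch of how such a cap is set in /etc/system, assuming the 4 GB figure mentioned above (the value is in bytes and only takes effect after a reboot):
* Cap the ZFS ARC at 4 GB (0x100000000 bytes)
set zfs:zfs_arc_max = 0x100000000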