Oracle memory usage on Solaris box

I am working on a 2-node Oracle RAC 10.2.0.4 on Solaris 10, running on T2000 kit.

The box has around 32G of memory, of which 24G is used by the oracle user; there is 3G of free memory on the box.
sga_max_size is set to 5G, and checking v$pgastat I see that the maximum PGA memory allocated was 6.5G. So the Oracle database is using around 12G of memory. There are no other processes running on the box except for the clusterware processes and the database.

NPROC USERNAME  SWAP   RSS MEMORY      TIME  CPU
  2958 oracle     30G   24G    75% 187:45:36 1.9%
    69 root      304M  420M   1.3%  19:54:52 0.1%
     1 daemon   4648K 8688K   0.0%   3:30:49 0.2%
     1 smmsp    1352K 8472K   0.0%   0:00:04 0.0%  

There are around 2500 connections to this instance.
In the ps -ef output I also see around 300 each of the following processes:

/usr/bin/ssh -o FallBackToRsh=no -o PasswordAuthentication=no -o StrictHostKeyC
/oracle/product/cluster/jdk/jre/bin/sparcv9/java -classpath /or

When I check prstat for a specific /oracle/product/cluster/jdk/jre/bin/sparcv9/java process I see:

PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
 18696 oracle     77M   28M sleep   29   10   0:14:27 0.0% java/17

For a /usr/bin/ssh process:

PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
  2756 oracle   6328K 3032K sleep   29   10   0:00:00 0.0% ssh/1

I am not sure how the remaining 12G is being used by Oracle sessions.
How can I confirm the total memory usage of the java processes and of the clusterware?
Am I missing something here?

Some diagnostic tools (prstat, for example) incorrectly sum the shared memory once for every process attached to it.
So you probably have more free memory than it appears - use sar -r or vmstat (the free column) to check the memory usage.
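
You can see the double counting for yourself by comparing the pmap summaries of two server processes: the SGA, an ISM segment, appears in full in the RSS of every process attached to it. A minimal sketch (the two PIDs below are hypothetical placeholders for oracle shadow processes):

# the last line of pmap -x is the per-process total; the shared SGA
# pages are counted again in every attached process's RSS
pmap -x 12345 | tail -1
pmap -x 12346 | tail -1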

sar -r

SunOS sunfire130 5.10 Generic_141414-07 sun4v    09/29/2010

00:00:00 freemem freeswap
00:10:00  379711  6365702
00:20:01  381464  6392244
00:30:00  370198  6201557
00:40:00  381115  6378368
00:50:00  380526  6374958
01:00:00  382979  6412045
01:10:00  380202  6374879
01:20:00  378190  6326520
01:30:01  380685  6361740
01:40:00  380966  6370946
01:50:00  379275  6349503
02:00:00  370904  6199362
02:10:00  377037  6310516
02:20:01  376730  6299128
02:30:00  377665  6307954
02:40:00  375837  6282499
02:50:00  373060  6250497
03:00:00  384821  6404191
03:10:00  396092  6567784
03:20:01  376387  6298590
03:30:00  353518  5895149
03:40:00  362947  6072349
03:50:00  362284  6060109
04:00:00  362085  6047641
04:10:00  359961  6006132
04:20:01  359425  6006543
04:30:00  358139  5998141
04:40:00  355569  5954072
04:50:00  353991  5926019
05:00:00  343458  5734082
05:10:00  352976  5902494
05:20:01  351096  5873102
05:30:00  348735  5838498
05:40:00  348238  5843592
05:50:00  345854  5798027
06:00:00  345263  5774854
06:10:01  355511  5948726
06:20:00  462428  8290025
06:30:00  572195 10711797
06:40:00  479550  8643372
06:50:00  415216  7203634
07:00:00  370971  6214658
07:10:00  366112  6108073
07:20:01  363548  6062562
07:30:00  372455  6188077
07:40:00  383153  6338427
07:50:00  381317  6322173

Average   378082  6374289

pagesize
8192

 vmstat 5
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr m0 m1 m3 m4   in   sy   cs us sy id
 0 0 0 5882040 5128568 494 1837 98294413907 157 155 0 0 0 3 -0 -0 16458 30336 15693 4 3 94
 0 2 0 2979128 2867880 563 2089 218 1866 1866 0 0 0 0 0 0 19741 69288 19390 5 4 92

So it seems there is around 3G of free memory, which matches the vmstat output.
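
(To check the arithmetic: freemem in the sar output is in pages, so with the 8192-byte page size shown above:)

# sar average of 378082 free pages * 8192 bytes/page ~= 2.9 GB
nawk 'BEGIN { printf "%.1f GB\n", 378082 * 8192 / (1024 * 1024 * 1024) }'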

I am concerned about this because we faced severe memory pressure on this node when the other node failed over, leaving around 5000 sessions on this node.
Sessions were failing constantly due to memory issues, and the problem was resolved only after the other node was fixed. There was a huge amount of swapping going on too.

sga_max_size is set to 5G, which is about 1/6th of the total RAM. That would mean more than 25G was consumed by PGA when the failover occurred, yet when I check the PGA usage on both instances it is around 6G per instance, so the combined usage is less than 12G - and that is why I am confused.

So I am not sure whether some other clusterware or java process is consuming all the memory. From the ps -ef output I again see around 300 each of these processes:

/usr/bin/ssh -o FallBackToRsh=no -o PasswordAuthentication=no -o StrictHostKeyC
/oracle/product/cluster/jdk/jre/bin/sparcv9/java -classpath /or

So I am not sure how to check the total memory consumption of these processes: I cannot simply sum them, as that might give a false value because of the child processes involved. Is there any way to identify the actual memory usage of these processes?

Could you provide more details - what are the exact error messages?

Which command do you use to check whether swapping occurs?

You don't have enough evidence yet.

And how do you check the PGA status?

Why do you care about child processes? What is the sum of the memory used by these processes?

We were receiving out-of-process-memory errors - unable to allocate .. bytes of memory for PGA.

In vmstat, the sr column was increasing constantly; I also used Oracle EM to monitor memory usage.

Yes, I can't comment on this now as I don't have the facts for it.

I am an Oracle DBA with little knowledge of Unix. Within Oracle I can check v$pgastat, v$process, etc.
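For example, something like this (a rough sketch, assuming I can connect as sysdba) totals the PGA actually allocated to the server processes:

# v$process.pga_alloc_mem is in bytes; sum it across all processes
echo "select round(sum(pga_alloc_mem)/1024/1024) pga_mb from v\$process;" |
  sqlplus -s "/ as sysdba"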

As I said earlier, I see 259 java processes, each consuming 26M RSS, which comes to 6.7G.
There are the same number of ssh processes, each consuming around 2.5M, which comes to 650M.
Is there any Solaris command that will directly give the sum of the memory used by all these processes?
I am using something like this to calculate their total memory usage:

ps -ef | grep /oracle/product/cluster/jdk/jre/bin/sparcv9/java | wc -l
258
prstat -p 19053,25131,18696
   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
 18696 oracle     77M   26M sleep   29   10   0:14:27 0.0% java/17
 25131 oracle     77M   26M sleep   29   10   0:14:19 0.0% java/17
 19053 oracle     77M   26M sleep   29   10   0:14:21 0.0% java/17
ps -ef | grep PasswordAuthentication=no | wc -l
259
prstat -p 2756, 8622, 2230
   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
  2756 oracle   6328K 2432K sleep   29   10   0:00:00 0.0% ssh/1
  8622 oracle   6328K 2504K sleep   29   10   0:00:00 0.0% ssh/1
  2230 oracle   6328K 2528K sleep   29   10   0:00:00 0.0% ssh/1

Based on the average RSS size, I am multiplying it by the total number of processes to compute their total memory usage - is this the right way of doing it?

Could you provide the exact error code and message (ORA- ...)?

The scan rate alone is not enough; check whether si/so have values different from 0
(or whether pi/po have high values) with vmstat -S .
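
Something like this (a sketch) will print only the intervals where swap-ins or swap-outs actually occur:

# skip the two header lines and the first sample (an average since boot);
# with -S, si is column 6 and so is column 7
vmstat -S 5 50 | nawk 'NR > 3 && ($6 + $7 > 0)'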

Could you post the output from:

set pages 200 lines 132
col name for a40
select * from v$pgastat;

Just sum the RSS values of all those processes:

ps -eorss,args |
  nawk 'END { 
    print s/1024, "MB"
    }
    /PasswordAuthentication=no/ {
      s += $1
      }'


Could you also post the output of the following command:

set pages 200 lines 132
select server, count(1) from v$session
group by server;

ORA-04030: out of process memory when trying to allocate 82456 bytes (pga heap,control file i/o buffer)

SQL> !vmstat -S
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  si  so pi po fr de sr m0 m1 m3 m4   in   sy   cs us sy id
 0 0 0 5864088 5114360 0  0 114122802779 166 164 0 0 1 3 -0 -0 16470 30371 15706 4 3 94

Number of sessions

SQL> set pages 200 lines 132
SQL> select server, count(1) from v$session
  2  group by server;

SERVER      COUNT(1)
--------- ----------
DEDICATED       2538

PGA usage

NAME                                          VALUE UNIT
---------------------------------------- ---------- ------------
aggregate PGA target parameter           2097152000 bytes
aggregate PGA auto target                 131072000 bytes
global memory bound                       209715200 bytes
total PGA inuse                          3829838848 bytes
total PGA allocated                      6324478976 bytes
maximum PGA allocated                    6921444352 bytes
total freeable PGA memory                 641400832 bytes
process count                                  2561
max processes count                            2850
PGA memory freed back to OS              7.7713E+10 bytes
total PGA used for auto workareas           2102272 bytes
maximum PGA used for auto workareas       150141952 bytes
total PGA used for manual workareas               0 bytes
maximum PGA used for manual workareas        537600 bytes
over allocation count                        400469
bytes processed                          1.9377E+11 bytes
extra bytes read/written                 6706253824 bytes
cache hit percentage                          96.65 percent
recompute count (total)                      403688

19 rows selected.

Memory usage by java processes and ssh processes

ps -eorss,args | nawk 'END { print s/1024, "MB" } /ssh/ { s += $1 }'
658.383 MB
ps -eorss,args | nawk 'END { print s/1024, "MB" } /java/ { s += $1 }'
5800.34 MB

I did try checking the rest of the processes, other than the oracle db, asm, ssh, java and external connections:

ps -eorss,args | grep -v oracleavt31 | grep -v java | grep -v PasswordAuthentication=no | grep -v oracleavt31 | grep -v asm | grep -v ora_ | nawk 'END { print s/1024, "MB" }  { s += $1 }'
1281.53 MB

So Oracle SGA+PGA comes to around 11G, the ASM SGA setting is 300M, ssh memory usage is 658M, java memory usage is 5.8G, and the other processes take around 1.3G. In total that comes to around 19G, so I am still missing 5G of memory and not sure how it is being utilized.

prstat -a -s rss shows the oracle user using 24G of memory.

Also, for a few processes the command line is getting truncated, for example:

ps -ef | grep PasswordAuthentication
oracle  2230  2190   0   Sep 16 ?           0:00 /usr/bin/ssh -o FallBackToRsh=no -o PasswordAuthentication=no -o StrictHostKeyC

and it is the same when I check the java processes.
Is there any way to see the complete command line without truncation?

You need far more PGA (>= 6G) than your current target (2G).
Bump it up to at least 8G.

Without options, vmstat displays a one-line summary
of the virtual memory activity since the system was booted.
You need at least vmstat -S 5 50 to get useful results.

Try to understand what those java processes are doing;
you can display more details with /usr/ucb/ps auxwww | fgrep java .
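
Alternatively, pargs prints the complete argument vector of a single process, e.g. (a sketch using one of the ssh PIDs from your earlier listing):

# pargs shows the full, untruncated command line for a PID
pargs 2230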

I already explained why the prstat summary output should not be trusted!

Thank you very much.

The pga_aggregate_target parameter is just a target value, isn't it? So even if it is set small and Oracle needs more memory, it will go and get it - is that not the case?
Do you still recommend setting PGA to around 8G? Is that just to make sure Oracle reserves the memory so that no other process can take it when Oracle actually needs it?

Regarding the Oracle background processes: these all use SGA memory, yet I see a different value for each process. Should I conclude that they all stay within the SGA, or will they stretch outside the SGA memory?

2686896 ora_psp0_avt32
2715088 ora_lms1_avt32
3261872 ora_j002_avt32
2715336 ora_lms2_avt32
3260336 ora_pz99_avt32
3227800 ora_asmb_avt32
2714480 ora_pmon_avt32
3219688 ora_q000_avt32
2710496 ora_dbw3_avt32
2725976 ora_smon_avt32
2733528 ora_s000_avt32
2953304 ora_dmon_avt32
3235696 ora_arc1_avt32
2715216 ora_lms5_avt32
3235688 ora_arc0_avt32
2715256 ora_lms6_avt32
2715320 ora_lms4_avt32
2703896 ora_ckpt_avt32
2715096 ora_dbw0_avt32
2715248 ora_lms7_avt32
2715360 ora_lms0_avt32
3260368 ora_q001_avt32
2735736 ora_cjq0_avt32
2732736 ora_mmon_avt32
2710848 ora_lmon_avt32
2721312 ora_reco_avt32
2712400 ora_lck0_avt32
3211512 ora_qmnc_avt32
2705352 ora_lgwr_avt32
2622864 ora_dism_avt32
2722368 ora_mmnl_avt32
2692112 ora_d000_avt32
2707472 ora_mman_avt32
2710488 ora_dbw2_avt32
2715776 ora_lms3_avt32
2718952 ora_dbw1_avt32
2711952 ora_lmd0_avt32
3259624 ora_j004_avt32
4570520 ora_j005_avt32
3260064 ora_j003_avt32
3207096 ora_rbal_avt32
3262000 ora_j000_avt32
3206960 ora_insv_avt32
4571280 ora_j001_avt32
4570184 ora_j007_avt32
4579240 ora_pz98_avt32
4570712 ora_j006_avt32
4570632 ora_j008_avt32
4570824 ora_j009_avt32
4713888 ora_diag_avt32
4577648 ora_pz93_avt32
4576624 ora_pz92_avt32
4577560 ora_pz97_avt32
4577576 ora_pz94_avt32
4571656 ora_o002_avt32
4575800 ora_pz90_avt32
4577520 ora_pz95_avt32
4575800 ora_pz89_avt32
4584536 ora_emn0_avt32
4575800 ora_pz88_avt32
4577560 ora_pz96_avt32
4575928 ora_pz91_avt32

vmstat

vmstat -S 5 50
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  si  so pi po fr de sr m0 m1 m3 m4   in   sy   cs us sy id
 0 0 0 5855176 5107136 0  0 116114603935 168 166 0 0 1 3 -0 -0 16483 30378 15720 4 3 94
 0 0 0 2608536 2540264 0  0 124 69 65 0  0  0  0  0  0 17361 58626 16488 10 3 87
 0 0 0 2614104 2544112 0  0 13 22 22  0  0  0  0  0  0 16116 16941 15440 9 2 88
 0 0 0 2611808 2540504 0  0 1992 27 27 0 0  0  0  0  0 17254 18751 16453 9 2 89
 0 0 0 2609656 2538664 0  0 6808543396 27 27 0 0 1 0 0 0 16591 34676 15744 12 5 83
 0 0 0 2607216 2535224 0  0 550 19 19 0  0  0  0  0  0 18155 54435 17277 13 7 80
 0 0 0 2615648 2543304 0  0  0 19 17  0  0 11  0  0  0 16155 14767 15394 8 2 91
 0 0 0 2604184 2535800 0  0  5 79 79  0  0  0  0  0  0 17573 69001 16663 11 4 85
 0 0 0 2611160 2538176 0  0 6812339727 44 44 0 0 0 0 0 0 16395 17328 15547 11 2 87
 0 1 0 2614408 2542520 0  0 68131839621 16 16 0 0 0 0 0 0 17002 24987 16228 9 3 89
 0 0 0 2615720 2544024 0  0 6813183962 10 10 0 0 0 0 0 0 16178 15207 15384 9 2 89
 0 0 0 2615568 2544320 0  0  0 32 32  0  0  0  0  0  0 17183 17673 16274 9 2 88
 0 0 0 2616136 2544240 0  0  0 19 19  0  0 12  0  0  0 16114 14547 15347 8 2 91
 0 0 0 2615824 2544600 0  0  0 52 48  0  0  0  0  0  0 16909 43463 16170 9 2 88
 0 0 0 2609704 2538928 0  0  0 51 51  0  0  0  0  0  0 16059 33665 15071 10 3 87
 0 0 0 2615872 2543832 0  0  0 32 32  0  0  0  0  0  0 16753 17561 15976 8 2 90
 0 0 0 2614328 2542432 0  0 20437019104 17 17 0 0 0 0 0 0 17384 51736 16964 11 5 84
 0 0 0 2614536 2542352 0  0 6800541983 8 8 0 0 0 0 0 0 17110 22122 16306 10 3 87
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  si  so pi po fr de sr m0 m1 m3 m4   in   sy   cs us sy id
 0 0 0 2616296 2543568 0  0  0 13 13  0  0 10  0  0  0 16133 16474 15379 8 2 90
 0 0 0 2611760 2536872 0  0 13628056864 49 46 0 0 0 0 0 0 17087 30483 16141 10 5 85
 0 0 0 2611976 2536688 0  0  0 41 41  0  0  0  0  0  0 16444 57780 15615 9 3 88
 0 0 0 2615752 2539648 0  0  0 16 16  0  0  0  0  0  0 17011 25215 16246 9 3 88
 0 0 0 2616608 2540632 0  0  0  5  5  0  0  0  0  0  0 16541 15918 15687 9 2 89
 0 0 0 2616296 2540560 0  0  0 10  8  0  0  0  0  0  0 16883 17348 16141 9 2 89
 0 0 0 2616760 2541712 0  0  0  5  5  0  0  0  0  0  0 15824 14300 15109 8 2 91
 0 0 0 2616808 2543384 0  0 68123397012 24 22 0 0 0 0 0 0 16746 16781 15902 9 2 89
 0 0 0 2611800 2538968 0  0 165 44 44 0  0  0  0  0  0 16656 99192 15917 10 3 87
 0 0 0 2616400 2542888 0  0 6811073702 41 41 0 0 0 0 0 0 16702 19093 15879 8 2 89
 0 0 0 2615568 2541728 0  0 10  6  6  0  0  0  0  0  0 17296 53964 16710 12 6 82
 0 0 0 2616576 2541736 0  0 40876570832 8 8 0 0 0 0 0 0 16948 20027 16190 10 2 88
 0 0 0 2616000 2541648 0  0  0 13 13  0  0  0  0  0  0 15940 14917 15126 8 2 91
 0 0 0 2613592 2539904 0  0 457 11 11 0  0  1  0  0  0 17123 22652 16033 11 2 87
 0 0 0 2610776 2536480 0  0  0 47 47  0  0  0  0  0  0 16702 64147 15674 10 3 86
 0 0 0 2611488 2536424 0  0  2 38 38  0  0  0  0  0  0 17120 35360 15958 10 4 86
 0 0 0 2616784 2540584 0  0  0  0  0  0  0  0  0  0  0 16137 15592 15216 10 2 88
 0 0 0 2617640 2541408 0  0  0 32 32  0  0  0  0  0  0 17171 18370 16215 9 2 89
 0 0 0 2617568 2541704 0  0 6804750760 10 10 0 0 20 0 0 0 16780 18186 15833 8 2 90
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  si  so pi po fr de sr m0 m1 m3 m4   in   sy   cs us sy id
 0 0 0 2617584 2541880 0  0  0 17 16  0  0  0  0  0  0 16603 16552 15689 10 2 88
 0 0 0 2617336 2541912 0  0  0 29 29  0  0  0  0  0  0 16217 40843 15438 9 2 89
 0 0 0 2575912 2510144 0  0 166 55 55 0  0  0  0  0  0 16900 74873 16043 9 3 88
 0 0 0 2593848 2522744 0  0  0  6  6  0  0  0  0  0  0 17332 54147 16659 13 6 81
 0 0 0 2617928 2541632 0  0  0  5  5  0  0  0  0  0  0 16665 19349 15901 9 2 89
 0 0 0 2617704 2542160 0  0  0  3  3  0  0  9  0  0  0 16119 15255 15259 8 2 91
 0 0 0 2617744 2542272 0  0  0 14 14  0  0  0  0  0  0 16584 17990 15685 10 2 88
 0 1 0 2618048 2542344 0  0 13622147405 11 11 0 0 0 0 0 0 16125 15588 15364 8 2 90
 0 1 0 2601264 2538312 0  0  2 29 29  0  0  0  0  0  0 16964 65974 16176 9 3 88
 0 0 0 2616584 2541088 0  0  0 10 10  0  0  0  0  0  0 16269 15705 15327 10 2 88
 0 2 0 2617464 2541712 0  0 6811073702 29 29 0 0 0 0 0 0 17082 17984 16231 8 2 90
 0 1 0 2607328 2537256 0  0 105 3  3  0  0  0  0  0  0 19861 30992 19624 10 3 87
 0 2 0 2591376 2525472 0  0  0 11 11  0  0  0  0  0  0 17124 20198 16274 8 2 90

They are all similar java processes:

   229  oracle   25479  0.0  0.17883226960 ?        S   Sep 16 15:14 /oracle/product/cluster/jdk/jre/bin/sparcv9/java -classpath /oracle/product/cluster/jdk/jre//lib/rt.jar:/oracle/product/cluster/jlib/cvu.jar:/oracle/product/cluster/jlib/srvm.jar -DCV_DESTLOC=/tmp -DCV_HOME=/oracle/product/cluster oracle.ops.verification.client.CluvfyDriver comp crs -display_status -n sunfire129,sunfire131,sunfire130

This is an Oracle Clusterware process, but I am not sure what this exact process does. Any suggestions?
I am also not sure why there are around 260 of these processes.

Yes, it's true. But it's also true that Oracle will be less efficient
when it doesn't manage to acquire the needed memory in one pass - your v$pgastat output above already shows an over allocation count of 400469, i.e. the target has been exceeded many times.

No, I won't, given the output of vmstat ...
No, it's not about reserving memory, it's about more efficient overall memory management.

They certainly go beyond the SGA memory - an Oracle process attaches to a shared memory segment (the SGA)
and also needs private memory (its portion of the PGA). So what you see is normal.

It appears that you have a very busy system.
Your system needs more memory than it currently has,
so you could:

  1. Try to tune the Oracle processes in order to reduce memory usage.
  2. Try to understand and tune the memory usage of the non-Oracle processes (see the ptree sketch below).
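
For the java processes in particular, the process tree often shows what keeps spawning them. A minimal sketch, using the PID from your /usr/ucb/ps listing above (substitute a current PID on your box):

# ptree prints the ancestry and children of a PID; the parent of the
# cluvfy java processes should reveal which script or daemon launches them
ptree 25479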

Consider that the Solaris kernel tries to optimize memory usage by distributing the available memory among the file system cache and other areas, so it is quite difficult to get an exact picture of the memory in use.

I believe that your system is not properly sized to support the current load on a single node (i.e. to be able to handle the current load on a single node, you'll need more than 32 GB of physical memory).

Thanks once again.

I am not sure why it should be so busy when SGA and PGA consumption is less than 11G at the moment. I do understand that PGA consumption would rise substantially during a failover, but taking the present situation into account I have:
11G of memory used by SGA+PGA
300M used by ASM
1.3G used by other processes, including root's
6.5G used by the cluster verify (java) and ssh processes
3G free
which totals 22.1G, so around 10G is still unaccounted for - and I am not sure what is consuming that 10G.

I have identified one of the problems: the cluster verify java process has spawned around 250 copies, and I don't see this process on any other cluster machine. It is only needed during installation, so I am quite confused about why it is running now. I will dig around to see why this has happened.
The other problem is the missing 10G of memory, and I don't have a clue how to get to the bottom of that one.
Any suggestions on how I can check this? Am I missing something here?

Could you post the output of the following commands:

echo "::memstat" | mdb -k

Warning: the following command could be quite resource intensive:

kstat -m zfs

I'm just saying that the fact that you don't see that 10G as free memory doesn't mean it's not available (the kernel will allocate memory for file system caches and will release it as soon as some process needs it).

echo "::memstat" | mdb -k
mdb: failed to open /dev/mem: Permission denied

I don't have the permissions, as I am not a sysadmin.

 kstat -m zfs
module: zfs                             instance: 0
name:   arcstats                        class:    misc
        c                               32529580032
        c_max                           32529580032
        c_min                           4066197504
        crtime                          240.281628246
        deleted                         0
        demand_data_hits                0
        demand_data_misses              0
        demand_metadata_hits            0
        demand_metadata_misses          0
        evict_skip                      0
        hash_chain_max                  0
        hash_chains                     0
        hash_collisions                 0
        hash_elements                   0
        hash_elements_max               0
        hdr_size                        0
        hits                            0
        l2_abort_lowmem                 0
        l2_cksum_bad                    0
        l2_evict_lock_retry             0
        l2_evict_reading                0
        l2_feeds                        0
        l2_free_on_write                0
        l2_hdr_size                     0
        l2_hits                         0
        l2_io_error                     0
        l2_misses                       0
        l2_rw_clash                     0
        l2_size                         0
        l2_writes_done                  0
        l2_writes_error                 0
        l2_writes_hdr_miss              0
        l2_writes_sent                  0
        memory_throttle_count           0
        mfu_ghost_hits                  0
        mfu_hits                        0
        misses                          0
        mru_ghost_hits                  0
        mru_hits                        0
        mutex_miss                      0
        p                               16264790016
        prefetch_data_hits              0
        prefetch_data_misses            0
        prefetch_metadata_hits          0
        prefetch_metadata_misses        0
        recycle_miss                    0
        size                            0
        snaptime                        1244341.11405159

module: zfs                             instance: 0
name:   vdev_cache_stats                class:    misc
        crtime                          240.281860426
        delegations                     0
        hits                            0
        misses                          0
        snaptime                        1244341.1205129

Let me know if this is of any help.

Ask your sysadmin(s) to run the first command for you and send you the output.

This should give you an idea of the current memory usage of the oracle user's processes (assuming a single Oracle instance; some shared memory segments will be missed):

ps -uoracle -opid= |
  xargs 2>/dev/null pmap -x |
    nawk '
    # pmap -x header lines look like "1234:  command args ..."
    /^[0-9]*:/ {
      if (/ora/) {          # lump every Oracle process into one bucket
        proc = "oracle"
        next
        }
      proc = $2
      }
    # on the pmap -x "total Kb" line, $5 is the Anon (private) column
    # and $NF the Locked column (the ISM/SGA segment shows up as locked)
    /total/ {
      if ($NF != "-") shm[proc] = $NF
      priv[proc] += $5
      }
    END {
      for (p in priv)
        printf "process: %s\n\t\tshared: %d KB\n\t\ttot private: %d KB\n", \
          p, shm[p], priv[p]
      }'
The sysadmin ran the mdb command for me:

root@sunfire130 # echo "::memstat" | mdb -k
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                     572519              4472   14%
Anon                      3087015             24117   75%
Exec and libs               64314               502    2%
Page cache                   9661                75    0%
Free (cachelist)           340233              2658    8%
Free (freelist)             28226               220    1%

Total                     4101968             32046
Physical                  4096728             32005
sh memory.sh
process: -csh
                shared: 0 KB
                tot private: 632 KB
process: asm_lms0_+ASM2
                shared: 126984 KB
                tot private: 12448 KB
process: asm_dbw0_+ASM2
                shared: 126984 KB
                tot private: 2488 KB
process: asm_lmd0_+ASM2
                shared: 126984 KB
                tot private: 14040 KB
process: xargs
                shared: 0 KB
                tot private: 120 KB
process: asm_lck0_+ASM2
                shared: 126984 KB
                tot private: 3136 KB
process: asm_gmon_+ASM2
                shared: 126984 KB
                tot private: 3648 KB
process: asm_pmon_+ASM2
                shared: 126984 KB
                tot private: 2328 KB
process: sqlplus
                shared: 0 KB
                tot private: 3320 KB
process: asm_mman_+ASM2
                shared: 126984 KB
                tot private: 2024 KB
process: /usr/lib/ssh/sshd
                shared: 0 KB
                tot private: 552 KB
process: sh
                shared: 0 KB
                tot private: 152 KB
process: oracle
                shared: 126984 KB
                tot private: 10551136 KB
process: nawk
                shared: 0 KB
                tot private: 392 KB
process: asm_lmon_+ASM2
                shared: 126984 KB
                tot private: 4320 KB
process: asm_o000_+ASM2
                shared: 126984 KB
                tot private: 832 KB
process: asm_ckpt_+ASM2
                shared: 126984 KB
                tot private: 1728 KB
process: asm_psp0_+ASM2
                shared: 126984 KB
                tot private: 1192 KB
process: asm_rbal_+ASM2
                shared: 126984 KB
                tot private: 2880 KB
process: asm_lgwr_+ASM2
                shared: 126984 KB
                tot private: 2280 KB
process: asm_smon_+ASM2
                shared: 126984 KB
                tot private: 1696 KB
process: asm_diag_+ASM2
                shared: 126984 KB
                tot private: 2896 KB
process: /usr/bin/ssh
                shared: 0 KB
                tot private: 82136 KB

Thanks.
Could you please post the output of the following modified command:

ps -uoracle -opid= |
  xargs 2>/dev/null pmap -x |
    nawk '
    # classify each pmap header line; only oracle and asm processes
    # are tallied, everything else is skipped
    /^[0-9]*:/ {
      proc = ""
      if (/ora/) proc = "oracle"
      else if (/asm/) proc = "asm"
      next
      }
    /total/ {
      if (proc != "") {
        if ($NF != "-") shm[proc] = $NF
        priv[proc] += $5
        }
      }
    END {
      for (p in priv)
        printf "process: %s\n\t\tshared: %d KB\n\t\ttot private: %d KB\n", \
          p, shm[p], priv[p]
      }'

And this one:

ipcs -a


Just a quick recap: the general memory drill-down is clear:

Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                     572519              4472   14%
Anon                      3087015             24117   75%
Exec and libs               64314               502    2%
Page cache                   9661                75    0%
Free (cachelist)           340233              2658    8%
Free (freelist)             28226               220    1%

Total                     4101968             32046
Physical                  4096728             32005

We have:

  • 14% of memory used by the kernel
  • 2% for executables and libraries
  • ~9% free memory (cachelist + freelist)

Now we want to know where and how the remaining 75% / 24GB of memory is used.

 ipcs -a
IPC status from <running system> as of Thursday September 30 12:48:07 GMT 2010
T         ID      KEY        MODE        OWNER    GROUP  CREATOR    CGROUP CBYTES  QNUM QBYTES LSPID LRPID   STIME    RTIME    CTIME
Message Queues:
q          0   0x61037083 --rw-------     root     root     root      root      0     0  65536  1013  1144 12:48:06 12:48:07  5:37:46
T         ID      KEY        MODE        OWNER    GROUP  CREATOR    CGROUP NATTCH      SEGSZ  CPID  LPID   ATIME    DTIME    CTIME
Shared Memory:
m         16   0xb2d23620 --rw-r-----   oracle oinstall   oracle  oinstall   2493 5251284992 14099 25818 12:47:53 12:47:53  5:48:42
m          1   0x2fe9b7f0 --rw-r-----   oracle oinstall   oracle  oinstall     18  130031616  3446 24816 12:47:14 12:47:15  5:39:26
T         ID      KEY        MODE        OWNER    GROUP  CREATOR   CGROUP NSEMS   OTIME    CTIME
Semaphores:
s         65   0x9ec866c4 --ra-r-----   oracle oinstall   oracle oinstall   312  5:49:02  5:48:44
s         64   0x9ec866c3 --ra-r-----   oracle oinstall   oracle oinstall   312 no-entry  5:48:44
s         63   0x9ec866c2 --ra-r-----   oracle oinstall   oracle oinstall   312 no-entry  5:48:44
s         62   0x9ec866c1 --ra-r-----   oracle oinstall   oracle oinstall   312 no-entry  5:48:44
s         61   0x9ec866c0 --ra-r-----   oracle oinstall   oracle oinstall   312 no-entry  5:48:44
s         60   0x9ec866bf --ra-r-----   oracle oinstall   oracle oinstall   312 no-entry  5:48:44
s         59   0x9ec866be --ra-r-----   oracle oinstall   oracle oinstall   312 no-entry  5:48:44
s         58   0x9ec866bd --ra-r-----   oracle oinstall   oracle oinstall   312  8:27:40  5:48:44
s         57   0x9ec866bc --ra-r-----   oracle oinstall   oracle oinstall   312 12:48:07  5:48:44
s         56   0x9ec866bb --ra-r-----   oracle oinstall   oracle oinstall   312 12:48:07  5:48:44
s         55   0x9ec866ba --ra-r-----   oracle oinstall   oracle oinstall   312 12:48:07  5:48:44
s         54   0x9ec866b9 --ra-r-----   oracle oinstall   oracle oinstall   312 12:48:07  5:48:44
s         53   0x9ec866b8 --ra-r-----   oracle oinstall   oracle oinstall   312 12:48:07  5:48:44
s         52   0x9ec866b7 --ra-r-----   oracle oinstall   oracle oinstall   312 12:48:07  5:48:44
s         51   0x9ec866b6 --ra-r-----   oracle oinstall   oracle oinstall   312 12:48:07  5:48:44
s         50   0x9ec866b5 --ra-r-----   oracle oinstall   oracle oinstall   312 12:48:07  5:48:44
s         49   0x9ec866b4 --ra-r-----   oracle oinstall   oracle oinstall   312 12:48:07  5:48:44
s          5   0xe7e53e34 --ra-r-----   oracle oinstall   oracle oinstall    44 12:47:38  5:39:26
s          0   0x710371cd --ra-ra-ra-     root     root     root     root     1 12:38:11  5:37:38
sh memory.sh
process: oracle
                shared: 126984 KB
                tot private: 9954600 KB
process: asm
                shared: 126984 KB
                tot private: 57544 KB

OK,
now I see circa 15GB (10 private, 5 shared) for Oracle - the 5251284992-byte shared segment in your ipcs output is the roughly 5GB SGA. (The script of course reports a wrong result for the shared memory of the oracle processes, for the reason explained earlier.)
Could you confirm that the free memory is still 2.5GB ( vmstat -S 2 10 )?

If that's the case, you should check the memory used by other users too (as root, with ps -eopid= ... ).
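
Something like this (a sketch; run it as root so every process is visible) would give a per-user summary:

# sum the resident set size per user, reported in MB
ps -eouser,rss |
  nawk 'NR > 1 { rss[$1] += $2 }
    END { for (u in rss) printf "%-12s %10.1f MB\n", u, rss[u]/1024 }'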


I realize this actually puts us back where we were when you first asked the question ...

I saw you posted the same question on the official Oracle forums - did you get any useful answers there? (I cannot find the thread right now.)


Is the number of clusterware processes still so high?


And another one: could you confirm that the value of sga_max_size (not sga_target) is 5GB?
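
(A quick way to check, a sketch assuming sysdba access on the node:)

# "show parameter sga" lists every parameter whose name contains "sga"
echo "show parameter sga" | sqlplus -s "/ as sysdba"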

vmstat -S 2 10
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  si  so pi po fr de sr m0 m1 m3 m4   in   sy   cs us sy id
 0 0 0 5657056 4946968 0  0 179402309854 203 201 0 0 1 3 -0 -0 16487 30555 15737 4 3 94
 0 0 0 3713728 3290776 0  0 6494660262341 0 0 0 0 0 0 0 0 15368 25258 14641 5 2 93
 0 0 0 3709600 3285520 0  0 6246461904991 47 47 0 0 0 0 0 0 16429 114358 15839 5 2 92
 0 0 0 3713600 3286216 0  0 6205564249761 0 0 0 0 0 0 0 0 16672 16567 15848 4 2 93
 0 0 0 3713656 3283752 0  0 6188747002471 0 0 0 0 0 0 0 0 16082 14973 15388 4 2 94
 1 0 0 3711984 3279496 0  0 5920576609748 8 8 0 0 0 0 0 0 15907 15596 15172 5 2 93
 0 0 0 3683256 3279104 0  0 5778945759465 145 106 0 0 0 0 0 0 15537 46346 14897 5 2 94
 0 0 0 3699112 3264800 0  0 6081406690378 98 98 0 0 0 0 0 0 17116 183348 16591 6 4 90
 0 0 0 3712864 3281904 0  0 5933765549732 129 129 0 0 0 0 0 0 16671 68771 16118 6 4 90
 1 0 0 3713648 3283944 0  0 9696136469918 0 0 0 0 0 0 0 0 15815 24052 15085 5 2 93

Clusterware process memory usage is still high. I think this is some bug, as I wouldn't have expected the cluster verify processes to be running after the installation.

As expected, I didn't receive any help on the Oracle forums, since this is more of a Unix question, though one very much tied up with Oracle.

Could you also post the output of ps -falde - the entire output? Thanks!