No Space Left - Memory/Swap issue

I'm having a bit of a problem with Solaris 10u8: one of our applications requests memory and is told "no space left".

The breakdown:
24GB Physical Memory
8GB swap

At the time of occurrence, here's what the memory breakdown looks like:

Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                     636499              4972   21%
ZFS File Data              268245              2095    9%
Anon                      1973955             15421   64%
Exec and libs               18347               143    1%
Page cache                  41447               323    1%
Free (cachelist)            30647               239    1%
Free (freelist)            115911               905    4%

Total                     3085051             24101
Physical                  3063472             23933

Filesystem            Size  Used Avail Use% Mounted on
swap                  789M  704K  789M   1% /tmp

Paging stats:
    memory            page         executable  anonymous   filesystem
   swap    free  re   mf fr de sr epi epo epf api apo apf fpi fpo fpf
3688208 1356288 285 1777  0  0  0   0   0   0   0   0   0   2   0   0
 762656 1185672 136  674  0  0  0   0   0   0   0   0   0   0   0   0
 757696 1180080 395 2615  0  0  0   0   0   0   0   0   0   0   0   0
 712176 1178880 353 3855  0  0  0   0   0   0   0   0   0   0   0   0
 566184 1168464 269 2736  0  0  0   0   0   0   0   0   0   0   0   0

Virtual Memory stats:
kthr      memory               page               disk           faults        cpu
r b w    swap    free  re   mf pi po fr de sr  s0  s1 s2 --   in     sy   cs us sy id
0 0 0 3688208 1356288 285 1777  2  0  0  0  0  10  10 -0  0 1868 341675 1735  5  5 90
1 0 0  749288 1171536 165 1551  0  0  0  0  0   1   0  0  0 3000 590641 3019 12  6 82
0 0 0  716488 1182048 177 2050  4  0  0  0  0   0   0  0  0 3700 470588 3925 11  6 83
0 0 0  762504 1182488  93  558  0  0  0  0  0 124 184  0  0 4330 612541 4701 11  7 82
0 0 0  762304 1181776 105  489  0  0  0  0  0  86 247  0  0 2730 604930 2779  9  4 87

As you can see, there is 900MB physical memory left and 1.1GB of swap free. There are around 1100 processes, all running as a spawned child of one process. I've looked at projmod for limitations, but haven't found any.

Any thoughts? I'm fresh out of new ideas...

Just add a couple of GB of swap space. You have as low as 566MB free, not 1.1GB.

Well, right I see that. But Solaris shouldn't be reporting "no space left" when, technically, there is. Am I right?

It depends how much memory your application is requesting.

@aychbee45: I'm afraid not. Solaris reports "no space left" when a reservation cannot be fulfilled. Depending on how much memory is requested, this can happen with a significant amount of free swap.

Each session is about 30-50MB (including shared libs), of which 10-20MB is anon. We go from 1080 users to 1087 and the error starts happening.

Even if there is 900MB physical memory available as well?

You cannot think of available memory as "physical + swap". Swap is only used as a backup in case of a physical memory shortage, and Solaris requires that backing for every running process. So to avoid errors you should add a few more GB of swap.

Oh, I completely agree. But with almost 900MB free physical memory, why is this necessary?

Are you asking why swap is necessary? It is how this system works :slight_smile: It always needs to be sure that when available physical memory drops to zero, it will have disk space to send processes and their data to in order to free memory up. And from your vmstat output it seems that you may have a real swap shortage. ~500MB of free swap is really not enough.

No, I completely agree. But why is the system not using any of the remaining physical memory? Also, shouldn't we be seeing a nonzero scan rate (sr)?

I realize that part of physical memory is used for swap and then there is "disk" swap space. I'm just not understanding why our system isn't using the remaining 900MB of physical (or at least 500 of it, with the 500MB remaining swap).

How much free physical memory you have is irrelevant. Free memory is part of virtual memory, so it backs reservations too. You are confusing memory reservation and usage.

Well, I haven't heard of using part of physical memory as swap, but the main rule I learned while working with SunOS is that there should always be a bit more swap than physical memory available. I don't know the exact mechanics behind memory management in Solaris though :slight_smile:

That all makes sense. I guess I just would have assumed vmstat would show the system scanning for any unreserved memory to use.

From what I'm seeing, it's like my system has used all the swap it can, all the memory it can and the remaining is reserved. But, at that point I would think it would begin scanning to free memory. That's usually what I would expect.

That depends on which definition of swap you take. When swap means virtual memory, as in the statistics posted, it does.

Not necessarily. The rule is that there must be enough virtual memory for all reservations to fit. The amount of memory needed is not related to the amount of RAM installed but to the number of applications and their requirements.

You are confusing physical and virtual memory. The page scanner (if this is what you are referring to) is there to scan used physical memory in case of demand to page it out. This mechanism has no relationship with virtual memory reservations.
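The difference between reservation and usage can be sketched with a toy ledger. This is only an illustration of strict (no-overcommit) accounting, not Solaris kernel code; the class name, method names, and sizes are all made up for the example.

```python
class ReservationLedger:
    """Toy model of strict (no-overcommit) virtual memory accounting."""

    def __init__(self, total_kb):
        self.total_kb = total_kb   # RAM plus disk swap available to back reservations
        self.reserved_kb = 0       # memory promised to processes
        self.touched_kb = 0        # reserved memory that has actually been written to

    def available_kb(self):
        # Only unreserved virtual memory can satisfy a new request.
        return self.total_kb - self.reserved_kb

    def reserve(self, kb):
        # Pure bookkeeping: no pages are scanned or paged out here.
        if kb > self.available_kb():
            return False           # the caller would see "no space left"
        self.reserved_kb += kb
        return True

    def touch(self, kb):
        # Writing to reserved memory consumes real pages, up to the reservation.
        self.touched_kb = min(self.reserved_kb, self.touched_kb + kb)

    def untouched_kb(self):
        # Reserved but never written: looks "free" in physical terms, yet is spoken for.
        return self.reserved_kb - self.touched_kb


# Hypothetical numbers, chosen only to show the effect.
demo = ReservationLedger(total_kb=1000)
demo.reserve(900)            # succeeds
demo.touch(100)              # only 100 KB are really in use
refused = not demo.reserve(200)
print(refused, demo.untouched_kb(), demo.available_kb())
```

In this model the second request is refused even though most of the reserved memory was never touched, which is the situation described in this thread: plenty of memory looks free, and no page scanning is triggered, because the failure is an accounting decision.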

So in other words, my remaining physical memory and swap is spoken for. It may never be used, but it's spoken for.

Here is a pmap of one of our processes:

 Address  Kbytes     RSS    Anon  Locked Mode   Mapped File
00010000   23248   22416       -       - r-x--  frmweb
016D2000     632     600     144       - rwx--  frmweb
01770000     576       -       -       - rwx--  frmweb
01800000     344       8       8       - rwx--  frmweb
01856000    7848    5072    3936       - rwx--    [ heap ]
FE920000      72      72      32       - rw--R  dev:256,65540 ino:33028
FE940000     136     136      48       - rw--R  dev:256,65540 ino:42199
FE970000      72      72      32       - rw--R  dev:256,65540 ino:33028
FE990000     240     240       -       - r-x--  libresolv.so.2
FE9DC000      16      16      16       - rwx--  libresolv.so.2
FE9F0000      16      16       -       - r-x--  nss_dns.so.1
FEA04000       8       8       8       - rwx--  nss_dns.so.1
FEA10000     408     408     120       - rw--R  dev:256,65540 ino:46165
FEA80000    1296     512       -       - r-x--  libX11.so.4
FEBD4000      24      24      16       - rwx--  libX11.so.4
FEBE0000       8       8       8       - rw---  libX11.so.4
FEC00000     336     256       -       - r-x--  libXt.so.4
FEC64000      24      24      24       - rwx--  libXt.so.4
FEC70000       8       8       8       - rw---  libXt.so.4
FEC80000    2096     872       -       - r-x--  libXm.so.4
FEE9C000      88      88      88       - rwx--  libXm.so.4
FEEB2000       8       -       -       - rwx--  libXm.so.4
FEEC0000      64      64      64       - rwx--    [ anon ]
FEEE0000      32      32       -       - r-x--  nss_files.so.1
FEEF8000       8       8       8       - rwx--  nss_files.so.1
FEF00000     584     584       -       - r-x--  libnsl.so.1
FEFA2000      40      40      40       - rwx--  libnsl.so.1
FEFAC000      24      16      16       - rwx--  libnsl.so.1
FEFC0000      64      64      64       - rwx--    [ anon ]
FEFE0000      64      16      16       - rwx--    [ anon ]
FF000000    1416    1120       -       - r-x--  libnnz10.so
FF170000     112     112      88       - rwx--  libnnz10.so
FF190000       8       8       8       - rwx--    [ anon ]
FF1A0000      48      48       -       - r-x--  libsocket.so.1
FF1BC000       8       8       8       - rwx--  libsocket.so.1
FF1D0000      16      16       -       - r-x--  libthread.so.1
FF1E0000       8       8       -       - r-x--  libdl.so.1
FF1F2000       8       8       8       - rwx--  libdl.so.1
FF200000    1216    1216       -       - r-x--  libc.so.1
FF330000      40      40      40       - rwx--  libc.so.1
FF33A000       8       8       8       - rwx--  libc.so.1
FF340000      24      16      16       - rwx--    [ anon ]
FF350000      96      40       -       - r-x--  libXext.so.0
FF370000       8       8       8       - rwx--    [ anon ]
FF378000       8       8       8       - rwx--  libXext.so.0
FF380000       8       -       -       - rw---  libXext.so.0
FF390000       8       8       8       - rwx--    [ anon ]
FF3A0000       8       8       -       - r-x--  libc_psr.so.1
FF3B0000     208     208       -       - r-x--  ld.so.1
FF3E8000       8       8       -       - rwxs-    [ anon ]
FF3F0000       8       8       -       - r--s-  dev:256,65539 ino:46641
FF3F4000       8       8       8       - rwx--  ld.so.1
FF3F6000       8       8       8       - rwx--  ld.so.1
FFBF0000      64      64      64       - rwx--    [ stack ]
-------- ------- ------- ------- -------
total Kb   41736   34664    4976       -

We have about 1090 of these at peak. Is there any way to show how much a process has reserved as well as actually consumed?
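One rough way to approach that question is to estimate reservations from the pmap -x listing itself. The sketch below assumes that writable, private mappings without the R (no-reserve) flag reserve swap for their full size, and that read-only and shared mappings reserve nothing. That is a simplification, and the function name is hypothetical. As far as I know, pmap -S on Solaris reports swap reservations per mapping directly and would be more reliable than this guess.

```python
def estimate_reserved_kb(pmap_x_text):
    """Sum Kbytes of writable, private, reserving mappings in `pmap -x` output."""
    total = 0
    for line in pmap_x_text.splitlines():
        fields = line.split()
        # Expected columns: Address Kbytes RSS Anon Locked Mode Mapped-file...
        # Header, separator, and total lines fail the isdigit() check.
        if len(fields) < 6 or not fields[1].isdigit():
            continue
        kbytes, mode = int(fields[1]), fields[5]
        writable = len(mode) >= 2 and mode[1] == "w"
        shared = "s" in mode       # e.g. rwxs-
        noreserve = "R" in mode    # e.g. rw--R
        if writable and not shared and not noreserve:
            total += kbytes
    return total


# A few lines copied from the listing above.
sample = """\
 Address  Kbytes     RSS    Anon  Locked Mode   Mapped File
00010000   23248   22416       -       - r-x--  frmweb
016D2000     632     600     144       - rwx--  frmweb
01856000    7848    5072    3936       - rwx--    [ heap ]
FE920000      72      72      32       - rw--R  dev:256,65540 ino:33028
FF3E8000       8       8       -       - rwxs-    [ anon ]
FFBF0000      64      64      64       - rwx--    [ stack ]
"""
print(estimate_reserved_kb(sample))
```

Comparing this estimate with the Anon column total gives a feel for how much is reserved versus actually touched in one process.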

Precisely. Solaris doesn't overcommit memory, so it makes sure any reservation made (i.e. malloc) is backed by either free physical memory or swap space.

A command whose output you have not posted is:

swap -s

It gives the whole virtual memory usage.

Note that a process copying a large file to /tmp might also cause the "no space left" message you see, as tmpfs is backed by virtual memory.

Here is a swap -s from a previous time we received the error.

total: 15836552k bytes allocated + 6354480k reserved = 22191032k used, 157288k available

I'm still baffled why we wouldn't see sr though.

What "sr"?
You only have 157MB available here.
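The arithmetic behind the swap -s line posted above can be checked with a short script. This is a minimal sketch that assumes the one-line format shown in this thread; the function name and dictionary keys are made up for the example.

```python
import re


def parse_swap_s(line):
    """Parse one line of `swap -s` output into kilobyte figures."""
    nums = [int(n) for n in re.findall(r"(\d+)k", line)]
    if len(nums) != 4:
        raise ValueError("unexpected swap -s format: %r" % line)
    allocated, reserved, used, available = nums
    return {
        "allocated_kb": allocated,        # reserved and actually touched
        "reserved_kb": reserved,          # reserved but not yet touched
        "used_kb": used,                  # allocated + reserved
        "available_kb": available,        # what is left for new reservations
        "total_kb": used + available,     # whole virtual memory pool
    }


line = ("total: 15836552k bytes allocated + 6354480k reserved = "
        "22191032k used, 157288k available")
stats = parse_swap_s(line)
pct_available = 100.0 * stats["available_kb"] / stats["total_kb"]
print(stats["available_kb"] / 1024.0, pct_available)
```

For that line, allocated plus reserved matches the reported used figure, and the available amount is under 1% of the total. At the 10-20MB of anon memory per session quoted earlier, that headroom would cover only roughly 7 to 15 more sessions, which is at least consistent with the error appearing after a handful of extra users.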

And the output of swap -l might be informative as well.

What kind of application are you running? Is it possible that you are hitting some kind of memory limit in the application (Java?)?