Memory leaks on compilations

Hello!

I've been struggling for quite a few hours with memory leaks on this
machine. I'm running Linux 2.6.32-5-686, and the problem is as follows:

Some months ago, I compiled kernel 2.6.33-2-686 without any issues
on this same machine. This week I tried compiling GNUzilla Icecat
and the 2.6.35 kernel. When compiling the kernel, right as the build
starts, all physical memory (2GB) fills up, and once the machine begins
to swap, it becomes completely unresponsive. With Icecat, it
takes 15 minutes before this happens, and today I left it
running for almost 5 hours, unresponsive, before killing the machine.

I had to kill the machine almost 10 times this week while it was under
heavy I/O activity. I've already tried changing vm.swappiness to 60, and
vm.overcommit_memory to 2, but nothing helps. I also disabled the memlock
limit available for audio applications, with no effect. I increased my
swap to 10GB, but that didn't help either.

Here is the output of some commands:

# sysctl -a | grep vm. | sort
error: permission denied on key 'net.ipv4.route.flush'
error: permission denied on key 'net.ipv6.route.flush'
vm.block_dump = 0
vm.dirty_background_bytes = 0
vm.dirty_background_ratio = 10
vm.dirty_bytes = 0
vm.dirty_expire_centisecs = 3000
vm.dirty_ratio = 20
vm.dirty_writeback_centisecs = 500
vm.drop_caches = 0
vm.highmem_is_dirtyable = 0
vm.hugepages_treat_as_movable = 0
vm.hugetlb_shm_group = 0
vm.laptop_mode = 0
vm.legacy_va_layout = 0
vm.lowmem_reserve_ratio = 256    32    32
vm.max_map_count = 65530
vm.memory_failure_early_kill = 0
vm.memory_failure_recovery = 1
vm.min_free_kbytes = 3789
vm.mmap_min_addr = 65536
vm.nr_hugepages = 0
vm.nr_overcommit_hugepages = 0
vm.nr_pdflush_threads = 0
vm.oom_dump_tasks = 0
vm.oom_kill_allocating_task = 0
vm.overcommit_memory = 0
vm.overcommit_ratio = 50
vm.page-cluster = 3
vm.panic_on_oom = 0
vm.percpu_pagelist_fraction = 0
vm.scan_unevictable_pages = 0
vm.stat_interval = 1
vm.swappiness = 60
vm.vdso_enabled = 1
vm.vfs_cache_pressure = 100
$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 16382
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) unlimited
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
$ free -m
             total       used       free     shared    buffers     cached
Mem:          1987        437       1549          0         51        192
-/+ buffers/cache:        193       1793
Swap:        10240          0      10240
# swapon -s
Filename				Type		Size	Used	Priority
/dev/sda1                               partition	10486776	0	-1

Your help will be much appreciated!
Teresa e Junior

I suggest you retry after booting another kernel.
Then try another compiler if you can, in whichever order you like.

Memory leaks are usually caused by the running program, here the compiler (which may be buggy). It might also be the kernel, but that is very rare and usually due to a faulty driver.

If it still happens, then the sources are so tight that the compiler is hitting a real bug.

Regards

Hello, remi75!

I tried it with 3 different kernels: the Debian Kernel, the Zen Kernel, and the Grml Kernel. The output of "sysctl -a | grep vm. | sort" is from the Debian Kernel. The Zen Kernel sets swappiness to 0, but the Debian sets it to 60, so... and the Grml is similar to the Debian one.

I'd like to try, but I don't know how to use another compiler. I'm compiling from Debian sources, so it is just "debuild -k$GPGKEY", and the compiler is probably called from debian/rules.

If it is a faulty driver, how could I debug it (without freezing the system again and having to kill it)?

I don't think the problem is in the sources. I compiled the Debian kernel a few months ago and didn't notice anything wrong.

Could there be some problem with my partitions? Though when I succeeded in compiling the kernel, I had no swap at all (I have already tried "swapoff -a", and it just started killing many processes...)

Thanks for your help!
Teresa e Junior

Try debugging with IBM Rational Purify.

Oh, I think I will have to compile it! :wink:

Thanks anyway!

Compile what (the kernel)? Most probably the compiler is already buggy!

Do an apt-get update/upgrade; if one of the packages to be updated is a compiler (gcc***), go ahead.

Otherwise, please give me more information on your distro, gcc -v, etc.

Regards

Debug what, the compiler? It would have to be a debug build of gcc; besides,
I doubt the user (no offence) will be able to...

Regards

Yes, but that's the definition of a bug; otherwise it would be called a non-working program :wink:

Your compiler worked with sourceX.c but might not with sourceY.c.
That's why it would be called a bug...

I am still waiting for your details.

regards

Hello! Here are some details.

$ gcc -v
Using built-in specs.
Target: i486-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.4.5-2'
--with-bugurl=file:///usr/share/doc/gcc-4.4/README.Bugs
--enable-languages=c,c++,fortran,objc,obj-c++
--prefix=/usr --program-suffix=-4.4 --enable-shared --enable-multiarch
--enable-linker-build-id --with-system-zlib --libexecdir=/usr/lib
--without-included-gettext --enable-threads=posix
--with-gxx-include-dir=/usr/include/c++/4.4 --libdir=/usr/lib --enable-nls
--enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc
--enable-targets=all --with-arch-32=i586 --with-tune=generic
--enable-checking=release --build=i486-linux-gnu --host=i486-linux-gnu
--target=i486-linux-gnu
Thread model: posix
gcc version 4.4.5 (Debian 4.4.5-2)
$ uname -a
Linux localhost 2.6.35-grml #1 SMP PREEMPT Sat Sep 4 10:58:14 UTC 2010 i686 GNU/Linux

But I also tried with kernels 2.6.35-6.dmz.2-liquorix-686 and 2.6.32-5-686 from Debian.

After talking to you I tried a couple of different things, including "ulimit -v NUMBER" and "ulimit -m NUMBER" before compiling, but these don't seem to be respected. This time I managed to kill the build before it killed the machine, so the build log is attached. You will notice that it goes mad right at the beginning of the build.

This one also looks problematic, the BSD daemon doesn't look so evil:

         (__) 
         (oo) 
   /------\/ 
  / |    ||   
 *  /\---/\ 
    ~~   ~~   
...."Have you mooed today?"...

Any more information you might need, I'll be glad to give.

Best regards,
Teresa e Junior
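
A note on the limits tried above: on current Linux kernels "ulimit -m" (the resident set size limit) is not enforced, and "ulimit -v" caps the address space of each process separately, so a build that spawns many processes can still exhaust memory in total. A minimal sketch, assuming a shell with "ulimit -v" support (bash or dash), of applying such a cap to a single command only; `run_capped` is a hypothetical helper name:

```shell
# Run a command with a per-process virtual memory cap, given in kB.
# The subshell keeps the limit from sticking to the calling shell.
run_capped() {
    limit_kb=$1
    shift
    (
        ulimit -v "$limit_kb" || exit 1
        exec "$@"
    )
}

# Usage (hypothetical numbers): cap each process at about 1.5GB
# run_capped 1500000 debuild -k"$GPGKEY"
```

Because the cap is per process, this would stop one runaway compiler but not a large number of small ones.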

I forgot to mention: I did a dist-upgrade this morning, but still no improvement.

According to the log, you are trying to compile the kernel?

Your issue started with compiling Icecat, right?

First, is your machine 64-bit capable?

..

ulimit has nothing to do with it :slight_smile: You would have had an out-of-memory message...

But you can try to compile as root, just in case.

And during this morning's upgrade, were gcc and glibc upgraded?

regards

Yes, a couple more things:

Please check Icecat's documentation and make sure you have all the right dependencies (gcc version, kernel, glibc...). Some software requires specific versions of the compiler, kernel, etc.

And if you can, post a log of compiling Icecat.

regards

I can compile neither Icecat nor the kernel.

Yes, it is 64-bit capable, but since ALSA doesn't work right with the 64-bit kernel for me (don't ask me why), and I had to install a lot of 32-bit libs on it, I stayed with i686.

Compile as root? I think debuild uses fakeroot when needed.

gcc was upgraded, but I browsed the Debian bugs and couldn't find anything related, and the changelogs don't show anything important.

"sudo apt-get build-dep icecat" installs everything. I had a look at the docs; they list libpango libpangoxft libpangoft2 libfreetype libxft libgtk2 libx11, all of which I have.

The Icecat build log is attached. I left this one running for about five hours before killing the machine. The kernel build I killed as soon as it started going mad.

Thanks for your help so far! I hope we'll get there!
Teresa e Junior

Is this a dedicated machine or a VPS?

(Sorry, I did not read the entire thread... VPS? Dedicated machine?)

OK, the 4.4.5 version of GCC must be buggy, because on gnu.org it's not even listed as a release.

Can you please do an apt-cache search "^gcc*" or something similar, just to list the gcc versions available for your machine?

Not to get too personal :wink: but what CPU do you have? And what is the exact version of your distro?

I already see 2GB of RAM.

regards

It's a dedicated machine

I could try the older 4.1.2-29 or 4.3.5-4, or the experimental 4.5.1-8

$ cat /proc/cpuinfo
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Pentium(R) Dual  CPU  T3400  @ 2.16GHz
stepping	: 13
cpu MHz		: 2161.400
cache size	: 1024 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
apicid		: 0
initial apicid	: 0
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm lahf_lm
bogomips	: 4324.91
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Pentium(R) Dual  CPU  T3400  @ 2.16GHz
stepping	: 13
cpu MHz		: 2161.400
cache size	: 1024 KB
physical id	: 0
siblings	: 2
core id		: 1
cpu cores	: 2
apicid		: 1
initial apicid	: 1
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 10
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm lahf_lm
bogomips	: 4324.59
clflush size	: 64
cache_alignment	: 64
address sizes	: 36 bits physical, 48 bits virtual
power management:

My distro was installed from a Debian squeeze netinst ISO, which comes with just a bootstrap. (To be clearer: Debian sid/squeeze.)

Best regards!
Teresa e Junior

From gnu.org, here are the "official" releases:

July 31, 2010: GCC 4.5.1 has been released.
May 22, 2010: GCC 4.3.5 has been released.
April 29, 2010: GCC 4.4.4 has been released.
April 14, 2010: GCC 4.5.0 has been released.

Sid is unstable, almost experimental...
Does your repository include one of these versions?

4.3.5-4 is in testing, and 4.5.1-8 in experimental. I'll try 4.3.5-4 now to see if it helps, and then 4.5.1-8 from experimental, but without enabling the experimental repositories. There is also 4.1.2-29, which has already reached stable (maybe that would be the wisest choice).

Actually, I find sid more stable than testing!

OK, good luck :slight_smile: Keep us informed if you'd like.

Regards

I tried the 4.1.3 one, which is the most stable. Still no results... I suspect some loaded kernel module (maybe a third-party one) or some daemon...

It's difficult to debug: I left htop open while compiling, and it shows memory usage growing, but doesn't show which process is using this memory... Also, because memory fills too fast, I have no time to see what's going on. The memory owned by gcc doesn't seem that big.

I have written a live system installer, and within a few days I need to retest it. That means I'm forced to do a clean system install soon anyway. So maybe I'll get rid of this problem after that... I'm just afraid I won't!

Thanks for all your help!
Teresa e Junior
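
When memory grows but no single process looks big, one way to see where it goes is to total resident memory per command name instead of per process, since many small processes of the same program can add up. A minimal sketch, assuming procps "ps" and a POSIX awk; `rss_by_command` is a made-up helper name, and only the first word of each command name is used:

```shell
# Sum resident memory (RSS, in kB) per command name.
# Reads "RSS COMMAND" pairs on stdin and prints "TOTAL_KB COUNT COMMAND",
# largest total first.
rss_by_command() {
    awk '{ kb[$2] += $1; n[$2]++ }
         END { for (c in kb) printf "%d %d %s\n", kb[c], n[c], c }' |
        sort -rn
}

# Usage (the trailing "=" in each column name suppresses the ps header):
# ps -eo rss=,comm= | rss_by_command | head -n 5
```

The COUNT column is the interesting one here: a large number next to a compiler process name would point at too many jobs running at once rather than at one leaking process.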

hmm, ok

Can you please post your lsmod output?

But if the memory gets maxed out only when you run a compiler, it can only be the compiler.

I'll look at what your command line (the one that generates the compile command) does exactly when I get a minute (I am at work, with no Linux machine at hand). Maybe it's calling another set of programs that might be buggy.

Thanks

$ lsmod
Module                  Size  Used by
ppp_deflate             3312  0 
zlib_deflate           19179  1 ppp_deflate
bsd_comp                4663  0 
ppp_async               6347  1 
crc_ccitt               1319  1 ppp_async
ppp_generic            21927  7 ppp_deflate,bsd_comp,ppp_async
slhc                    5247  1 ppp_generic
option                 12973  1 
usb_wwan                9505  1 option
usbserial              31470  4 option,usb_wwan
nls_utf8                1005  0 
isofs                  30044  0 
usb_storage            38770  0 
parport_pc             29263  0 
ppdev                   5282  0 
lp                      7629  0 
parport                29803  3 parport_pc,ppdev,lp
sco                     7323  2 
bnep                    9305  2 
rfcomm                 31365  0 
l2cap                  35610  6 bnep,rfcomm
bluetooth              49116  6 sco,bnep,rfcomm,l2cap
ipt_REJECT              1945  1 
ipt_LOG                 4426  5 
ipt_REDIRECT             929  1 
xt_owner                 886  2 
iptable_nat             3555  1 
xt_limit                1266  7 
xt_tcpudp               1895  10 
ipt_addrtype            1483  4 
xt_state                 918  7 
ip6table_filter         1042  1 
ip6_tables             11372  1 ip6table_filter
ipv6                  264324  22 
nf_nat_irc              1168  0 
nf_conntrack_irc        3444  1 nf_nat_irc
nf_nat_ftp              1398  0 
nf_nat                 15739  4 ipt_REDIRECT,iptable_nat,nf_nat_irc,nf_nat_ftp
nf_conntrack_ipv4      10006  10 iptable_nat,nf_nat
nf_defrag_ipv4          1053  1 nf_conntrack_ipv4
nf_conntrack_ftp        5521  1 nf_nat_ftp
nf_conntrack           61885  8 iptable_nat,xt_state,nf_nat_irc,nf_conntrack_irc,nf_nat_ftp,nf_nat,nf_conntrack_ipv4,nf_conntrack_ftp
iptable_filter          1102  1 
ip_tables              10133  2 iptable_nat,iptable_filter
x_tables               15028  13 ipt_REJECT,ipt_LOG,ipt_REDIRECT,xt_owner,iptable_nat,xt_limit,xt_tcpudp,ipt_addrtype,xt_state,ip6table_filter,ip6_tables,iptable_filter,ip_tables
fuse                   61261  3 
uvcvideo               54152  0 
videodev               42308  1 uvcvideo
rtc_cmos                8151  0 
v4l1_compat            12753  2 uvcvideo,videodev
rtc_core               13327  1 rtc_cmos
joydev                  8594  0 
arc4                    1101  2 
rtc_lib                 2197  1 rtc_core
jmb38x_ms               7222  0 
ecb                     1595  2 
snd_hda_codec_conexant    28498  1 
snd_hda_codec_intelhdmi     9286  1 
snd_hda_intel          20423  2 
snd_hda_codec          78538  3 snd_hda_codec_conexant,snd_hda_codec_intelhdmi,snd_hda_intel
iwlagn                136564  0 
snd_hwdep               5200  1 snd_hda_codec
snd_pcm_oss            33067  0 
snd_mixer_oss          12856  1 snd_pcm_oss
iwlcore                92738  1 iwlagn
snd_pcm                66760  3 snd_hda_intel,snd_hda_codec,snd_pcm_oss
tpm_tis                 7338  0 
mac80211              178519  2 iwlagn,iwlcore
snd_timer              17635  1 snd_pcm
snd                    51309  12 snd_hda_codec_conexant,snd_hda_intel,snd_hda_codec,snd_hwdep,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_timer
soundcore               6331  1 snd
snd_page_alloc          6896  2 snd_hda_intel,snd_pcm
processor              27642  2 
cfg80211              138040  3 iwlagn,iwlcore,mac80211
rfkill                 15268  3 bluetooth,cfg80211
psmouse                50426  0 
evdev                   7092  21 
tpm                    13037  1 tpm_tis
tpm_bios                4705  1 tpm
memstick                7382  1 jmb38x_ms
serio_raw               3640  0 
ac                      2394  0 
battery                 8142  0 
ext4                  236068  2 
mbcache                 5918  1 ext4
jbd2                   53697  1 ext4
crc16                   1319  2 l2cap,ext4
sg                     19038  0 
i915                  262539  2 
sd_mod                 35317  4 
sr_mod                 13352  0 
cdrom                  33344  1 sr_mod
drm_kms_helper         29051  1 i915
drm                   160750  3 i915,drm_kms_helper
ahci                   18537  0 
libahci                19561  4 ahci
sdhci_pci               6046  0 
sdhci                  14882  1 sdhci_pci
libata                168958  2 ahci,libahci
tg3                   118266  0 
mmc_core               57329  1 sdhci
i2c_algo_bit            4736  1 i915
uhci_hcd               19803  0 
thermal                10464  0 
intel_agp              26929  2 i915
agpgart                29779  2 drm,intel_agp
led_class               2505  1 sdhci
ehci_hcd               34294  0 
i2c_core               20464  5 videodev,i915,drm_kms_helper,drm,i2c_algo_bit
video                  16881  1 i915
output                  1691  1 video
button                  4406  1 i915

Thank you!
Teresa e Junior

:slight_smile: A very busy server, nice :slight_smile: (ADSL/ATM, firewall... wireless... :slight_smile: )
Anyway, I love challenges. If you have enough patience, we could try another thing:
compile as root, please.
Before launching the compilation, can you start a loop in another shell/terminal that dumps ps -edf or ps aux to a file and, at the end of each iteration, does a sync, etc.?

Like this:

# Overwrite the log every second so the last snapshot survives a freeze.
while true
do
    ps -edf > /tmp/mylog.txt
    sync    # flush the file to disk
    sleep 1
done

There is another sleep-like command that takes delays shorter than a second; if you can find it, half a second would be even better :slight_smile:
That way, when the server crashes, the file should be on disk.
I know this is hardcore, but hell, that's how we discover things, and as you said, you might reinstall it anyway.
After the crash (sorry!!!) that file will contain the answer :slight_smile:
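
A variation on the loop above, as a sketch only: it appends timestamped snapshots instead of overwriting the file, and keeps just the biggest memory users so each snapshot stays small. It assumes procps "ps" (for --sort) and GNU "sleep" (which accepts fractional seconds such as 0.5); `log_snapshot` is a hypothetical helper name.

```shell
# Append one timestamped snapshot of the biggest memory users to a log file,
# then flush it to disk so it survives a hard reset.
# $1: log file path
log_snapshot() {
    {
        printf '=== %s\n' "$(date '+%H:%M:%S')"
        # procps ps: sort by resident memory, largest first
        ps -eo pid,ppid,rss,vsz,args --sort=-rss | head -n 15
    } >> "$1"
    sync
}

# Usage, in a second terminal before starting the build:
# while true; do log_snapshot /tmp/mylog.txt; sleep 0.5; done
```

After a reboot, the last "===" block in the file shows what was running just before the freeze, and the ppid column shows which parent spawned it.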

Finally, unbelievable! This is what I love about UNIX systems. You're never hopeless!

So, I went through all the steps, and the problem turned out to be a make option being passed along without its argument!

I have a script which calls debuild and gives it the `-j' option, then debuild calls dpkg-buildpackage and gives it the same option again, then dpkg-buildpackage calls make and gives it the same option...

I was using -j2 and had already tried changing it, but I ended up unsetting the number, so -j was being passed without an argument. From `man make':

"make will not limit the number of jobs that can run simultaneously". Fair enough!

Also, it is strange that calling it with -j2 was enough for it to go nuts. Without the -j option, memory usage is relatively low!
See: Gentoo Forums :: View topic - Why does my system freeze when 1/2 RAM is used?
When compiling icecat, memory usage reached 7GB!!! I still believe this is a bug in make.

EDIT: my intention was to use both CPU cores to compile faster. Does this mean that is not possible?

Thank you so much!
Teresa e Junior
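
Per the make manual quoted above, a bare -j means no job limit, while -jN caps make at N simultaneous jobs, so a wrapper script can guard against ever emitting the bare form. A minimal sketch, assuming a POSIX shell; `jobs_flag` and the `JOBS` variable are hypothetical names, not part of debuild or make:

```shell
# Print a jobs flag for make/debuild, never a bare "-j".
# $1: requested job count; falls back to a single job if it is empty,
#     not a plain number, or starts with a zero.
jobs_flag() {
    case "$1" in
        ''|*[!0-9]*|0*) printf '%s\n' "-j1" ;;
        *)              printf '%s\n' "-j$1" ;;
    esac
}

# Usage (JOBS is a hypothetical variable set by the wrapper script):
# debuild "$(jobs_flag "$JOBS")" -k"$GPGKEY"
```

With JOBS=2 this yields -j2, which is the form intended to keep two jobs running; with JOBS unset it degrades to one job instead of an unlimited number.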
