zombie

Hi,
Linux redhat 5.5

top shows that I have 20 zombie processes:

Tasks: 357 total,   1 running, 336 sleeping,   0 stopped,  20 zombie
Cpu(s):  0.2%us,  0.3%sy,  0.0%ni, 99.5%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  24949400k total,  2363052k used, 22586348k free,   227084k buffers
Swap: 16383992k total,        0k used, 16383992k free,   453804k cached

It's a 24x7 server and I can't take it down.
How can I "kill" those 20 zombies?

Thanks

Zombie processes, sometimes called defunct processes, are processes that have terminated but whose parent process has not yet collected their exit status. It's not unusual to see a few zombies for a few seconds if the parent process is busy and has not yet issued a wait() system call to collect the status of its child or children. However, if the zombies persist (their PIDs are always the same), this is an indication that the parent application is either poorly coded or wedged (looping). Given that your CPU usage is low, I'm guessing the parent(s) aren't wedged.
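To make the wait() point concrete, here is a minimal Python sketch, assuming a Linux system with /proc mounted. The function names (process_state, demo_zombie) are made up for this illustration and are not part of any tool mentioned above. The child exits at once, the parent deliberately delays its wait, and the child shows up in state Z until the parent collects its status.

```python
import os
import time


def process_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, or None if the PID is gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            data = f.read()
    except (FileNotFoundError, ProcessLookupError):
        return None
    # The command name sits in parentheses and may contain spaces,
    # so split after the last ')' to reach the state field.
    return data.rsplit(")", 1)[1].split()[0]


def demo_zombie():
    """Fork a child that exits at once; report its state before and after wait."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: terminate immediately

    # Parent: deliberately do not wait yet, so the child lingers as a zombie.
    deadline = time.time() + 5
    state = process_state(pid)
    while state != "Z" and time.time() < deadline:
        time.sleep(0.01)
        state = process_state(pid)
    before = state

    os.waitpid(pid, 0)  # collect the exit status; the kernel frees the slot
    after = process_state(pid)
    return before, after


if __name__ == "__main__":
    print(demo_zombie())
```

On a typical Linux box this should report the child as Z before the waitpid call and as gone afterwards, which is all a well-behaved parent needs to do to avoid leaving zombies behind.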

Zombie processes cannot be killed, as you've likely found out: they have already terminated, so there is nothing left for a signal to act on. The good news is that the only system resource they hold is a slot in the process table; all other real resources (memory, sockets, open files, etc.) were closed or released when the process ended. Zombies only become a problem when their numbers start to "clog" the process table, which has a finite size.
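If you want to convince yourself that signals do nothing to a zombie, the following Python sketch (again assuming Linux with /proc; state_of and kill_zombie_demo are hypothetical names chosen for this example) creates a short-lived zombie, sends it SIGKILL, and checks its state again before finally reaping it.

```python
import os
import signal
import time


def state_of(pid):
    """One-letter process state from /proc/<pid>/stat, or None if the PID is gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            return f.read().rsplit(")", 1)[1].split()[0]
    except (FileNotFoundError, ProcessLookupError):
        return None


def kill_zombie_demo():
    """Return the zombie's state after SIGKILL, and its state after being reaped."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child exits and becomes a zombie until the parent waits

    # Give the child a moment to reach the zombie state.
    deadline = time.time() + 5
    while state_of(pid) != "Z" and time.time() < deadline:
        time.sleep(0.01)

    os.kill(pid, signal.SIGKILL)  # the call is accepted, but nothing is left to terminate
    time.sleep(0.1)
    after_kill = state_of(pid)

    os.waitpid(pid, 0)  # only the parent's wait removes the entry
    after_wait = state_of(pid)
    return after_kill, after_wait


if __name__ == "__main__":
    print(kill_zombie_demo())
```

The state should still read Z after the SIGKILL and the entry should disappear only after the waitpid call, which is why the fix has to come from the parent side.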

If you have the option to stop and restart the parent, then your zombie processes will be cleaned up along with it: when the parent exits, its zombie children are inherited by init, which collects their status. If the parent is some service that you must keep running in order to prevent downtime, then you're stuck.

If you want to see which process(es) own the zombies, capture the output of ps -elf (or ps -ajx on FreeBSD) and look for the zombies, which show state Z and are usually marked <defunct>. Each zombie's line lists its parent process ID (PPID); find the process whose PID equals that PPID and you have found its parent. In ps -elf output the PID is column 4 and the PPID is column 5; in ps -ef or FreeBSD ps -ajx output they are columns 2 and 3.

This ps -elf output shows the process a.out (PID 31578) with a defunct/zombie child process (PID 31579, PPID 31578):

0 S scooter  31578  4537  0  84   4 -   406 -      15:29 pts/3    00:00:00 a.out
1 Z scooter  31579 31578  0  84   4 -     0 exit   15:29 pts/3    00:00:00 [a.out] <defunct>
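If you would rather not pick through ps output by eye, the same lookup can be scripted. This is only a rough Python sketch that reads /proc directly, so it assumes Linux; read_stat and find_zombies are names invented for this example. It lists every process in state Z together with the PID and command name of its parent.

```python
import os


def read_stat(pid):
    """Return (comm, state, ppid) for a PID, or None if it vanished or is unreadable."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            data = f.read()
    except (FileNotFoundError, ProcessLookupError, PermissionError):
        return None
    # Format: pid (comm) state ppid ...  where comm may contain spaces or parentheses.
    head, tail = data.rsplit(")", 1)
    comm = head.split("(", 1)[1]
    fields = tail.split()
    return comm, fields[0], int(fields[1])


def find_zombies():
    """Return a list of (pid, comm, ppid, parent_comm) for every zombie process."""
    zombies = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        info = read_stat(int(entry))
        if info and info[1] == "Z":
            comm, _, ppid = info
            parent = read_stat(ppid)
            parent_comm = parent[0] if parent else "?"
            zombies.append((int(entry), comm, ppid, parent_comm))
    return zombies


if __name__ == "__main__":
    for pid, comm, ppid, parent_comm in find_zombies():
        print(f"zombie {pid} ({comm}) <- parent {ppid} ({parent_comm})")
```

If all 20 zombies turn out to share one PPID, that single parent is the process to investigate or restart.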