Server Hanging frequently - urgent help required

micku_100 · November 28, 2007, 10:10pm

hi,

I am facing a severe problem on my HP-UX 11.11.0 server, which has been hanging for the last 3 or 4 days.

It is giving an "insufficient memory" error. This problem was not there earlier.
We are running our ERP and Informix on the same server. There has been no heavy or sudden increase in data.

When I checked the output of "kmeminfo" it was as follows:
# ./kmeminfo
tool: kmeminfo 7.18 - libp4 9.295 - HP CONFIDENTIAL
unix: /stand/vmunix 11.11 64bit PA2.0 on host "bslddr"
core: /dev/kmem live
link: Fri Aug 5 18:18:29 IST 2005
boot: Wed Nov 28 20:59:15 2007
time: Thu Nov 29 08:13:50 2007
nbpg: 4096 bytes

----------------------------------------------------------------------
Physical memory usage summary (in page/byte/percent):

Physical memory       =  1048064    4.0g 100%
Free memory           =    13785   53.8m   1%
User processes        =   364211    1.4g  35%  details with -user
System                =   666286    2.5g  64%
  Kernel              =   177654  694.0m  17%  kernel text and data
    Dynamic Arenas    =    74176  289.8m   7%  details with -arena
      M_TEMP          =    57616  225.1m   5%
      M_SPINLOCK      =     4328   16.9m   0%
      M_SWAP          =     2080    8.1m   0%
      ALLOCB_MBLK_LM  =     1467    5.7m   0%
      M_VXVM          =     1367    5.3m   0%
      Other arenas    =     7318   28.6m   1%  details with -arena
    Super page pool   =        8   32.0k   0%  details with -kas
    Static Tables     =    91602  357.8m   9%  details with -static
      nbuf            =    46576  181.9m   4%  bufcache headers
      pfdat           =    23921   93.4m   2%
      htbl2_0         =     8192   32.0m   1%
      pfn_to_virt     =     3986   15.6m   0%
      text            =     2719   10.6m   0%  vmunix text section
      Other tables    =     6206   24.2m   1%  details with -static
  Buffer cache        =   488632    1.9g  47%  details with -bufcache
    UFC meta mrg      =        0     0.0   0%
    UFC file mrg      =        0     0.0   0%

The buffer cache is using a very large amount of memory (1.9 GB, 47% of physical memory).

Please help me out.

Regards,
Nilesh

denn · November 29, 2007, 12:52pm

Run swapinfo, and possibly bdf -l.

Cameron · November 29, 2007, 11:32pm

I suggest the following:

As user 'root':
# grep EMS /var/adm/syslog/syslog.log > /tmp/EMS-rpt`date '+%y%m%d'`.txt

This will generate a text report file containing all the EMS alerts logged on your system.
Each EMS alert includes instructions for retrieving the details of that alert.

Might be worthwhile checking out?

Cheers,
Cameron

prowla · November 30, 2007, 12:51am

It seems to me that your system is probably swapping, which will absolutely kill performance.

Your dynamic buffer cache has grown to 47% of memory (i.e. nearly half of your memory is being used to speed up file I/O, but it's strangling the system).

You can tune down the buffer cache by setting the kernel parameters dbc_min_pct and dbc_max_pct to lower values (5 and 10 might be appropriate), but this requires a kernel rebuild and a reboot.
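
To put numbers on that, here is a rough sketch that works out what the suggested 5 and 10 percent settings would mean on this box. The page counts are copied from the kmeminfo output above; the function names (pages_to_mb, pct_of_phys, cap_mb) are made up for illustration and are not HP-UX tools. It only does arithmetic and changes nothing on the system.

```shell
#!/bin/sh
# Figures copied from the kmeminfo output posted in this thread.
PAGE_SIZE=4096          # nbpg, bytes per page
PHYS_PAGES=1048064      # "Physical memory" in pages
BUFCACHE_PAGES=488632   # "Buffer cache" in pages

# pages_to_mb PAGES: convert a page count to megabytes (1 MB = 1048576 bytes).
pages_to_mb() {
    awk -v p="$1" -v sz="$PAGE_SIZE" \
        'BEGIN { printf "%.1f\n", p * sz / 1048576 }'
}

# pct_of_phys PAGES: share of physical memory, in percent.
pct_of_phys() {
    awk -v p="$1" -v t="$PHYS_PAGES" \
        'BEGIN { printf "%.1f\n", 100 * p / t }'
}

# cap_mb PCT: size in MB that PCT percent of physical memory corresponds to.
cap_mb() {
    awk -v pct="$1" -v t="$PHYS_PAGES" -v sz="$PAGE_SIZE" \
        'BEGIN { printf "%.1f\n", t * pct / 100 * sz / 1048576 }'
}

echo "buffer cache now : $(pages_to_mb "$BUFCACHE_PAGES") MB ($(pct_of_phys "$BUFCACHE_PAGES")% of RAM)"
echo "dbc_min_pct = 5  : $(cap_mb 5) MB"
echo "dbc_max_pct = 10 : $(cap_mb 10) MB"
```

With these figures the cache currently holds about 1908.7 MB (46.6% of RAM), while a dbc_max_pct of 10 would cap it at about 409.4 MB and a dbc_min_pct of 5 would put the floor at about 204.7 MB. That is roughly 1.5 GB less than the cache holds now, assuming the values above are still representative when you retune.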

I don't know whether there has been a hardware problem with the memory; if there has, the EMS alerts mentioned above should tell you. Check root's email and the syslog for EMS alerts.

Of course, you could also check whether anything has been changed (I recall an Informix DBA "tuning" the database to use all the memory and not leaving enough for anything else to run!).