Pend Low Mbufs

Help... my RS/6000 7043-140 keeps hanging every day.

This is the output of "errpt -a":

LABEL: PEND_LOW_MBUFS
IDENTIFIER: E9B4EB4B

Date/Time: Tue Jul 29 12:16:02
Sequence Number: 1183
Machine Id: 006066844C00
Node Id: mgmtoracle
Class: S
Type: PEND
Resource Name: SYSNET

Description
ALMOST OUT OF RESOURCES

Probable Causes
TRANSMIT/RECEIVE BUFFERS

Failure Causes
ALMOST OUT OF COMMUNICATIONS MEMORY BUFFERS (MBUFS)

    Recommended Actions 
    INCREASE THE NUMBER OF BUFFERS IN THE POOL 

Detail Data
PERCENTAGE OF MBUFS CURRENTLY IN USE:
14

What should I do?

Thanks..

What is an mbuf?
The network subsystem uses a memory management facility that revolves around a data structure called an mbuf. Mbufs are mostly used to store data for incoming and outbound network traffic. Having mbuf pools of the right size can have a very positive effect on network performance. If the mbuf pools are configured improperly, both network and system performance can suffer. The AIX operating system offers the capability for run-time mbuf pool configuration. With this convenience comes the responsibility for knowing when the pools need adjusting and how much they should be adjusted.
The mbuf management facility controls two pools of buffers: a pool of small buffers (256 bytes each), which are simply called mbufs, and a pool of large buffers (4096 bytes each), which are usually called mbuf clusters or just clusters. The pools are created from system memory by making an allocation request to the Virtual Memory Manager (VMM). The pools consist of pinned pieces of virtual memory; this means that they always reside in physical memory and are never paged out. The result is that the real memory available for paging in application programs and data has been decreased by the amount that the mbuf pools have been increased. This is a nontrivial cost that must always be taken into account when considering an increase in the size of the mbuf pools.
The initial size of the mbuf pools is system-dependent. There is a minimum number of (small) mbufs and clusters allocated for each system, but these minimums are increased by an amount that depends on the specific system configuration. One factor affecting how much they are increased is the number of communications adapters in the system. The default pool sizes are initially configured to handle small- to medium-size network loads (network traffic of 100-500 packets/second). The pool sizes dynamically increase as network loads increase. The cluster pool shrinks if network loads decrease (the mbuf pool is never reduced). To optimize network performance, the administrator should balance mbuf pool sizes with network loads (packets/second). If the network load is particularly oriented towards UDP traffic (as it would be on an NFS server, for example), the size of the mbuf pool should be two times the packets/second rate.
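As a quick worked example of that guideline: an NFS server carrying a sustained 1000 packets/second of UDP traffic would call for an mbuf pool of roughly 2 x 1000 = 2000 buffers. Before changing anything, check the current ceiling and usage; this is a minimal sketch using the standard tunables (the grep pattern matches the netstat -m summary line shown later in this post):

    no -o thewall                          # current mbuf pool ceiling, in KB
    netstat -m | grep "Kbytes allocated"   # how much of that ceiling is in use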

1. To change the maximum size of the mbuf pool to 3 MB, enter (thewall is given in KB):
   no -o thewall=3072
2. To reset the maximum size of the mbuf pool to its default size, enter:
   no -d thewall
3. To change the default socket buffer sizes on your system, add the following lines to the end of the /etc/rc.net file:
   /usr/sbin/no -o tcp_sendspace=16384
   /usr/sbin/no -o udp_recvspace=16384
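Note that on this AIX level, values set with no take effect immediately but are not preserved across a reboot, which is why the permanent socket-buffer settings above go into /etc/rc.net. As a quick sketch of how to confirm the active values afterwards:

    /usr/sbin/no -o thewall
    /usr/sbin/no -o tcp_sendspace
    /usr/sbin/no -o udp_recvspace

Each invocation of no with -o and no value simply prints the current setting.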

The pool sizes dynamically increase as network loads increase, and the cluster pool shrinks if network loads decrease. If you don't have sufficient memory and your network load increases, the system will eventually run out of mbufs and hang. So in your case, to solve the problem you should either increase the physical memory or increase the mbuf pool size (compare this with the recommended action in your errpt output).
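Once you have raised the limits, confirm that the error stops recurring. The errpt command can filter on the error label directly with its -J flag, for example:

    errpt -J PEND_LOW_MBUFS

which lists only the PEND_LOW_MBUFS entries (add -a for the full detail, as in your original output), so any new occurrence after the tuning change is easy to spot.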

For more information, use the netstat command; it gives a rough idea of the network load in packets/second. For example:
netstat -I tr0 5
reports the input and output traffic every 5 seconds, both for the tr0 adapter and for all LAN adapters on the system. The output below (run here with a 2-second interval) shows the activity caused by a large ftp operation:

$ netstat -I tr0 2
    input    (tr0)     output           input   (Total)     output
 packets  errs  packets  errs colls  packets  errs  packets  errs colls
   20615   227     3345     0     0    20905   227     3635     0     0
      17     0        1     0     0       17     0        1     0     0
     174     0      320     0     0      174     0      320     0     0
     248     0      443     0     0      248     0      443     0     0
     210     0      404     0     0      210     0      404     0     0
     239     0      461     0     0      239     0      461     0     0
     253     1      454     0     0      253     1      454     0     0
     246     0      467     0     0      246     0      467     0     0
      99     1      145     0     0       99     1      145     0     0
      13     0        1     0     0       13     0        1     0     0
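To turn interval samples like these into the packets/second figure that the sizing guideline above needs, average the per-interval counts and divide by the interval length. Here is a rough sketch (it assumes the 5-second interval, skips the two header lines and the first sample, which is cumulative since boot, and counts tr0 input plus output packets over 10 samples):

    netstat -I tr0 5 | awk 'NR > 3 { n++; sum += $1 + $3 }
        n == 10 { printf "%.0f packets/sec\n", sum / (10 * 5); exit }'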
The netstat command also has a flag, -m, that gives detailed information about the use and availability of the mbufs and clusters:

253 mbufs in use:
        50 mbufs allocated to data
        1 mbufs allocated to packet headers
        76 mbufs allocated to socket structures
        100 mbufs allocated to protocol control blocks
        10 mbufs allocated to routing table entries
        14 mbufs allocated to socket names and addresses
        2 mbufs allocated to interface addresses
16/64 mapped pages in use
319 Kbytes allocated to network (39% in use)
0 requests for memory denied
0 requests for memory delayed
0 calls to protocol drain routines
The line "16/64 mapped pages in use" indicates that there are 64 pinned clusters, of which 16 are currently in use.
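The counters to watch over time are the "requests for memory denied" and "requests for memory delayed" lines: nonzero and climbing values mean the pools are too small for the current load. A simple watcher, sketched as a plain shell loop:

    while true
    do
        date
        netstat -m | grep -i "requests for memory"
        sleep 60
    done

This prints the two counters once a minute, so you can correlate any growth with the times the PEND_LOW_MBUFS entries appear in the error log.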

This report can be compared to the existing system parameters by issuing a no -a command. The following lines from the report are of interest:

lowclust = 29
lowmbuf = 88
thewall = 2048
mb_cl_hiwat = 58
It is clear that on the test system, the 319 Kbytes allocated to network is considerably short of the thewall value of 2048 KB, and the 64 - 16 = 48 free clusters are short of the mb_cl_hiwat limit of 58.
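If the free pools repeatedly dip below the low-water marks under load, the same no command can raise lowclust, lowmbuf, and mb_cl_hiwat too; the usual tuning guidance is to keep mb_cl_hiwat at least twice lowclust so the cluster pool is not repeatedly expanded and drained. The values below are purely illustrative, not recommendations for this particular machine:

    no -o lowclust=45
    no -o lowmbuf=130
    no -o mb_cl_hiwat=90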

Dare AIXteam