18-Mar-2012 14:25:03.209 general: error: socket: file descriptor exceeds limit (4096/4096)

I have a BIND 9.8.1-P1 caching-only DNS server running on Solaris 10. I upgraded it from 9.6.1 to 9.8.1-P1. Now I am frequently seeing the "file descriptor exceeds limit (4096/4096)" error on the server.

Please help me with this issue!

The per-process limit for descriptors in your case is 4096.
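
Before changing anything, it can help to see how often named actually hits the limit. Here is a rough sketch that pulls the two numbers in parentheses out of every such error line. The function name and the log path in the comment are made up for illustration; use wherever your named logging actually goes.

```shell
#!/bin/sh
# Read a named log on stdin and print the two numbers from every
# "file descriptor exceeds limit (N/M)" line, one pair per line.
# fd_limit_errors is a hypothetical helper name, not part of BIND.
fd_limit_errors() {
    sed -n 's|.*file descriptor exceeds limit (\([0-9]*\)/\([0-9]*\)).*|\1 \2|p'
}

# Example (hypothetical log path), counting how often each pair appears:
#   fd_limit_errors < /var/log/named.log | sort | uniq -c
```

If the count keeps climbing after a restart, that is a hint that something is holding sockets open rather than a one-off spike.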

Read the paragraphs below BEFORE you do this!

Edit /etc/system:

set rlim_fd_max = 8192
set rlim_fd_cur = 8192

Then reboot.
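
After the reboot you can check what a fresh shell gets, as a quick sanity test. This is only a sketch: it shows the limits of the shell you run it in, and I am assuming named inherits the same system defaults, which may not hold if your startup method sets its own limits. The function name is made up.

```shell
#!/bin/sh
# Print the soft and hard open-file limits of the current shell.
# show_fd_limits is a hypothetical helper name.
show_fd_limits() {
    printf 'soft=%s hard=%s\n' "$(ulimit -Sn)" "$(ulimit -Hn)"
}

show_fd_limits
```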

Since things worked well before, the likelihood of the open file descriptors limit being the root cause of your problem is small. So, changing it is like using a bandage to cure cancer. It may work for a while, but it is probably not fixing the problem.

I suspect that the root cause is the settings you maintain for /dev/tcp with the ndd command. Diagnosing problems with these is somewhat touchy-feely: you can make a change and see no noticeable effect, or a big (positive or negative) change.

Usually, heavy-duty socket apps require ndd tuning. Here are some settings we have used on systems with loads of TCP/IP traffic. This does not mean they are a perfect choice; they are not. I would experiment with these on a running system, but check your current values first. You may also want to search for guides on tuning TCP on Solaris to see other values.

Also read your documentation to see if there are recommended settings. The parameters named tcp_rexmit_* come to mind here. Some parameters have to be set on clients as well.
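
To check your current values first, something like the following sketch records them so you can put them back later. The function name and the NDD variable are my own invention; the variable is only there so the tool path can be overridden. It reads each value with the plain "ndd driver parameter" form.

```shell
#!/bin/sh
# Path to ndd; overridable, defaults to the path used in this post.
NDD=${NDD:-/usr/sbin/ndd}

# Print "name=value" for each /dev/tcp parameter given as an argument.
# show_tcp_params is a hypothetical helper name.
show_tcp_params() {
    for p in "$@"; do
        printf '%s=%s\n' "$p" "$($NDD /dev/tcp "$p")"
    done
}

# Example, saving the output somewhere safe before changing anything:
#   show_tcp_params tcp_keepalive_interval tcp_time_wait_interval \
#       tcp_conn_req_max_q tcp_conn_req_max_q0 tcp_max_buf
```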

These are the items we set, and the values we currently use:

/usr/sbin/ndd -set /dev/tcp tcp_keepalive_interval 900000
/usr/sbin/ndd -set /dev/tcp tcp_time_wait_interval 60000
/usr/sbin/ndd -set /dev/tcp tcp_fin_wait_2_flush_interval 67500
/usr/sbin/ndd -set /dev/tcp tcp_conn_req_max_q 65536
/usr/sbin/ndd -set /dev/tcp tcp_conn_req_max_q0 65536
/usr/sbin/ndd -set /dev/tcp tcp_xmit_hiwat 65536
/usr/sbin/ndd -set /dev/tcp tcp_recv_hiwat 65536
/usr/sbin/ndd -set /dev/tcp tcp_max_buf 655360