Socket in LISTEN state disappears/closes automatically

Hi,

I am using Solaris 10.

I opened a server socket on a port from Java on Solaris 10. The port went to the LISTEN state, clients were able to connect, and the new connections went to the ESTABLISHED state.

If I run "netstat -an | grep <<portnumber>>", I can see the port in LISTEN state along with the ESTABLISHED connections. But after a few days or weeks, the LISTEN entry suddenly disappeared and I can no longer create new socket connections, although the existing ESTABLISHED connections are still there.

I would like to know why the port in LISTEN state disappeared on its own. Also, are there logs in Solaris that I can check to debug this?

Is the server still running?

Yes. The server is running and its other ports are in LISTEN state. Only one port that was in LISTEN state has disappeared (whereas the already ESTABLISHED socket connections on this port still exist).

You meant ESTABLISHED, I suppose?

That's strange. Perhaps a netstat bug on Solaris?

HTH, Loïc

It does not seem to be a netstat bug. The port left the LISTEN state and no further connections can be created on that port.

Where do logs related to socket connections and the LISTEN state go in Solaris 10?

Solaris doesn't clean up ports in LISTEN state; they can stay forever. You should focus on the application logs instead.

Where are socket-related logs stored in Solaris?

We haven't changed any code in this area of our application recently, and we didn't find anything related to this problem in the application logs.

Port activity isn't logged, that would be overkill. You might have a look at the system logs (dmesg) for error events, but I doubt it is really system related. You should connect your application to a debugger and see what happens to your listening socket.

If the server is actually still listening, a pstack run against it should show a thread blocked in accept().
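Since the server is written in Java, the same check can be approximated from inside the JVM by scanning the thread stacks for a frame in java.net.ServerSocket.accept. This is only a rough sketch: AcceptThreadCheck and anyThreadInAccept are made-up names, and it assumes the server listens through a plain java.net.ServerSocket rather than a ServerSocketChannel. It also cannot tell which port the thread is listening on.

```java
import java.util.Map;

/** Hypothetical helper; the class and method names are illustrative only. */
class AcceptThreadCheck {

    /**
     * Returns true if any live thread in this JVM currently has
     * java.net.ServerSocket.accept somewhere on its stack.
     */
    static boolean anyThreadInAccept() {
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            for (StackTraceElement frame : entry.getValue()) {
                if ("java.net.ServerSocket".equals(frame.getClassName())
                        && "accept".equals(frame.getMethodName())) {
                    System.err.println("Thread '" + entry.getKey().getName()
                            + "' is in accept()");
                    return true;
                }
            }
        }
        return false;
    }
}
```

If this returns false while the server is supposed to be listening, the accept thread has most likely exited, which would match a missing LISTEN entry in netstat.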

IME the most likely cause is a thread calling close() on the wrong file descriptor and killing the socket. You can run truss against the server and watch for that.
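On the Java side, one way to find out who closes the listening socket is to use a small subclass that records a stack trace when close() is called. This is a sketch under assumptions: TracingServerSocket is a made-up class, and it only sees a close() made through this Java object. A close of the underlying file descriptor from native code would not show up here, so truss is still the tool for that case.

```java
import java.io.IOException;
import java.net.ServerSocket;

/** Hypothetical debugging subclass; not part of any library. */
class TracingServerSocket extends ServerSocket {
    // Stack of the first close() call, or null if close() was never called.
    private volatile Throwable closedBy;

    TracingServerSocket(int port) throws IOException {
        super(port);
    }

    @Override
    public void close() throws IOException {
        if (closedBy == null) {
            // Capture and print the caller's stack so the culprit is visible.
            closedBy = new Throwable("close() called on listening socket, port "
                    + getLocalPort() + ", by thread "
                    + Thread.currentThread().getName());
            closedBy.printStackTrace();
        }
        super.close();
    }

    Throwable closedBy() {
        return closedBy;
    }
}
```

Swapping this in for the ServerSocket that the server creates leaves the rest of the code unchanged, and the printed stack shows which thread and which code path closed it.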

And are you checking the value that you get from accept() to make sure it's not an error? And when you do get an error, do you log it somewhere?

Because if the socket gets closed out from under accept(), it should return with an error.
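In Java the error from accept() arrives as an exception, so the point above comes down to never leaving the catch block empty. A minimal sketch of an accept loop that logs the failure, assuming a plain java.net.ServerSocket (LoggingAcceptLoop, handle and lastFailure are hypothetical names, not taken from the poster's application):

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical accept loop; class and method names are illustrative only. */
class LoggingAcceptLoop implements Runnable {
    private static final Logger LOG =
            Logger.getLogger(LoggingAcceptLoop.class.getName());

    private final ServerSocket listener;
    private volatile Throwable lastFailure; // last exception thrown by accept()

    LoggingAcceptLoop(ServerSocket listener) {
        this.listener = listener;
    }

    Throwable lastFailure() {
        return lastFailure;
    }

    @Override
    public void run() {
        while (true) {
            Socket client;
            try {
                client = listener.accept();
            } catch (IOException e) {
                lastFailure = e;
                if (listener.isClosed()) {
                    // The listening socket is gone: say so loudly, then stop.
                    LOG.log(Level.SEVERE, "Listening socket on port "
                            + listener.getLocalPort()
                            + " was closed; accept loop exiting", e);
                    return;
                }
                LOG.log(Level.WARNING, "accept() failed, retrying", e);
                try {
                    Thread.sleep(100); // brief pause to avoid a tight error loop
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    return;
                }
                continue;
            }
            try {
                handle(client);
            } catch (IOException e) {
                LOG.log(Level.WARNING, "Error while handling client", e);
            }
        }
    }

    /** Placeholder for the real per-connection work; here it just closes. */
    void handle(Socket client) throws IOException {
        client.close();
    }
}
```

With something like this in place, the next time the LISTEN entry vanishes the application log should contain the exception and the time it happened.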

Thanks. The thread does not appear in the pstack dump. Unfortunately, we didn't log anything in the catch block (the error case).

Are there any Solaris logs that would tell me why the port in LISTEN state got killed or closed?

Why not start by correcting this issue first?

No.