SIGHUP killing Oracle Listener Process

jimthompson · February 8, 2012, 7:01am

I have a cold backup script which backs up my database and then restarts the Oracle listener and database at around 01:30.
I can see at this time that my database and listener are indeed running. However, at around 02:17 my listener process receives a SIGHUP (signal 1) from the AIX OS (version 5.3) that kills the listener process (the background instance processes of the database remain unaffected; it is only the listener process that gets killed). To compensate for this, I restart the listener via cron at 04:30. It starts fine and remains up.

I have a truss log of the database listener process from when it gets killed, and it shows the following at the end of the log file:

getsockopt(33, 65535, 4104, 0x00000001105985F4, 0x00000001105985F0) = 0
connext(33, 0x00000001105F85E8, 16) Err#79 ECONNREFUSED
shutdown(33, 2) Err#76 ENOTCONN
close(33) = 0
_nsleep(0x00000001105985E0, 0x00000001105986B0) Err#4 EINTR
Received signal #1, SIGHUP [default]
*** process killed ***
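
For readers unfamiliar with truss output: the `[default]` tag on the SIGHUP line suggests the process had no handler installed for that signal, and the default action for SIGHUP is to terminate the process. The following is only a minimal sketch of that behaviour on a generic POSIX system, not a reproduction of the listener problem and not a diagnosis of what sends the signal. The child script and the `run_child` helper are hypothetical names invented for this illustration.

```python
import signal
import subprocess
import sys

# Hypothetical child program: optionally ignores SIGHUP, then sleeps.
CHILD = (
    "import signal, sys, time\n"
    "if sys.argv[1] == 'ignore':\n"
    "    signal.signal(signal.SIGHUP, signal.SIG_IGN)\n"
    "print('ready', flush=True)\n"
    "time.sleep(30)\n"
)


def run_child(mode):
    """Start the child, send it SIGHUP, and report what happened to it."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD, mode], stdout=subprocess.PIPE
    )
    proc.stdout.readline()           # wait until the child has set up its signal state
    proc.send_signal(signal.SIGHUP)  # same signal number (1) as in the truss log
    try:
        rc = proc.wait(timeout=2)
    except subprocess.TimeoutExpired:
        # Child is still alive after SIGHUP: clean it up ourselves.
        proc.kill()
        proc.wait()
        proc.stdout.close()
        return "survived"
    proc.stdout.close()
    # A negative return code from Popen means "terminated by that signal".
    return "killed by signal %d" % -rc


if __name__ == "__main__":
    print("default disposition:", run_child("default"))
    print("SIGHUP ignored:     ", run_child("ignore"))
```

The sketch only shows why a process with the default SIGHUP disposition dies when the signal arrives, while one that ignores it keeps running. It says nothing about which process or subsystem delivered the signal at 02:17.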

I have looked up the AIX errors being reported:

Error 79 in AIX is - connection refused
Error 76 in AIX is - socket is not connected
Error 4 in AIX is - interrupted system call
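
As a small aid for reading longer truss logs, the failed calls and the signal line can be pulled out mechanically. This is a rough sketch that assumes the log lines have exactly the shape shown in the excerpt above; the regular expressions and the `summarize` function are illustrative names, not part of truss or any Oracle tool.

```python
import re

# Tail of the truss log exactly as quoted in the post.
TRUSS_TAIL = """\
getsockopt(33, 65535, 4104, 0x00000001105985F4, 0x00000001105985F0) = 0
connext(33, 0x00000001105F85E8, 16) Err#79 ECONNREFUSED
shutdown(33, 2) Err#76 ENOTCONN
close(33) = 0
_nsleep(0x00000001105985E0, 0x00000001105986B0) Err#4 EINTR
Received signal #1, SIGHUP [default]
"""

# Assumed line shapes: "call(args) Err#N NAME" and "Received signal #N, NAME ..."
FAILED_CALL = re.compile(r"^(\w+)\(.*\)\s+Err#(\d+)\s+(\w+)$")
SIGNAL_LINE = re.compile(r"^Received signal #(\d+), (\w+)")


def summarize(text):
    """Return (failed system calls, received signals) found in truss output."""
    failures, signals = [], []
    for line in text.splitlines():
        line = line.strip()
        m = FAILED_CALL.match(line)
        if m:
            failures.append((m.group(1), int(m.group(2)), m.group(3)))
            continue
        m = SIGNAL_LINE.match(line)
        if m:
            signals.append((int(m.group(1)), m.group(2)))
    return failures, signals


if __name__ == "__main__":
    failures, signals = summarize(TRUSS_TAIL)
    for call, number, name in failures:
        print(f"{call}: errno {number} ({name})")
    for number, name in signals:
        print(f"signal {number} ({name})")
```

Note that the errno numbers are taken from the log itself; they are AIX values and differ on other platforms, so the sketch deliberately does not look them up in the local system's errno table.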

The unusual thing is that the script I use at the end of the backup to start the listener is the exact same script I use in the 04:30 cron job to restart it. Also, the backup script has not changed in months, yet this problem has only started recently.

I am not sure why the listener is not affected if I leave it up, yet if I start it at the end of the cold backup it only seems to survive for about 45 minutes before it gets killed. The AIX errors appear to be socket related, but I am not sure why that would only affect the listener process when it is newly started rather than when it has been up all day.

Any ideas on how to find out what is killing my Oracle listener process, and why only under quite specific circumstances (i.e. only when started in the early hours of the morning)?

Jim

cero · February 8, 2012, 8:21am

Did you check if there are any hints in the listener log (usually located in $ORACLE_HOME/network/log)? Did the instance register after the cold backup?

jimthompson · February 8, 2012, 11:53am

There were no errors in the listener log at all (it doesn't get a chance to write any, because the tnslsnr process simply gets killed). I also checked the database alert log and there is nothing in there either. The only evidence of the kill appears to be the contents of the truss log that I took on the tnslsnr process itself. Oracle Support have reviewed this and believe it is definitely an AIX issue.

This listener is explicitly defined via a listener.ora file, as opposed to the instance registering automatically via its pmon process.