I have a cold backup script that backs up my database and then restarts the Oracle listener and the database at around 01:30.
At that time I can see that both the database and the listener are indeed running. However, at around 02:17 the listener process receives a SIGHUP (signal 1) from the AIX OS (version 5.3), which kills it. The background instance processes of the database remain unaffected; only the listener process gets killed. To compensate, I restart the listener via cron at 04:30. It starts fine and stays up.
I have a truss log of the listener process from when it gets killed. The end of the log shows the following:
getsockopt(33, 65535, 4104, 0x00000001105985F4, 0x00000001105985F0) = 0
connext(33, 0x00000001105F85E8, 16) Err#79 ECONNREFUSED
shutdown(33, 2) Err#76 ENOTCONN
close(33) = 0
_nsleep(0x00000001105985E0, 0x00000001105986B0) Err#4 EINTR
Received signal #1, SIGHUP [default]
*** process killed ***
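As I read it, the `[default]` on the SIGHUP line means the listener had the default disposition for that signal, and the default action for SIGHUP is to terminate the process. Here is a minimal Python sketch of that behaviour, just to illustrate the point; it has nothing to do with Oracle itself, and `hup_outcome` and the child program are hypothetical names made up for the illustration. It starts a child process, sends it SIGHUP, and reports whether the child died or survived depending on whether it ignores the signal.

```python
import os
import signal
import subprocess
import sys

# Hypothetical stand-in for a daemon: set the SIGHUP disposition
# explicitly, announce readiness, then sit idle.
CHILD = (
    "import signal, sys, time\n"
    "ignore = sys.argv[1] == 'ignore'\n"
    "signal.signal(signal.SIGHUP, signal.SIG_IGN if ignore else signal.SIG_DFL)\n"
    "print('ready', flush=True)\n"
    "time.sleep(30)\n"
)


def hup_outcome(mode):
    """Start a child, send it SIGHUP, and report what happened to it."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD, mode], stdout=subprocess.PIPE
    )
    proc.stdout.readline()  # wait until the child has set its disposition
    os.kill(proc.pid, signal.SIGHUP)
    try:
        proc.wait(timeout=2)
    except subprocess.TimeoutExpired:
        # Still running after SIGHUP: the signal was ignored. Clean up.
        proc.kill()
        proc.wait()
        proc.stdout.close()
        return "survived"
    proc.stdout.close()
    # A negative return code means the child was terminated by that signal.
    if proc.returncode == -signal.SIGHUP:
        return "killed"
    return f"exit {proc.returncode}"


if __name__ == "__main__":
    for mode in ("default", "ignore"):
        print(mode, "->", hup_outcome(mode))
```

If that reading is right, the open question is who sends the SIGHUP at 02:17, not why the listener dies once it arrives.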
I have looked up the AIX errors being reported:
Error 79 in AIX is - connection refused
Error 76 in AIX is - socket is not connected
Error 4 in AIX is - interrupted system call
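One caveat for anyone comparing against another system: the numbers 79 and 76 are the AIX values shown in the truss log, and errno numbers differ between platforms, so it is safer to go by the symbolic names. A small Python sketch (the helper name `describe` is just for illustration) that prints the number and message for those three names on whatever machine it runs on:

```python
import errno
import os


def describe(name):
    """Return (number, message) for a symbolic errno name on this platform."""
    number = getattr(errno, name)
    return number, os.strerror(number)


if __name__ == "__main__":
    # The three error names that appear at the end of the truss log.
    for name in ("ECONNREFUSED", "ENOTCONN", "EINTR"):
        number, message = describe(name)
        print(f"{name}: {number} ({message})")
```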
The unusual thing is that the script I use at the end of the backup to start the listener is exactly the same script I use in the 04:30 cron job to restart it. Also, the backup script has not changed in months, yet this problem started only recently.
I am not sure why a listener that is left up is not affected, yet one started at the end of the cold backup survives only about 45 minutes before it gets killed. The AIX errors appear to be socket related, but I do not understand why they would affect the listener only when it is newly started and not when it has been up all day.
Any ideas on how to find out what is killing my Oracle listener process, and why it happens only under quite specific circumstances (i.e. only when the listener is started in the early hours of the morning)?
Jim