Nagios is acting a little weird for me. I have an external script hooked into Nagios; it simply runs curl/wget against a URL and returns a status based on a string in the content. For the first 2-3 hours the script returns the right status and Nagios reports it correctly, i.e. OK, WARN or ERROR based on the script's exit code. After 2-3 hours, checks that were (and should still be) OK or WARN start returning CRITICAL, and the output line says just "Application is", not even "Application is ERROR" or "Application is FATAL".
There is nothing in the logs to suggest what the problem could be. Has anyone experienced this before, and if so, what was the corrective action? I am running Nagios on Mac OS X.
How often are you checking the service in question, i.e. what is the check interval?
Maybe the curl part is timing out and commands are getting queued up.
Also, your script is assuming that every possible output of "curl -s $URL" contains the pattern "summary" (otherwise $STATUS would be empty). Are you sure that's correct?
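A minimal sketch of how the script could guard against both cases (a slow curl and a response without "summary"). The function names, the "summary: OK" line format, the 10-second limit and the mapping of states to exit codes are all assumptions for illustration, not taken from your script:

```shell
#!/bin/sh
# Sketch only. Assumed (hypothetical) page format: a line such as "summary: OK".

# Read page content on stdin and print the last field of the first line
# containing "summary". Prints nothing if no such line exists.
parse_status() {
    awk '/summary/ { print $NF; exit }'
}

# Map a status word to a message and a Nagios exit code
# (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN).
report() {
    case "$1" in
        OK)          echo "Application is OK";   return 0 ;;
        WARN)        echo "Application is WARN"; return 1 ;;
        ERROR|FATAL) echo "Application is $1";   return 2 ;;
        "")          echo "UNKNOWN - no summary line in response"; return 3 ;;
        *)           echo "UNKNOWN - unexpected status '$1'";      return 3 ;;
    esac
}

# Fetch the URL with a hard time limit so a hung request cannot pile up,
# then report the parsed status.
check_url() {
    content=$(curl -s --max-time 10 "$1") || {
        echo "UNKNOWN - curl failed for $1"
        return 3
    }
    report "$(printf '%s\n' "$content" | parse_status)"
}
```

Ending the script with `check_url "$1"; exit $?` would turn an empty $STATUS into an explicit UNKNOWN instead of a bare "Application is", which should at least make the failure mode visible in the Nagios output.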
I switched to the "check_http" plugin (an official Nagios plugin), but it has the same problem.
Initially, when I start Nagios, the status on the Nagios web interface is the same as the one returned from the command line.
After some time the status in the web interface becomes CRITICAL, while the command line still returns the correct status of OK.
Result from the command line (and browser):
./check_http -H xyz.com -p 2222 -u /abc -t 3
OK
Result from the Nagios interface:
nodename nor servname provided, or not known
HTTP CRITICAL - Unable to open TCP socket
I think this has something to do with Nagios itself, since the box is behaving just fine according to the command-line result (and I verified the URL in the browser).
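"nodename nor servname provided, or not known" is the resolver's way of saying the hostname lookup failed, so one way to narrow this down is to record the context whenever Nagios hits it. A rough sketch of a wrapper, assuming the difference lies in the user or environment the check runs under (that is only a guess); the log path and function names are hypothetical:

```shell
#!/bin/sh
# Sketch only: wrap a plugin so name-resolution failures leave a trace.
# LOG location is an arbitrary, hypothetical choice.
LOG="${LOG:-/tmp/nagios_dns_debug.log}"

# Succeeds if the given text contains the resolver error message.
is_resolver_error() {
    case "$1" in
        *"nodename nor servname provided"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Run the given command, pass its output and exit code through unchanged,
# and append user, command and environment to $LOG on a resolver error.
run_and_trace() {
    rc=0
    out=$("$@" 2>&1) || rc=$?
    if is_resolver_error "$out"; then
        {
            echo "--- $(date) user=$(id -un) rc=$rc"
            echo "cmd: $*"
            env | sort
        } >> "$LOG"
    fi
    printf '%s\n' "$out"
    return "$rc"
}
```

Calling it as `run_and_trace ./check_http -H xyz.com -p 2222 -u /abc -t 3` from the Nagios command, and again by hand from a shell, would give two log entries to compare once the failure shows up.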