Nagios script output issue

jacki · August 27, 2010, 12:23pm

Hi Folks,

Nagios is acting a little weird for me. I have an external script hooked into Nagios; it merely does a curl/wget on a URL and returns a status based on a string in the content/output. For the first 2-3 hours the script returns the right status and Nagios reports it correctly, i.e. OK, WARN or ERROR, based on the script's exit code. After 2-3 hours, checks that were (and should be) OK or WARN start returning CRITICAL, and the output line says just "Application is", not even "Application is ERROR" or "Application is FATAL".

There is nothing in the logs to suggest what the problem could be. Has anyone experienced this before, and if so, what was the corrective action? I am running Nagios on Mac OS X.

Here is the script for the curious -

#!/bin/bash

read URL < "$1"

STATUS=$(curl -s "$URL" | grep summary | awk -F\" '{print $2}')
echo "Application is $STATUS"
echo "curl $URL"

case $STATUS in
OK)
  exit 0
   ;;
WARN)
  exit 1
  ;;
ERROR)
  exit 2
  ;;
FATAL)
  exit 2
  ;;
*)
  exit 2
  ;;
esac

Thanks,
Jack

---------- Post updated at 11:23 AM ---------- Previous update was at 12:27 AM ----------

When I run the script on the command line it returns the correct status.

But in Nagios it shows a different error.

Any clues?

Thanks,
Jack

felipe.vinturin · August 27, 2010, 12:28pm

Maybe you are missing something!

I am not sure how Nagios works, but try creating another script that calls this one and redirects its output to a file, like:

#!/bin/bash
# let's name the script above: nagiosTest.sh
# "$@" forwards the wrapper's arguments (the URL file) to the real check
nagiosTest.sh "$@" >> /<path to>/nagiosTest.log 2>&1
retCode=$?
echo "nagiosTest.sh return code: [${retCode}]" >> /<path to>/nagiosTest.log 2>&1
exit "${retCode}"
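
One drawback of sending everything to the log is that Nagios then receives no plugin output at all. A minimal sketch of a variant that logs the run and still echoes the output back is below; run_logged is a hypothetical name, and recording the user and PATH is only a guess at what might differ between the command line and Nagios:

```shell
#!/bin/bash
# Sketch only: run_logged and its log-path argument are hypothetical.

# Run a command, append its output and exit code to a log file,
# and still hand both back to the caller (Nagios).
run_logged() {
  local log="$1"; shift
  local out rc
  out=$("$@" 2>&1)
  rc=$?
  {
    echo "--- $(date) user=$(id -un) PATH=$PATH"
    echo "command: $*"
    printf '%s\n' "$out"
    echo "return code: [$rc]"
  } >> "$log"
  printf '%s\n' "$out"   # keep the plugin output visible to Nagios
  return "$rc"
}
```

It could be called as run_logged /<path to>/nagiosTest.log /<path to>/nagiosTest.sh "$@" followed by exit $?, with the real paths filled in.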

verdepollo · August 27, 2010, 2:19pm

How often are you checking the service in question, i.e. what is the check interval?

Maybe the curl part is timing out and commands are getting queued up.

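Also, your script is assuming that every possible output of "curl -s $URL" contains the pattern "summary" (otherwise $STATUS would be empty). Are you sure that's correct? An empty $STATUS would print exactly the bare "Application is" line and fall through to the default case, which exits 2 (CRITICAL).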
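
To make that case visible, the script could treat an empty or unexpected status as UNKNOWN (exit code 3 in the Nagios plugin convention) rather than CRITICAL, and give curl a time limit. A rough sketch, where the function names and the 10-second limit are assumptions rather than anything from the original script:

```shell
#!/bin/bash
# Sketch only: function names and the 10-second limit are assumptions.

# Pull the first double-quoted value from lines containing "summary",
# the same way the original grep/awk pipeline does.
extract_status() {
  grep summary | awk -F'"' '{print $2}'
}

# Map a status word to a Nagios plugin exit code
# (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN).
status_to_exit_code() {
  case "$1" in
    OK)          return 0 ;;
    WARN)        return 1 ;;
    ERROR|FATAL) return 2 ;;
    *)           return 3 ;;   # empty or unexpected: we could not tell
  esac
}

# Fetch the page with a hard time limit and say what went wrong.
check_app() {
  local url="$1" body status
  if ! body=$(curl -s --max-time 10 "$url"); then
    echo "Application status UNKNOWN - curl failed for $url"
    return 3
  fi
  status=$(printf '%s\n' "$body" | extract_status)
  echo "Application is ${status:-UNKNOWN (no summary line found)}"
  status_to_exit_code "$status"
}
```

The script body would then reduce to check_app "$URL" followed by exit $?, and a failed fetch would show up in Nagios as UNKNOWN with a reason instead of a bare CRITICAL.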

ygemici · August 27, 2010, 3:01pm

Try changing

read URL < "$1"

to

URL="$1"
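
The difference: read URL < "$1" treats the argument as the name of a file and reads the URL from its first line, while URL="$1" uses the argument itself as the URL. If the file cannot be opened (for example a relative path, when Nagios runs the check from a different directory), URL ends up empty. A small sketch that accepts either form, with get_url as a hypothetical helper:

```shell
#!/bin/bash
# Sketch only: get_url is a hypothetical helper, not part of the original script.

# Accept either a URL or the path of a file whose first line is the URL.
get_url() {
  local arg="$1" url
  if [ -f "$arg" ]; then
    # read returns non-zero when the last line has no trailing newline,
    # but it still sets the variable, so the status is ignored here.
    read -r url < "$arg" || true
  else
    # Not an existing file: use the argument itself as the URL.
    url="$arg"
  fi
  printf '%s\n' "$url"
}
```

In the check script this would be used as URL=$(get_url "$1"). Passing an absolute path in the Nagios command definition avoids depending on the working directory.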

jacki · August 29, 2010, 9:05pm

Folks,

I switched to the "check_http" plugin (an official Nagios plugin), but it has the same problem.

Initially, when I start Nagios, the status on the Nagios web interface is the same as the one returned from the command line.

After some time the status becomes critical and no longer matches the command line. The command line returns the correct status of OK, whereas the Nagios web interface shows "CRITICAL".

Result from command line (& browser) -
./check_http -H xyz.com -p 2222 -u /abc -t 3
OK

Result from Nagios Interface -
nodename nor servname provided, or not known
HTTP CRITICAL - Unable to open TCP socket

I am thinking that this has something to do with Nagios, as the box is behaving just fine based on the command line result (and verified on the URL through the browser).

Please help!

Thanks,
Jack.