Hi,
Sorry if there is already a thread about this; I did a little bit of digging but haven't found exactly what I want.
I have a Java application on a GlassFish server that crashes from time to time.
I need a script to alert me if there's an error like "Java heap space" or "out of memory" in the server.log.
For now I've done this:
#!/bin/sh
DIR=/opt/appli/glassfish.prod/domains/domain1/logs
MAIL=/usr/ucb/mail
DEST_MAIL=mail@me.com
SUBJ_MAIL="alert java heapspace blablabla"
/usr/xpg4/bin/grep -q 'Java heap space' $DIR/server.log
if [ $? -eq 0 ]
then echo "alerte java heapspace was found" | $MAIL -s "$SUBJ_MAIL" $DEST_MAIL
fi
It works fine, but the log rolls over every 10 MB and I don't want to change that, so the script will keep alerting me even after the error has been taken care of (until server.log is archived and a new one is created).
So I need a way to grep for this type of error while also detecting whether it has already been found.
I really don't know where to start... any ideas?
I'm on Solaris 10.
Thanks
Create a file called /tmp/count and put 0 in it, then use this:
DIR=/opt/appli/glassfish.prod/domains/domain1/logs
MAIL=/usr/ucb/mail
DEST_MAIL=mail@me.com
SUBJ_MAIL="alert java heapspace blablabla"
################added newly##############
err_count_file="/tmp/count"
error_count=`cat $err_count_file`
count=`/usr/xpg4/bin/grep -c 'Java heap space' $DIR/server.log`
if [ "$count" -gt "$error_count" ]
then
echo "alerte java heapspace was found" | $MAIL -s "$SUBJ_MAIL" $DEST_MAIL
echo $count > $err_count_file
else
exit 0
fi
Well, thanks, but it doesn't work :-/ ... no email is sent.
And anyway, the condition isn't quite right: say I have an old "Java heap space" in my server.log (one that has already been dealt with); the grep will keep counting it, and the variable $count will still be greater than $error_count.
Maybe by checking only the last XX minutes of the log... or something like that.
Check whether "mail" reports any errors when sending the mail.
Add one more echo line inside the "if" and redirect its output to a log file to debug further.
Check the /tmp/count file as well (does it have a new value?).
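To make those checks concrete, here is a rough sketch of what the extra logging could look like. The alert function name and the DEBUG_LOG path are made up for illustration; MAIL is the same /usr/ucb/mail from the script above, and the backticks are kept so it should still run under the Solaris /bin/sh.

```shell
#!/bin/sh
# Hypothetical debug log; any writable path would do.
DEBUG_LOG=${DEBUG_LOG:-/tmp/heapspace_alert.log}
# Mail command from the thread; override it to test without sending mail.
MAIL=${MAIL:-/usr/ucb/mail}

# alert SUBJECT RECIPIENT
# Sends the alert and records in DEBUG_LOG whether the mail command worked.
alert() {
    echo "`date`: match found, sending mail to $2" >> "$DEBUG_LOG"
    # Any error text printed by the mail command also goes to the debug log.
    if echo "Java heap space was found" | $MAIL -s "$1" "$2" 2>> "$DEBUG_LOG"
    then
        echo "`date`: mail command succeeded" >> "$DEBUG_LOG"
    else
        echo "`date`: mail command failed with status $?" >> "$DEBUG_LOG"
    fi
}
```

Call it from inside the "if" as alert "$SUBJ_MAIL" "$DEST_MAIL", then look at the debug log after a run: if there is no "match found" line, the condition was never true; if there is a "failed" line, the problem is on the mail side.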
OK, my bad, there was a quote missing ^^
So the script works fine, but only until the log is archived and a new one is created.
When a new log file is created, the old count stays in the "count" file... and the script doesn't reset it to 0 when it no longer finds any errors.
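One possible way around the rotation problem, as a rough sketch only: instead of remembering how many matches were found, remember how many lines of the log have already been scanned, and start again from the top whenever the log is shorter than last time (which suggests it was rotated). The function name count_new_matches and the state file are hypothetical, not something from the scripts above.

```shell
#!/bin/sh
# count_new_matches LOG STATE PATTERN
# Prints how many lines matching PATTERN were appended to LOG since the
# previous call. STATE stores the number of lines already scanned.
count_new_matches() {
    cnm_log=$1
    cnm_state=$2
    cnm_pattern=$3

    # Lines already examined on the previous run (0 if no state file yet).
    cnm_seen=0
    if [ -f "$cnm_state" ]; then
        cnm_seen=`cat "$cnm_state"`
    fi

    # Current length of the log, with any padding from wc removed.
    cnm_total=`wc -l < "$cnm_log" | tr -d ' '`

    # A shorter log than last time means it was rotated: start over.
    if [ "$cnm_total" -lt "$cnm_seen" ]; then
        cnm_seen=0
    fi

    # Only look at the lines appended since the previous run.
    # grep -c exits non-zero when the count is 0; that is not an error here.
    cnm_first=`expr "$cnm_seen" + 1`
    sed -n "${cnm_first},\$p" "$cnm_log" | grep -c "$cnm_pattern" || true

    # Remember how far we got for the next run.
    echo "$cnm_total" > "$cnm_state"
}
```

The main script would then mail only when the printed number is greater than 0. One known gap: if the new log has already grown past the old line count between two runs, the rotation goes unnoticed, so this assumes the script runs fairly often.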
OK, I finally got what I wanted.
I made some adjustments to the previous script so that I don't receive a flood of mail when there are a lot of "Java heap space" errors ^^
For info, here it is:
#!/bin/sh
DIR=/opt/prod/glassfish/domains/domain1/logs
DIR_COUNT=/opt/prod/exploit/domain1/scripts/
MAIL=/usr/ucb/mail
DEST_MAIL=mails@me.com
SUBJ_MAIL="[ALERTE]Java heapspace was found"
START=0
err_count_file=$DIR_COUNT/count
error_count=`cat $DIR_COUNT/count`
/usr/xpg4/bin/grep -q 'Java heap space' $DIR/server.log
if [ $? -ne 0 ]
then echo $START > $err_count_file
fi
count=`grep -c 'Java heap space' $DIR/server.log`
if [ "$count" -gt "$error_count" ]
then
echo "Java heap space found " | $MAIL -s "$SUBJ_MAIL" $DEST_MAIL && echo $count > $err_count_file
else
echo $count > $err_count_file && exit 0
fi