How can I make my ksh script pause until something appears in the logs?

All,

Is there some way to have my KSH script pause, until the word "DONE!" appears in the logfile /logs/log.txt?

Difficulty: The word "DONE!" may already be present in the log from past runs, so I anticipate that tail -0f may (somehow?) need to be run in the background, instead of grepping in a loop?

Thanks so much,
CG

Would it work to have a counter for the string "DONE!" that is initialized to 0 at the start of your script? Then enter a loop containing a sleep statement, and after the sleep expires, count the number of "DONE!" instances to see if it has increased from the previous count.

If it has not increased, loop back for another nap (sleep).

If it has increased, run a conditional code path that increments your DONE counter and does anything else that needs doing before it's time to sleep again.
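A minimal sketch of this counter idea, assuming a POSIX-ish shell; the function names and the 10-second interval are illustrative assumptions, not from the thread:

```shell
# Count occurrences of "DONE!" in the log (prints 0 if none)
count_done() {
    grep -c 'DONE!' "$1"
}

# Poll every $2 seconds until the count rises above the baseline $3
poll_for_new_done() {
    while sleep "$2"; do
        cur=$(count_done "$1")
        [ "$cur" -gt "$3" ] && return 0
    done
}

# Usage sketch: record the baseline at script start, then wait
# baseline=$(count_done /logs/log.txt)
# poll_for_new_done /logs/log.txt 10 "$baseline"
```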

biker
LiveFire Labs: Online UNIX Training with Hands-on Internet Lab

Good idea, but unfortunately I can't use a sleep/counter anymore :frowning:

I am actually using a sleep/counter now, but I need to change it to something that monitors the logs. The reason is that the background process that generates this logfile has changed, and the DONE! can now happen anywhere from 40 seconds to 7 minutes in.

Although I can technically sleep 7 minutes, I run this script over and over, so it would be way too time-consuming if the "DONE!" actually occurs in, say, 50 seconds. Other things are written to this logfile too, so I can't check for a change in the logfile size either.

What if you just sleep 10 seconds or other short interval, look at the log file, sleep 10 seconds, etc.?

i'm not sure i follow, but i'll probably need to take some time to think about your response in order to ask some relevant follow-up questions. one thing i don't understand is why the script is run "over and over" vs starting it once and letting it run.

in the meantime, have you looked at or considered co-processes/co-routines?

if you're not familiar, here are a few older pages related to the topic...

Co-processes - Part I - Review of Foreground and Background Jobs

Co-processes - Part II - Introduction

Co-processes - Part III - Syntax and Simple Example

biker

I simply suppose you want something like

counter=0
tail --retry -f log | while read line
do
  print -- "$line" | grep -q "DONE!" && counter=$((counter + 1))
done

or

you may try a simple combination of "tail -f logfile|egrep LOG" and then check the count of output from the command

I suppose I should elaborate:

What I'm modifying is a KSH automation script that's run on demand to test a piece of software.

What this KSH script does/needs to do (step 2 is the new part):

1) Shutdown, prepare, and restart the software under test (already scripted)

2) Wait for a message (we'll call it "DONE!") to appear in /log.txt indicating that the MYSQL DB has bound to the RMI/Software.
a. This can take anywhere from 40 seconds to 7+ minutes, depending on DB size.
b. While I could put a 7+ minute delay there, it would be much better if this KSH script continued as soon as the "DONE!" message appears in /log.txt. *This is the challenge.
c. I cannot simply grep the file for "DONE!" in a while loop, because old/past logs in this same file may already contain the word, and I cannot delete them.
d. I also cannot simply check in a while loop for the file size to change, since other logs are being appended to this same log file during the wait.

3) Continuing on with already automated tests...

Apologies I wasn't as detailed the first time around!
Thanks so much
CG

Would this work:

The first time, check whether "DONE" already exists in the log file, using grep -n and tail -n 1 to find the last line number that "DONE" occurs on.

Then use the while loop, sleeping every ten seconds, but tail the log file each time you look instead of scanning the entire file. That way, you ignore the previous "DONE" lines that were already in the log.


Are there no timestamps in this log file?

Or do as you suggested and tail the file. Pipe the stream into code that exits when it matches (grep -q). Worst case, tail will linger until the next time it tries to write (when SIGPIPE will kill it).
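A hedged sketch of this pipe approach (the function name and usage path are mine, not from the thread; it assumes tail -n 0 -f so pre-existing "DONE!" lines are skipped):

```shell
# Block until a new "DONE!" line is appended to the file.
# tail -n 0 -f starts at the current end of file, so old matches are ignored;
# grep -q exits on the first match, and tail dies on SIGPIPE the next time
# it tries to write into the broken pipe.
wait_for_done() {
    tail -n 0 -f "$1" | grep -q 'DONE!'
}

# Usage sketch:
# wait_for_done /logs/log.txt && echo "DB is bound; continuing tests"
```

Note the worst case mentioned above: if nothing is ever written after the match, tail never receives SIGPIPE and the pipeline keeps waiting.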

Regards,
Alister


is it necessary to key off of the "DONE!", or could you somehow have the script check the status of the software under test? maybe by running some pre-existing status command, or by checking the system's process list, if the readiness of the SUT can be determined that way. (the check could sit in a loop that terminates as soon as it pro-actively determines the "MYSQL DB has bound to the RMI/Software" - if that is possible w/o having to monitor a log for status)
just trying to think of other possible options that might be cleaner for you...
biker


Hi Hanson:

Hm, I think you're on to something. I'll admit I'm not sure how to "tail in the background" every 10 seconds, if that is what you mean. (As you can tell, I'm probably somewhere between a novice and an intermediate KSH scripter.)

But your grep -n idea got me thinking:

Perhaps one way of scripting this is: (?)

1) Shutdown software ("cControl stop")
2) grep -n for DONE! (using awk?, record the "latest" line number that DONE! was found on, or record "0" if not found.)
3) Startup software ("cControl start")
4) In a while loop, every 10 seconds, grep -n for DONE!
4b) Break out of the while loop if grep -n finds a LATER line number than the one recorded in step 2 above. <-- Hm, not sure how to do this... but it sounds possible?
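The numbered plan above might look something like this sketch (helper names, paths, and the 10-second interval are assumptions; cControl is the poster's own command):

```shell
# Step 2: print the line number of the last existing "DONE!" (empty if none)
last_done_line() {
    grep -n 'DONE!' "$1" | tail -1 | cut -d: -f1
}

# Steps 4/4b: poll until a "DONE!" appears on a later line than the baseline
wait_for_new_done() {   # $1=logfile  $2=poll interval  $3=baseline line no.
    while sleep "$2"; do
        new=$(last_done_line "$1")
        [ "${new:-0}" -gt "$3" ] && return 0
    done
}

# Usage sketch:
# baseline=$(last_done_line /log.txt); baseline=${baseline:-0}
# cControl start
# wait_for_new_done /log.txt 10 "$baseline"
```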

Hi Biker:

Very good point that I wish I had thought of in the design phase! Unfortunately it's too late to add to the design (as much as it would make things a lot easier); the only way it can be determined is through the logs :frowning:

Hi Alister:

Yes! There actually are timestamps. This was my first chain of thought, but as I work on this system late (and it's 5 hours ahead, in GMT), it's likely I'll hit the 00:00:00 midnight rollover mid-way through... so I'm not quite sure how to attack it from that angle. Here's a snippet of those logs:

Apr 11 19:19:39 localhost APP.ph1[16435]: [local4.info] INFO creating Location
Apr 11 19:22:31 localhost APP.ph1[16435]: [local4.info] INFO creating LocDef
Apr 11 19:23:16 localhost APP.ph1[16435]: [local4.info] INFO DONE!
Apr 11 19:30:00 localhost APL.ph0[3102]: [local4.info] INFO APL Heartbeat 2106
Apr 11 19:30:00 localhost APP.ph0[3105]: [local4.info] INFO APP Heartbeat 4074
Apr 11 19:34:40 localhost APP.ph1[16435]: [local4.info] INFO New thread pool of size:3
Apr 11 19:34:41 localhost APP.ph1[16435]: [local4.info] INFO Starting Iteration 1
Apr 11 19:34:43 localhost APP.ph1[16435]: [local4.info] INFO Iteration 1, all tasks finished
Apr 11 19:34:43 localhost APP.ph1[16435]: [local4.info] INFO Running command:/APP/createTar.sh ph1 
Apr 11 19:37:54 localhost APP.ph1[16435]: [local4.info] INFO creating Server
Apr 11 19:41:49 localhost APP.ph1[16435]: [local4.info] INFO creating Site
Apr 11 19:42:05 localhost APP.ph1[16435]: [local4.info] INFO creating Location
Apr 11 19:42:22 localhost APP.ph1[16435]: [local4.info] INFO creating LocDef
Apr 11 19:44:23 localhost APP.ph1[16435]: [local4.info] INFO DONE!

If you know of any way of attacking it from that angle, that would certainly help as well!

ps: Thank you all for your help so far! Definitely appreciated!

CG

one thing i did for a tool i wrote a year or two back was to use the epoch value (seconds since 1970) for date/timestamp comparisons, if that's the concern
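A tiny sketch of the epoch idea: the counter never resets, so comparisons work across the midnight rollover. It assumes a date that supports the common %s extension (which older Solaris /usr/bin/date did not, so treat this as illustrative):

```shell
# Record the start time in seconds since 1970, then compare later;
# epoch values increase monotonically, even across midnight.
start=$(date +%s)
sleep 2                  # stand-in for the work being timed
now=$(date +%s)
elapsed=$((now - start))
echo "elapsed: ${elapsed}s"
```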

biker

Yes, this is exactly what I'm suggesting. Here is working code, using sed as a better way to extract the part of the log file to test:

LAST_LINE_WITH_DONE=`grep -n DONE log | tail -n 1 | cut -d : -f 1`
if [ -z "$LAST_LINE_WITH_DONE" ]; then
  LINE_1_TO_TEST=`wc -l < log`
else
  LINE_1_TO_TEST=`expr $LAST_LINE_WITH_DONE + 1`
fi

while sleep 10; do
  sed -n "$LINE_1_TO_TEST,$ p" log > lines-to-test.txt
  grep -q DONE lines-to-test.txt
  if [ $? -eq 0 ]; then
    echo "Found DONE in newly created section of log"
    # Take additional actions
    # Perhaps break out of the loop
  fi
done

Hansen,

You are a genius! At the same time, I never thought it was this easy to do this! Thank you and everyone who chimed in!!

ps: If the word DONE was not previously in the file, the variable is null. It turns out this is just fine, since I can compare null with 1 when the first DONE is added.

Here is the modified code I used:

LOGFILE="/tmp/tmp.log"

LAST_LINE_WITH_DONE=`grep -n DONE $LOGFILE | tail -1 | cut -d : -f 1`

print INITIAL LAST LINE IS $LAST_LINE_WITH_DONE

while sleep 5; do

  NEWLAST_LINE_WITH_DONE=`grep -n DONE $LOGFILE | tail -1 | cut -d : -f 1`
   if [[ $LAST_LINE_WITH_DONE != $NEWLAST_LINE_WITH_DONE ]]; then
     print NEW LAST LINE IS $NEWLAST_LINE_WITH_DONE
     break;
   fi
done

I don't know how large your log files will be, but grepping or sedding them will always read them from the start, which may become serious overhead.
If you just want to know whether a DONE has occurred in your log file since your last visit, why not keep the recent file size in a variable:

$ skip=$(stat -c%s file)

, and then, n seconds later, check the lines added since with

$ dd if=file ibs=1  skip=$skip | grep DONE

. You'll always have exactly the last chunk of data added, with no overhead!
See the time differences for a 200MB log file:

$ time grep -n DONE file |tail -1
7040081:DONE

real    0m0.653s
user    0m0.516s
sys    0m0.132s
$ time dd if=file ibs=1 skip=$skip |grep DONE
DONE

real    0m0.005s
user    0m0.000s
sys    0m0.004s

dd is standard in every *nix implementation while stat may not be; in that case, ls and awk will be your friends...
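Since stat is missing on the poster's Solaris box, here is a sketch combining the dd trick with ls/awk for the size (the function names are mine; it assumes field 5 of ls -l is the byte size, as it is on common systems):

```shell
# Byte size of a file via ls -l field 5 (stat-free, fairly portable)
file_size() {
    ls -l "$1" | awk '{print $5}'
}

# True if "DONE!" occurs in the bytes appended after offset $2
new_done_since() {
    dd if="$1" ibs=1 skip="$2" 2>/dev/null | grep -q 'DONE!'
}

# Usage sketch:
# skip=$(file_size /logs/log.txt)
# ...restart the software, then poll:
# while sleep 10; do new_done_since /logs/log.txt "$skip" && break; done
```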


Hi RudiC-

Wow, I was really excited to see this idea too since I'm a big fan of efficiency.

Unfortunately, as you mentioned may be the case, "stat" isn't available in my environment (Solaris).

I'm not as familiar with dd (it appears I need "stat" beforehand to make use of "dd"?).

If that's true, perhaps I can combine your method with Hanson's way of getting the (original) last line with done:

LAST_LINE_WITH_DONE=`grep -n DONE log | tail -n 1 | cut -d : -f 1`

....then, perhaps there's some way of modifying the below line:

NEWLAST_LINE_WITH_DONE=`grep -n DONE $LOGFILE | tail -1 | cut -d : -f 1`

...so that it can start searching the file after line number $LAST_LINE_WITH_DONE ?

In the worst case, this is more of a fun challenge to make things more efficient, since Hanson's idea (ps: thanks again!) has been working very well too.

Thanks again everyone!
CG

As I stated in my post, you don't NEED stat; it's just more efficient than this method of getting the file sizes in bytes:

$ ls -l file newaa
-rw-rw-r-- 1 xxxxx xxxxx 1300 Apr 14 22:26 file
-rw-rw-r-- 1 xxxxx xxxxx  111 Apr  7 22:37 newaa
$ ls -l file newaa | awk '{print $5}'
1300
111
$ stat -c%s file newaa
1300
111

The result is the same, as you can see. dd just has to skip last time's file size in bytes to get at the new lines added.
I could imagine that Solaris has a stat-equivalent command, btw; you just need to search a bit...

And the two methods would be difficult to combine, as dd needs a number of bytes or blocks to skip, while grep identifies lines, which can be of varying length. It would sort of work if one could assume an average line length, but that would be far from exact.