Pattern count on rotating logs for the past 1 Hr

Hi All,

I have a requirement to write a shell script that searches the logs from the past hour, extracts a pattern from them, and writes a cumulative count to a file.

The problem I'm facing here is that the logs rotate on a size basis: when a log reaches 5 MB, a new log is generated and all subsequent entries are written to it. Because of this, a log can rotate every 10, 15, or 20 minutes.

So, in 1 hour there could be 2, 3, or 4 logs generated, and I have to search the pattern across all of them, since together they may contain the data for the past 1 hr / 1 hr 10 min and so on...

How do I count the pattern occurrences in the past 1 hour from these logs?

Thanks in advance!!!

Does each line in the logs have a timestamp?

Otherwise you cannot be precise about where in a log file to start your search. Example: suppose a log file has a ctime of 65 minutes ago (ctime, the inode change time, is as close as you can get to a creation time for files on UNIX). The mtime (time of last write) is 45 minutes ago. So where in the file is 5 minutes into the log? Without some other guidance, you have to guess.

Please show sample log file entries and an example pattern. What OS and shell (bash, ksh, etc.)?

Because each log entry has a timestamp, exactly which files need to be processed doesn't matter. Just loop through all of the logs and check the timestamp of each record. If you need to minimize processing time, the top of your loop could check the newest (last) record in each file and skip to the next file if that record is already older than the window you care about.
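A minimal sketch of that skip logic, assuming log lines start with a lexically sortable timestamp such as "YYYY-MM-DD HH:MM" (the file names, cutoff, and "ERROR" pattern are made up for illustration; sample files stand in for real rotated logs):

```shell
# Sketch only: skip whole files whose newest record predates the cutoff.
dir=$(mktemp -d)
printf '%s\n' "2012-04-27 02:10 ERROR old" > "$dir/app1.log"
printf '%s\n' "2012-04-27 03:05 ERROR" "2012-04-27 03:20 ok" > "$dir/app2.log"

CUTOFF="2012-04-27 03:00"
total=0
for f in "$dir"/*.log; do
    last=$(tail -1 "$f" | cut -c1-16)       # timestamp of the newest record
    [[ "$last" < "$CUTOFF" ]] && continue   # whole file ended before the window
    n=$(grep -c "ERROR" "$f")
    total=$((total + n))
done
echo "$total"
```

Within a file that is kept, you would still match on the hour string itself so that lines from before the window are not counted.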

Hi Jim / ThomasMcA,

Log file has time-stamp at the beginning of each line as "DD/MM/YY HH:MM:SS:SSS"
(SSS in last is the milliseconds)

And, OS - Solaris, shell - ksh

I'll have to set up a cron job that runs this script to check all the logs for the last hour and print that count to a separate log file, where the count is added cumulatively for each hour of the day.

What I think is: take the system time and store the hour in a variable,
then pass this hour variable to grep to check the logs for the previous hour's data.

for eg.
Suppose cron runs at 04:00; then the script has to take the data for the timestamps 03:00-03:59 from all the logs (say, 15 logs in the directory).
So,
HOUR=`date '+%d/%m/%y %H'` (stores "27/04/12 04" in HOUR)
Now,
I have to search the logs for "27/04/12 03"
So I store "27/04/12 03", after rolling the hour back from 04 to 03 (not sure how to do this at the moment), in another variable (NEWHOUR)

for i in *
do
  # NEWHOUR contains a space ("27/04/12 03"), so it must be quoted
  grep "$NEWHOUR" "$i" | grep "pattern" | wc -l
done

Will that work??
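For the hour arithmetic the post is unsure about, one common trick on systems whose date lacks GNU's -d option (such as stock Solaris) is to shift TZ by one hour; GMT is used here only to make the example reproducible:

```shell
# Hedged sketch: "one hour ago" without GNU date, by shifting the timezone.
# POSIX TZ=GMT+1 means one hour WEST of GMT, i.e. the clock reads 1h earlier.
HOUR=$(TZ=GMT date '+%d/%m/%y %H')
NEWHOUR=$(TZ=GMT+1 date '+%d/%m/%y %H')   # previous hour
echo "$NEWHOUR"
```

Because the shift happens inside date itself, day and month boundaries (e.g. 00:xx rolling back to 23:xx of the previous day) come out right with no extra arithmetic.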

Try it. If that doesn't work, debug the process. If that still doesn't work, google any error messages. If that still doesn't work, come back here and explain what happened.

The above process helps you to learn. People on forums are much more willing to help you after you've tried to help yourself first.

PS: this isn't any type of flame or slam - I'm just explaining the process :slight_smile:


Hi ThomasMCA,

I'll definitely give it a try... but could you tell me how to find files modified in the last 1 hour? I could not figure out the right "find" command, or any other way to get the list of files.
Thanks...
Thanks...

The command "which find" tells you if the find command is installed.

If it is installed, man find tells you how to use it.

another potential clue..

At the beginning of your script, use find with the "-newer filename" option to generate the list of files to process, and at the end of your script, "touch filename" to reset the timestamp for the next execution (presumably in 1 hour).

This will allow you to limit the number of files you must iterate through to only those which have been modified since the last execution cycle.
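The -newer/touch bookkeeping might look like this (paths, names, and the sample log line are hypothetical; the sleep is only there so the sample file's mtime ends up newer than the stamp's):

```shell
# Sketch of -newer/touch bookkeeping with throwaway sample data.
LOGDIR=$(mktemp -d)
STAMP="$LOGDIR/.last_run"
touch "$STAMP"                  # pretend this was left by the previous run
sleep 1                         # ensure the new log entry gets a newer mtime
echo "27/04/12 03:15:00:123 pattern hit" > "$LOGDIR/app1.log"

count=0
for f in $(find "$LOGDIR" -type f -name '*.log' -newer "$STAMP"); do
    n=$(grep -c "pattern" "$f")
    count=$((count + n))
done
touch "$STAMP"                  # reset the marker for the next execution cycle
echo "$count"
```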

As for determining the correct value of HOUR, you have many ways to get it: use the "date" command with some arithmetic, parse the output of "ls -l filename", or store a useful value in the timestamp file itself and extract it with "cat", to name a few.
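That last idea, stashing a useful value in the stamp file itself, could be as simple as (the fixed path it would replace is hypothetical):

```shell
# Sketch: store the hour string of this run inside the stamp file,
# then read it back on the next run instead of recomputing it.
STAMP=$(mktemp)                  # stands in for a path like /var/tmp/logscan.stamp
date '+%d/%m/%y %H' > "$STAMP"   # written at the end of this run
PREV=$(cat "$STAMP")             # read back at the start of the next run
echo "$PREV"
```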
