Extract data from log

I have logs in this format:

  ####<01-Mar-2015 03:48:18 o'clock GMT> <info>
 ####<01-Mar-2015 03:48:20 o'clock GMT> <info>
####<01-Mar-2015 03:48:30 o'clock GMT> <info>
####<01-Mar-2015 03:48:39 o'clock GMT> <info>

I am looking for a script that can extract the last 15 minutes of data, counted back from the last recorded entry in the log file, then search that data for a string and store the results in a file.
Additionally, I tried the command below, but it doesn't work for me:

awk '$0>=$from' from=$(`date -u +"####<%d-%b-%Y %H:%M:%S o'clock GMT>" -d -5min`) test.log | grep -m1 -C5 'WORD'

Welcome to the forums. Have you tried something?


I tried the command below, but it doesn't work for me:

awk '$0>=$from' from=$(`date -u +"####<%d-%b-%Y %H:%M:%S o'clock GMT>" -d -15min`) test.log | grep -m1 -C5 'WORD'

In what way does it not work? Do you have incomplete output, incorrect output, no output or error messages?

What logic would you follow if you did this by hand? Humans recognise dates/times in records at a glance, but a script might have to:-

  • Read a line
  • Get the date part from the line
  • Convert to seconds
  • Compare against current date (also in seconds)
  • Display the output if the difference is less than 900 seconds
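For what it's worth, the steps above can be done in plain awk without any date extensions. What follows is only a rough sketch: the function name recent_lines and its arguments are made up for illustration, and it assumes every entry starts with a timestamp in exactly the format shown in the first post. Because the original request was for the 15 minutes before the last recorded entry, the reference time defaults to the newest timestamp in the file rather than to the current time.

```shell
# recent_lines FILE [REF_EPOCH] [WINDOW_SECONDS]   (hypothetical helper)
# Prints the entries logged within WINDOW_SECONDS (default 900) of REF_EPOCH.
# When REF_EPOCH is empty, the newest timestamp found in FILE is used instead.
recent_lines() {
    awk -v ref="${2:-}" -v window="${3:-900}" '
    BEGIN {
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", name, " ")
        for (i = 1; i <= n; i++) month[name[i]] = i
    }
    # Days between 1970-01-01 and the given Gregorian date.
    function days(y, m, d,    era, yoe, doy) {
        if (m <= 2) y--
        era = int(y / 400)
        yoe = y - era * 400
        doy = int((153 * (m > 2 ? m - 3 : m + 9) + 2) / 5) + d - 1
        return era * 146097 + yoe * 365 + int(yoe / 4) - int(yoe / 100) + doy - 719468
    }
    # Seconds since the Epoch for the current line, or -1 if the line
    # does not start with a timestamp in the expected format.
    function stamp(    dmy, hms, day) {
        if (split($1, dmy, /[-<]/) != 4 || split($2, hms, ":") != 3) return -1
        if (!(dmy[3] in month)) return -1
        day = days(dmy[4] + 0, month[dmy[3]], dmy[2] + 0)
        return day * 86400 + hms[1] * 3600 + hms[2] * 60 + hms[3]
    }
    # First pass: remember the newest timestamp in the file.
    NR == FNR { t = stamp(); if (t > last) last = t; next }
    # Second pass: keep entries inside the window; lines without a
    # timestamp inherit the decision made for the entry they follow.
    FNR == 1 && ref == "" { ref = last }
    { t = stamp(); if (t >= 0) keep = (ref - t <= window) }
    keep
    ' "$1" "$1"
}
```

Something like recent_lines test.log | grep 'WORD' > results.txt would then store the matches in a file (results.txt is just a placeholder name). The file is read twice, so this sketch will not work on a pipe, and leading spaces are harmless because awk ignores them when it splits fields.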

It's not pretty, I agree, and I can see what you are trying to do, but I'm not sure that awk can do that (happy to be corrected, though).

I suppose another way would be to generate all the possible matches for the date/time you need (only 900 of them) in a file and use:-

grep -f ref-file test.log
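As a rough idea of how such a reference file could be built: the sketch below assumes GNU date (for the -d "@seconds" form, which not every system has), and make_ref_file is a made-up name. It prints one timestamp prefix per second, counting back from a given end time expressed in seconds since the Epoch.

```shell
# make_ref_file END_EPOCH [WINDOW_SECONDS]   (hypothetical helper, GNU date assumed)
# Writes one timestamp prefix per second of the window to standard output,
# newest first, in the same layout as the log lines.
make_ref_file() {
    end=$1
    window=${2:-900}
    i=0
    while [ "$i" -lt "$window" ]; do
        # LC_ALL=C keeps the month abbreviation in English (Jan, Feb, ...).
        LC_ALL=C date -u -d "@$((end - i))" "+####<%d-%b-%Y %H:%M:%S"
        i=$((i + 1))
    done
}
```

It could be used as make_ref_file "$(date -u +%s)" > ref-file followed by grep -F -f ref-file test.log; the -F option makes grep treat each line of ref-file as a fixed string rather than a pattern. This only finds lines that carry a timestamp themselves, so continuation lines of a multi-line entry would be missed.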

I notice from your input (now that I've put it in CODE tags for you) that there are leading spaces on some lines. That might be a little awkward, but not insurmountable.

Would either of these approaches help? If you can be more explicit in your needs, then maybe we can help a little more.

Regards,
Robin

The command you tried can't work for at least the following reasons (there may be others):

  1. when comparing dates, you need to compare year, month, and day (in that order); not day, month, and year,
  2. the month names Jan, Feb, ... do not sort into increasing date order,
  3. the dates in your log file are based on GMT, but the dates you are producing with the date command are based on the current process' TZ setting,
  4. some of your input lines have leading spaces,
  5. you have a command substitution inside a command substitution causing the from variable to be set to an empty string (and it should also generate a diagnostic similar to -ksh: ####<01-Mar-2015: not found that you didn't bother mentioning),
  6. the value you seem to be trying to store into the from variable is not a field number, and
  7. from what you have shown us, the string WORD does not appear anywhere in the input you are processing.
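Points 1 and 2 are easy to see with a quick experiment. The helper below (sorts_before is a made-up name, purely for illustration) asks awk whether one string compares lower than another:

```shell
# sorts_before A B: reports whether string A compares lower than string B.
sorts_before() {
    # Appending "" forces awk to compare as strings, never as numbers.
    awk -v a="$1" -v b="$2" 'BEGIN { r = ((a "") < (b "")) ? "yes" : "no"; print r }'
}

sorts_before 02-Jan-2015 01-Mar-2015   # no: the day is compared first
sorts_before 01-Mar-2015 01-Apr-2015   # no: "M" sorts after "A"
sorts_before 20150102 20150301         # yes: year, month, day order
```

Only the last form, with a numeric month and the fields in year, month, day order, sorts the way the calendar does.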

There is no need to convert the timestamps to seconds since the Epoch as long as you convert the date into a string that sorts correctly under a string comparison (i.e., 20150301 instead of 01-Mar-2015). awk is certainly capable of converting the date portion of the format in your log file into that form before comparing the result to the string produced by the command:

from="$(date -u +"####<%Y%m%d %H:%M:%S o'clock GMT>")"

If we assume that <info> is shorthand for some kind of information being logged in your log files, and if <info> NEVER contains a <newline> character, and if <info> in some of those lines contains the string WORD, then the following might do what you want:

awk -v from="$(date -u -d '-15 min' +'%Y%m%d%H:%M:%S')" -v debug="$debug" '
BEGIN {	m["Jan"] = "01"; m["Feb"] = "02"; m["Mar"] = "03"; m["Apr"] = "04"
	m["May"] = "05"; m["Jun"] = "06"; m["Jul"] = "07"; m["Aug"] = "08"
	m["Sep"] = "09"; m["Oct"] = "10"; m["Nov"] = "11"; m["Dec"] = "12"
}
split($1, mdy, /[-<]/) == 4 {
		mod_date = mdy[4] m[mdy[3]] mdy[2] $2
		if(debug) printf("mod_date=%s, from=%s, $1=%s, $2=%s\n",
			mod_date, from, $1, $2) > "debug.out"
}
mod_date >= from
' test.log | grep -m1 -C3 'WORD'

Note that this is untested because the date utility on my system does not have a -d option; AND, you didn't supply any sample data that would produce any output from your sample input.

The way this is coded, if the <info> data in your input files does contain multi-line data, the entire <info> field is copied to the output when its 1st line meets the date requirements. This holds as long as no continuation line (i.e., a line that is not in the specified format) has a first field containing exactly three < and/or - characters, since such a line would be mistaken for a timestamp line.


This is a duplicate of another thread you started: Error while extracting data from log file.

Creating two threads discussing the same topic confuses any volunteers who may be wasting their time trying to help you solve your problem.

This thread is closed.
