Pulling information from a data file by date

SkySmart · January 16, 2015, 1:30pm

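awk -v now="$(date +%s)" -v tDiff="${USERMINUTES}" '
   BEGIN {
      FS="="
      # fall back to systime() (a gawk extension) if no epoch time was passed in
      if (!now) now=systime()
      # default window of 3600; tDiff is compared against a difference in epoch seconds
      if (!tDiff) tDiff=60*60
      p=1
   }
   # a line containing "{" starts a new record and re-enables printing
   /{/ {rec=$0;p=1;next}
   # a line containing "}" closes the record; print it only if it is still flagged
   /}/ && rec && p {print rec ORS $0;next}
   # an entry_time older than the window marks the record as not printable
   $1=="entry_time" { if (now-$2>tDiff)p=0 }
   # every other line is appended to the current record
   {rec=rec ORS $0}' "${1}"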

the code above is very fast. it was built for something else, but i'd like to be able to tweak it to do what i want.

what i need to do is read a system log file which is about 40MB in size. i want to pull out the last 10 minutes' worth of information from the log.

my problem is that a log file that big may contain records which are a year or more old.

for instance, if i wanted to grab the last 10 minutes from a log, a variation of the following command could be used:

awk '/Jan 16 10:20/,0' /var/log/mail.log

however, if the log goes back a year, this awk statement will grab the very first occurrence of "Jan 16 10:20", which may be from a year ago rather than 10 minutes ago.

any help will be much appreciated.
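
To illustrate the problem described above, here is a minimal sketch that starts printing at the most recent run of lines beginning with the timestamp instead of the first one. It reads the file twice, and the function name from_last_match is made up for this example. It assumes the log is in chronological order and that at least one line was logged in the requested minute; otherwise it prints nothing.

```shell
# from_last_match STAMP FILE  (hypothetical helper name)
# Print FILE from the start of the LAST run of lines that begin with STAMP.
from_last_match() {
    awk -v stamp="$1" '
        # Pass 1: remember where the most recent run of matching lines starts.
        NR == FNR {
            hit = (index($0, stamp) == 1)
            if (hit && !prev) start = FNR
            prev = hit
            next
        }
        # Pass 2: print from that line to the end of the file.
        start && FNR >= start
    ' "$2" "$2"
}

# Example call (not executed here):
#   from_last_match "Jan 16 10:20" /var/log/mail.log
```

The file name is passed twice on purpose so that awk sees it once per pass. The cost is a second read of the 40MB file, but nothing is held in memory.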

Don_Cragun · January 16, 2015, 2:40pm

Is the year included in the timestamp in the log file you're searching?

If not, it may be difficult to determine which set of lines you want.

SkySmart · January 16, 2015, 2:50pm

unfortunately, the year is not included. and i agree, that makes it difficult.

ernie · January 16, 2015, 3:14pm

Why not test the first line and only continue to search the log if date & time is within the last 10 minutes?

RudiC · January 16, 2015, 3:27pm

I admit that it may be lengthy with a huge file, but why not tac it until Jan 16 is found? Btw, tac is smart and uses lseek() with SEEK_SET to read the file in chunks from the end. If it is killed, e.g. by a broken pipe because the receiver quits, the remaining blocks are not read.
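
A rough sketch of the tac idea, assuming GNU tac is installed: read the file backwards, stop as soon as the most recent run of lines starting with the timestamp has been passed, then reverse the result again to restore the original order. The function name since_stamp is made up for this example. If the timestamp never occurs, the whole file is printed.

```shell
# since_stamp STAMP FILE  (hypothetical helper name; requires tac)
since_stamp() {
    tac "$2" |
    awk -v stamp="$1" '
        # Note when the first (newest) matching line has been seen.
        index($0, stamp) == 1 { seen = 1 }
        # First non-matching line after the run: older data starts here, so quit.
        seen && index($0, stamp) != 1 { exit }
        { print }
    ' |
    tac
}

# Example call (not executed here):
#   since_stamp "Jan 16 10:20" /var/log/mail.log
```

When awk exits early, the first tac receives a broken pipe and stops, so the older part of the file is never read.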

SkySmart · January 20, 2015, 1:17pm

tac is only available on linux, and i need something that will work on any unix host.

so the code below seems to work:

awk '{a[NR]=$0} END {while (NR) print a[NR--]}'  log.data

the only thing is, it reads the entire file into memory and then prints it backwards. if the file is huge, you can imagine it will take a long time to complete.

so what i want to do is, i want this code to be modified so that it aborts immediately after it finds the date and time matching the time specified by the user.

so if i wanted to scan the last 3 days of a log, the cutoff would be Jan 17. when this awk code is run and is reading the file backwards, once it finds a line that starts with "Jan 17", it should abort and not scan back any further.

can anyone please help me modify this code to do that?
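
A minimal sketch of the requested change, wrapped in a made-up function name (reverse_until) so it can be reused. It keeps the array approach from the one-liner above and simply breaks out of the backwards loop at the first line that starts with the stop date. Note that awk still has to read the whole file forward before END runs; only the backwards printing stops early.

```shell
# reverse_until STOP FILE  (hypothetical helper name)
# Print FILE newest line first, stopping at the first line that begins with STOP.
reverse_until() {
    awk -v stop="$1" '
        { a[NR] = $0 }                          # keep every line, as in the original
        END {
            for (i = NR; i >= 1; i--) {
                if (index(a[i], stop) == 1)     # reached the stop date: abort the scan
                    break
                print a[i]
            }
        }
    ' "$2"
}

# Example call (not executed here):
#   reverse_until "Jan 17" log.data
```

Traditional syslog timestamps pad single-digit days with a space, so a stop date such as the 7th would need to be written as "Jan  7" to match.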

RudiC · January 20, 2015, 1:54pm

How about using dd skip=N to read some blocks at the end of file?
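
One way the dd suggestion might look, as a sketch only: work out how many 1024-byte blocks the file has, skip all but the last few, and drop the first line of the result because it is probably cut in half. The function name tail_blocks and the block size are choices made for this example, and the number of blocks needed to cover a given time span has to be guessed.

```shell
# tail_blocks FILE BLOCKS  (hypothetical helper name)
# Print roughly the last BLOCKS KiB of FILE, starting at a line boundary.
tail_blocks() {
    file=$1
    want=$2
    bs=1024
    # File size in bytes; the arithmetic expansion also strips any padding from wc.
    size=$(( $(wc -c < "$file") ))
    total=$(( (size + bs - 1) / bs ))       # number of blocks, rounded up
    skip=$(( total - want ))
    if [ "$skip" -gt 0 ]; then
        # The first line read is most likely partial, so sed drops it.
        dd if="$file" bs="$bs" skip="$skip" 2>/dev/null | sed 1d
    else
        cat "$file"                         # file is smaller than the requested tail
    fi
}

# Example call (not executed here):
#   tail_blocks /var/log/mail.log 512 | awk '/Jan 16 10:20/,0'
```

Because only the end of the file is read, the awk range pattern no longer sees year-old lines, provided the chosen tail is shorter than a year of logging. If a block boundary happens to fall exactly at the start of a line, that one complete line is also dropped.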