Pulling information from a data file by date

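awk -v now="$(date +%s)" -v tDiff="${USERMINUTES}" '
   BEGIN {
      FS="="
      if (!now) now=systime()                 # fallback; systime() is gawk-only
      tDiff = tDiff ? tDiff*60 : 60*60        # minutes to seconds; default one hour
      p=1
   }
   /{/ {rec=$0;p=1;next}                      # block opens: start a new record
   /}/ && rec && p {print rec ORS $0;next}    # block closes: print it if still flagged
   $1=="entry_time" { if (now-$2>tDiff)p=0 }  # entry too old: suppress this record
   {rec=rec ORS $0}' "${1}"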

the code above is very fast. it was built for something else, but i'd like to tweak it to do what i want.

what i need to do is read a system log file which is about 40MB in size. i want to pull out the last 10 minutes' worth of information from the log.

my problem is that a log file that big may contain records that are a year or more old.

for instance, if i wanted to grab the last 10 minutes from a log, a variation of the following command could be used:

awk '/Jan 16 10:20/,0' /var/log/mail.log

however, if the log goes back a year, this awk statement will start at the very first occurrence of "Jan 16 10:20", which may be from a year ago rather than 10 minutes ago.

any help will be much appreciated.

Is the year included in the timestamp in the log file you're searching?

If not, it may be difficult to determine which set of lines you want.

unfortunately, the year is not included. and i agree, that makes it difficult.
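Without a year in the timestamp, one workable assumption is that the most recent match is the one you want. A minimal sketch under that assumption: read the file twice, note where the last contiguous run of matching lines begins, and print from there to the end. The function name `from_last_match` is made up for illustration, and the pattern is treated as an awk regular expression.

```shell
# from_last_match PATTERN FILE   (hypothetical helper, not from the thread)
# Print FILE from the start of the LAST run of lines matching PATTERN
# through end of file. The file is named twice so awk reads it twice.
from_last_match() {
    awk -v pat="$1" '
        NR == FNR {                          # pass 1: locate the last run of matches
            if ($0 ~ pat) { if (!prev) start = FNR; prev = 1 }
            else prev = 0
            next
        }
        start && FNR >= start                # pass 2: print from that line onward
    ' "$2" "$2"
}
```

Something like `from_last_match 'Jan 16 10:20' /var/log/mail.log` would then skip last year's "Jan 16 10:20" lines. It prints nothing if no line was logged in that exact minute, so a looser pattern may be needed.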

Why not test the first line and only continue to search the log if date & time is within the last 10 minutes?

I admit that it may be lengthy with a huge file, but why not tac it until Jan 16 is found? Btw, tac is smart: it seeks (SEEK_SET) and reads the records in chunks from the rear. If it is killed, e.g. by a broken pipe because the receiver quit, the remaining blocks are not read.
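A rough sketch of that idea, assuming `tac` is installed: reverse the file, let awk print lines until it has passed the most recent run of lines matching the pattern, then reverse the result back into normal order. The name `tail_since` is invented for this example.

```shell
# tail_since PATTERN FILE   (hypothetical helper; requires tac)
# Print the newest run of lines matching PATTERN plus everything after it.
tail_since() {
    tac "$2" | awk -v pat="$1" '
        $0 ~ pat { seen = 1; print; next }   # inside the newest matching run
        seen     { exit }                    # first older line past the run: stop
                 { print }                   # lines newer than the run
    ' | tac
}
```

Once awk exits, the first `tac` gets a broken pipe and stops reading, so the older part of the file is not touched. If the pattern never matches, the whole file is printed.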


tac is only available on linux, and i need something that will work on any unix host.

so the code below seems to work:

awk '{a[NR]=$0} END {while (NR) print a[NR--]}'  log.data

the only thing is, it reads the entire file into memory and then prints all of it backwards. if the file is huge, you can imagine it will take a long time to complete.

so what i want is for this code to be modified so that it aborts as soon as it finds the date and time matching the time specified by the user.

so say i wanted to scan the last 3 days of a log, which would go back to Jan 17. when this awk code is run and is reading the file backwards, once it finds a date that starts with "Jan 17", it aborts and does not continue scanning back any further.

can anyone please help me modify this code to do that?
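One possible modification, offered as a sketch rather than a finished answer: keep the array approach, but stop the reverse loop once it has gone past the lines that start with the cutoff date. The name `reverse_until` is made up here, and the cutoff is compared as a plain string at the start of each line. Note that awk still has to read the whole file forward into memory before the END block runs; only the backward printing is cut short.

```shell
# reverse_until DATESTRING FILE   (hypothetical helper)
# Print FILE newest-first, stopping after the lines that begin with DATESTRING.
reverse_until() {
    awk -v stop="$1" '
        { a[NR] = $0 }                            # buffer every line
        END {
            for (i = NR; i > 0; i--) {
                hit = (index(a[i], stop) == 1)    # does the line start with stop?
                if (seen && !hit) break           # passed the cutoff date: abort
                if (hit) seen = 1
                print a[i]
            }
        }' "$2"
}
```

For example, `reverse_until 'Jan 17' log.data` would print Jan 19, Jan 18 and Jan 17 lines in reverse order and nothing older.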

How about using dd skip=N to read some blocks at the end of file?
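Roughly like this, as a sketch: work out how many 1024-byte blocks the file has, skip all but the last few with `dd`, and drop the first line of what comes out because it is probably cut in the middle. The function name `tail_blocks` and the 1024-byte block size are choices made for this example only.

```shell
# tail_blocks FILE NBLOCKS   (hypothetical helper)
# Print roughly the last NBLOCKS 1024-byte blocks of FILE, whole lines only.
tail_blocks() {
    file=$1
    want=$2
    size=$(wc -c < "$file")                   # file size in bytes
    total=$(( (size + 1023) / 1024 ))         # number of blocks, rounded up
    skip=$(( total - want ))
    if [ "$skip" -gt 0 ]; then
        # the first line after the skip point is usually partial, so drop it
        dd if="$file" bs=1024 skip="$skip" 2>/dev/null | sed 1d
    else
        cat "$file"                           # file is smaller than the request
    fi
}
```

The output can then be piped into an awk filter, so only the tail of the 40MB file is scanned. How many blocks cover 10 minutes depends on how busy the log is, so NBLOCKS is a guess that may need to be raised and retried.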