sed/awk date range?

Hi,

I am trying to grep out a date range from an access log file. I defined the dates like so:

DATE1=$(date --date '1 hour ago' '+%m/%d/%y:%H:%M:%S')
DATE2=$(date '+%m/%d/%y:%H:%M:%S')

Then I just used cat and grep to collect the hits on the URL into results.txt:

touch /tmp/results.txt
cat /var/log/httpd/access_log | grep index.php >> /tmp/results.txt

How would I use sed/awk to get the exact entries for the date ranges that were defined?

Thanks for any help.

Cheers!

See this thread:

http://www.unix.com/shell-programming-scripting/169307-reading-lines-file-between-two-search-patterns.html#post302565304

I saw that post as well, but when I try what is suggested, I just get an empty tmp.log, even though there should be at least a few lines.

Here is the script I wrote:

date1=$(date --date '1 hour ago' '+%m/%d/%y:%H:%M:%S')
date2=$(date '+%m/%d/%y:%H:%M:%S')

cat /var/log/httpd/access_log | grep index.php >> results.txt

awk -v d1="${date1}" -v d2="${date2}" '$0~d1{p=1} $0~d2{p=0} p' results.txt >> tmp.log
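
One likely reason tmp.log comes out empty: the log timestamps look like 17/Oct/2011:12:06:15, while '+%m/%d/%y:%H:%M:%S' produces something like 10/17/11:12:06:15, so neither pattern can ever match a line. A minimal sketch of a format that mirrors the log, assuming GNU date and the C locale so that %b prints an English month abbreviation (the variable name log_fmt is just for illustration):

```shell
# Assumes GNU date (--date) and LC_ALL=C so %b yields "Oct", "Nov", ...
# %d/%b/%Y:%H:%M:%S mirrors the timestamp inside the [...] of the access log.
log_fmt='+%d/%b/%Y:%H:%M:%S'
date1=$(LC_ALL=C date --date '1 hour ago' "$log_fmt")
date2=$(LC_ALL=C date "$log_fmt")
echo "from $date1 to $date2"
```

Even with matching formats, the $0~d1 / $0~d2 toggle only fires if a request happened to be logged in exactly those two seconds, so a numeric comparison of timestamps is the safer route.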

What does your results.txt look like?

Give us a cat ... :)

results.txt is just the grep'd Apache access_log on my proof-of-concept VM:

127.0.0.1 - - [17/Oct/2011:12:06:15 -0700] "GET /cacti/include/main.css HTTP/1.1" 304 - "http://localhost/cacti/index.php" "Opera/9.80 (X11; Linux x86_64; U; en) Presto/2.9.168 Version/11.51"
127.0.0.1 - - [17/Oct/2011:12:06:15 -0700] "GET /cacti/images/favicon.ico HTTP/1.1" 304 - "http://localhost/cacti/index.php" "Opera/9.80 (X11; Linux x86_64; U; en) Presto/2.9.168 Version/11.51"
127.0.0.1 - - [17/Oct/2011:12:06:15 -0700] "GET /cacti/include/layout.js HTTP/1.1" 304 - "http://localhost/cacti/index.php" "Opera/9.80 (X11; Linux x86_64; U; en) Presto/2.9.168 Version/11.51"
127.0.0.1 - - [17/Oct/2011:12:06:15 -0700] "GET /cacti/images/shadow_gray.gif HTTP/1.1" 304 - "http://localhost/cacti/index.php" "Opera/9.80 (X11; Linux x86_64; U; en) Presto/2.9.168 Version/11.51"

From there I need to just get the entries over the last hour. So, do I have to use awk and filter everything but the numbers out, then use egrep to get the correct range and get the line count from that?

---------- Post updated at 02:52 PM ---------- Previous update was at 01:08 PM ----------

From what I have been reading, I would have to convert the dates to a fully numeric form; then sed would work nicely to get a range. I am not sure how to convert the log file, and adjusting the logging format in httpd.conf isn't an option.

Suggestions?

Something to start with, working on your 'grep-ed' file sample:
nawk -f epx.awk myGreppedLogFile
epx.awk:

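BEGIN {
  # Split on a single space or "[" so the timestamp lands in $5.
  FS="[[ ]"
  mon="JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC"
  monN=split(mon, monA, "|")
  # Turn monA[1]="JAN" into monA["JAN"]=1, and so on.
  for(i=1; i<=monN; i++) {
    monA[monA[i]]=i
    delete monA[i]
  }
}
{
  # $5 looks like 17/Oct/2011:12:06:15
  n=split($5, a, "[/:]")
  printf("%s ->[%s%02d%02d%s%s%s]\n", $5, a[3], monA[toupper(a[2])], a[1], a[4], a[5], a[6])
}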

You don't need sed/grep - do it all natively in awk.

Thank you very much. That converted the dates nicely. What do you suggest for getting the entries of the last hour, i.e. from the current time going back 60 minutes? I used date to mimic the format, going back 1 hour. I tried using sed, but it returns 0.

sed -n '/$DATE1/,/$DATE2/p' output.log | wc -l

That look right?
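
On the sed line: the single quotes stop the shell from expanding $DATE1 and $DATE2, and the slashes inside the dates would clash with sed's / delimiters anyway. Even with both fixed, the range only opens if some line contains that exact second. One way around this is to turn each log timestamp into a sortable number (YYYYMMDDhhmmss) and compare numerically in awk. What follows is only a sketch under stated assumptions: timestamps in the format shown in the sample, all entries in the local time zone, GNU date for the usage lines, and a hypothetical helper name in_range.

```shell
# Print the access-log lines on stdin whose timestamp lies in [from, to].
# from/to are numeric keys of the form YYYYMMDDhhmmss.
# in_range is a hypothetical helper; it assumes every entry uses the
# same (local) time zone, so the UTC offset in the log is ignored.
in_range() {
    awk -v from="$1" -v to="$2" '
    BEGIN {
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", m, " ")
        for (i = 1; i <= n; i++) mon[m[i]] = i
    }
    {
        s = index($0, "[")                 # timestamp starts after the first "["
        if (s == 0) next
        ts = substr($0, s + 1, 20)         # e.g. 17/Oct/2011:12:06:15
        split(ts, a, "[/:]")
        key = sprintf("%04d%02d%02d%02d%02d%02d", a[3], mon[a[2]], a[1], a[4], a[5], a[6]) + 0
        if (key >= from + 0 && key <= to + 0) print
    }'
}

# Possible usage (GNU date assumed):
#   from=$(date --date '1 hour ago' '+%Y%m%d%H%M%S')
#   to=$(date '+%Y%m%d%H%M%S')
#   grep index.php /var/log/httpd/access_log | in_range "$from" "$to" | wc -l
```

Comparing numbers rather than matching patterns means the bounds do not have to appear literally in the log, which is what makes the "last 60 minutes" window work.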

@vgersh99

Clever code:

 
mon="JAN|FEB|MAR|APR|MAY|JUN|JUL|AUG|SEP|OCT|NOV|DEC"
   monN=split(mon, monA, "|");
   for(i=1; i<=monN; i++) {
     monA[monA[i]]=i;
     delete monA[i];
   }

But I can't help thinking that something like this is probably more readable, and not that much bigger (191 chars vs. 163 chars).

monA["JAN"]= 1; monA["FEB"]= 2; monA["MAR"] = 3
monA["APR"]= 4; monA["MAY"]= 5; monA["JUN"] = 6
monA["JUL"]= 7; monA["AUG"]= 8; monA["SEP"] = 9
monA["OCT"]=10; monA["NOV"]=11; monA["DEC"] =12
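
For comparison, a third variant that needs no array at all: look up the month's position in a packed string with index(). This is only a sketch; month_num is a hypothetical helper name, and it expects a three-letter English abbreviation.

```shell
# Month abbreviation -> month number via its position in a packed string.
# month_num is a hypothetical helper; prints 0 for an unknown name.
month_num() {
    awk -v m="$1" 'BEGIN {
        p = index("JANFEBMARAPRMAYJUNJULAUGSEPOCTNOVDEC", toupper(m))
        # A valid name starts at position 1, 4, 7, ...; anything else is rejected.
        n = (p > 0 && p % 3 == 1) ? (p + 2) / 3 : 0
        print n
    }'
}

month_num Oct    # prints 10
```

It is shorter than either version above, though arguably less obvious to a reader than the explicit table.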