Retrieve lines from a file that fall within a given date range

Hi,

I need to retrieve the lines of a file that fall within a given date range.
For example, each line of a log file carries a timestamp.
The input is a date range, e.g. from date 03/Jan/2008 to date 24/Jul/2008, and I want to retrieve the lines
whose timestamps lie between those two dates.

log file:

[02/Jan/2008:19:37:00-20401-59-2] Process - data
[22/Jan/2008:19:37:00-20401-59-2] Process - data
[22/Mar/2008:19:37:00-20401-63-2] Process - data
[01/Jul/2008:19:37:00-20401-63-2] Process - data
[22/Jul/2008:19:37:00-20401-63-2] Process - data
[25/Jul/2008:19:37:00-20401-63-2] Process - data

result:
Lines 2, 3, 4 and 5 have to be retrieved, because their dates are within the given date range.

A possible solution using awk :

awk -v From="03/Jan/2008" -v To="24/Jul/2008" '
function cnvDate(date   ,d) {
   split(tolower(date), d, "/");
   return sprintf("%04.4d%02.2d%02.2d", d[3], month[d[2]], d[1]);
}
BEGIN {
   FS = "[:[]";
   month["jan"]=1 ; month["feb"]=2 ; month["mar"]=3 ; month["apr"]=4 ;
   month["may"]=5 ; month["jun"]=6 ; month["jul"]=7 ; month["aug"]=8 ;
   month["sep"]=9 ; month["oct"]=10; month["nov"]=11; month["dec"]=12;
   date_from = cnvDate(From);
   date_to   = cnvDate(To);
}
{
   date = cnvDate($2)
   if (date >= date_from && date <= date_to)
      print;
}
' inputfile

Jean-Pierre

I get this error when I run it:
awk: syntax error near line 1
awk: bailing out near line 1

As I am new to awk, could you please explain what exactly it is doing? I also want to redirect the results to a new file.

Try with nawk or gawk instead of awk.
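
If it is unclear which of these is installed, a small sketch like the following can pick one. It assumes a POSIX shell with `command -v`; the variable name AWK is only illustrative. Each candidate is asked to run a tiny user-defined function, which is the feature the program above relies on.

```shell
# Pick the first awk implementation that accepts a user-defined function.
# AWK is a hypothetical variable name used only for this sketch.
AWK=
for candidate in gawk nawk awk; do
  if command -v "$candidate" >/dev/null 2>&1 &&
     [ "$("$candidate" 'function f(x) { return x + 1 } BEGIN { print f(1) }' 2>/dev/null)" = "2" ]; then
    AWK=$candidate
    break
  fi
done
echo "using: ${AWK:-none found}"
```

The script can then be started with "$AWK" in place of awk.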

awk -v From="03/Jan/2008" -v To="24/Jul/2008" 

Defines variables From and To which contain start and end dates.

function cnvDate(date   ,d) {
   split(tolower(date), d, "/");
   return sprintf("%04.4d%02.2d%02.2d", d[3], month[d[2]], d[1]);
}

This function converts a date from 'dd/mmm/yyyy' to 'yyyymmdd', so that two dates can be compared directly.
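
To see the conversion in isolation, here is a minimal sketch that calls the function on a single date. The shell wrapper cnv_date and the variable in_date are names made up for this example, and the months table is built with a loop instead of twelve assignments, which is only a shorter way of filling the same table.

```shell
# cnv_date is a hypothetical wrapper: it prints the yyyymmdd key for one date.
cnv_date() {
  awk -v in_date="$1" '
  function cnvDate(date   ,d) {
     split(tolower(date), d, "/")          # d[1]=day, d[2]=month name, d[3]=year
     return sprintf("%04d%02d%02d", d[3], month[d[2]], d[1])
  }
  BEGIN {
     # Same months table as in the script, filled by a loop.
     n = split("jan feb mar apr may jun jul aug sep oct nov dec", name, " ")
     for (i = 1; i <= n; i++) month[name[i]] = i
     print cnvDate(in_date)
  }'
}

cnv_date "03/Jan/2008"   # prints 20080103
cnv_date "24/Jul/2008"   # prints 20080724
```

Because every key has the same width, with the year first and the day last, comparing two keys gives the same result as comparing the dates themselves.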

BEGIN {
   FS = "[:[]";
   month["jan"]=1 ; month["feb"]=2 ; month["mar"]=3 ; month["apr"]=4 ;
   month["may"]=5 ; month["jun"]=6 ; month["jul"]=7 ; month["aug"]=8 ;
   month["sep"]=9 ; month["oct"]=10; month["nov"]=11; month["dec"]=12;
   date_from = cnvDate(From);
   date_to   = cnvDate(To);
}

Initializations:

  • Input field separator ':' or '['
  • Months table used by cnvDate function
  • Start and end dates converted to format yyyymmdd

{
   date = cnvDate($2)
   if (date >= date_from && date <= date_to)
      print;
}

For each input line :

  • Convert date to format yyyymmdd
  • Print line if date between start and end dates
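
To redirect the results to a new file, add "> outputfile" after the input file name. A self-contained sketch of that, where the scratch directory and the names inputfile and outputfile are only placeholders:

```shell
# Hypothetical file names in a scratch directory, so nothing real is overwritten.
workdir=$(mktemp -d)
cat > "$workdir/inputfile" <<'EOF'
[02/Jan/2008:19:37:00-20401-59-2] Process - data
[22/Jan/2008:19:37:00-20401-59-2] Process - data
[22/Mar/2008:19:37:00-20401-63-2] Process - data
[01/Jul/2008:19:37:00-20401-63-2] Process - data
[22/Jul/2008:19:37:00-20401-63-2] Process - data
[25/Jul/2008:19:37:00-20401-63-2] Process - data
EOF

# Same program as above; use nawk or gawk here if plain awk rejects it.
awk -v From="03/Jan/2008" -v To="24/Jul/2008" '
function cnvDate(date   ,d) {
   split(tolower(date), d, "/");
   return sprintf("%04.4d%02.2d%02.2d", d[3], month[d[2]], d[1]);
}
BEGIN {
   FS = "[:[]";
   month["jan"]=1 ; month["feb"]=2 ; month["mar"]=3 ; month["apr"]=4 ;
   month["may"]=5 ; month["jun"]=6 ; month["jul"]=7 ; month["aug"]=8 ;
   month["sep"]=9 ; month["oct"]=10; month["nov"]=11; month["dec"]=12;
   date_from = cnvDate(From);
   date_to   = cnvDate(To);
}
{
   date = cnvDate($2)
   if (date >= date_from && date <= date_to)
      print;
}
' "$workdir/inputfile" > "$workdir/outputfile"   # matching lines go to the new file

cat "$workdir/outputfile"
```

With the sample log from the question, the new file should hold the four lines dated 22/Jan, 22/Mar, 01/Jul and 22/Jul.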

Jean-Pierre.

For the same query, if the input file looks like the one below, I tried to get the lines by using a blank space as the field separator (FS).
I used code like this:
BEGIN{
FS = "[ ]";
But it's not working. How can I specify that it has to take the 7th field, with a single space as the delimiter? Or is there any other way?
Input file:
-----------
2008-01-02 16:21:35,182 INFO1 loginslogging - mk99263 02/Jan/2008 16:21
2008-01-22 16:21:35,182 INFO2 loginslogging - mk99263 22/Jan/2008 16:21
2008-03-22 16:21:35,182 INFO3 loginslogging - mk99263 22/Mar/2008 16:21
2008-07-01 16:21:35,182 INFO4 loginslogging - mk99263 01/Jul/2008 16:21
2008-07-22 16:21:35,182 INFO5 loginslogging - mk99263 22/Jul/2008 16:21
2008-07-25 16:21:35,182 INFO6 loginslogging - mk99263 25/Jul/2008 16:21

Field separator = space (or tab), which is the awk default, so FS is not set
Date field = $7

awk -v From="03/Jan/2008" -v To="24/Jul/2008" '
function cnvDate(date   ,d) {
   split(tolower(date), d, "/");
   return sprintf("%04.4d%02.2d%02.2d", d[3], month[d[2]], d[1]);
}
BEGIN {
   month["jan"]=1 ; month["feb"]=2 ; month["mar"]=3 ; month["apr"]=4 ;
   month["may"]=5 ; month["jun"]=6 ; month["jul"]=7 ; month["aug"]=8 ;
   month["sep"]=9 ; month["oct"]=10; month["nov"]=11; month["dec"]=12;
   date_from = cnvDate(From);
   date_to   = cnvDate(To);
}
{
   date = cnvDate($7)
   if (date >= date_from && date <= date_to)
      print;
}
' inputfile
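
A quick way to check the field numbering, and one possible reason why FS = "[ ]" did not work: this is a guess, since the thread does not show the real file. With the default separator, any run of spaces or tabs counts as one separator. With FS = "[ ]", every single space separates two fields, so a doubled space creates an empty field and shifts the numbering. The variable names below are made up for the sketch.

```shell
line='2008-01-02 16:21:35,182 INFO1 loginslogging - mk99263 02/Jan/2008 16:21'
# Same line with two spaces after the first field (a hypothetical variation).
padded='2008-01-02  16:21:35,182 INFO1 loginslogging - mk99263 02/Jan/2008 16:21'

# Default FS: runs of blanks are one separator, so $7 is the date in both cases.
default_plain=$(printf '%s\n' "$line"   | awk '{ print $7 }')
default_padded=$(printf '%s\n' "$padded" | awk '{ print $7 }')

# FS = "[ ]": each space is a separator, so the extra space shifts the fields.
single_padded=$(printf '%s\n' "$padded" | awk 'BEGIN { FS = "[ ]" } { print $7 }')

echo "default FS, single spaces : $default_plain"
echo "default FS, doubled space : $default_padded"
echo "FS=\"[ ]\", doubled space   : $single_padded"
```

On the padded line the third command prints the user name instead of the date, which is why leaving FS at its default is the safer choice here.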