To print lines between 2 timestamps using awk|sed and regex

sarah-alikhan31 · April 7, 2013, 4:55pm

Hi,

I am using the following code to fetch the lines that were generated in the last hour. I use the date command to calculate the previous hour and the current hour, and then awk (or sed, if someone could guide me better)
with a regex pattern.

dt_1=`date +%h" "%d", "%Y\ %l -d  "1 hour ago"`  #Apr 05, 2013 1   --- if ran at 02 AM
dt_2=`date +%p -d  "1 hour ago"`  # AM|PM
echo $dt_1
echo $dt_2
dt_3=`date +%h" "%d", "%Y\ %l`
dt_4=`date +%p`
echo $dt_3
echo $dt_4
awk  -v a="{$dt_1}" -v b="{$dt_2}" -v c="{$dt_3}" -v d="{$dt_4}" '$0~/a:[0-9][0-9]:[0-9][0-9] b/{flag=1;next} $0~/c:[0-9][0-9]:[0-9][0-9] d/{flag=0} flag' /some/logfile >> /some/dump_logs.txt

The above code does not give any errors, and I have also tried it without $0~

Let's say I want all the lines starting from
Apr 07, 2013 1:mm:ss AM
to
Apr 07, 2013 2:mm:ss AM

if script gets executed b/w 2:00-2:59 hrs

OR
lines between
Apr 07, 2013 12:mm:ss PM
to
Apr 07, 2013 1:mm:ss PM

if script gets executed b/w 13:00-13:59 hrs

I have been searching for similar posts reported by various users, but I can only see variables being used in awk, without any regex pattern.

Could anyone please guide me here ?

vgersh99 · April 7, 2013, 5:02pm

change

$0~/a:[0-9][0-9]:[0-9][0-9] b/

to

$0~ (a ":[0-9][0-9]:[0-9][0-9] "b)

(and the same for the others)
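The difference between the two forms can be seen in a small sketch. The sample lines and the values passed with -v are made up for illustration. Inside /.../ the letters a and b are matched literally, while a parenthesised string expression is built from the variables first and then used as a dynamic regex.

```shell
# Hypothetical two-line sample: a real timestamp and a line containing the letters "a:" and " b".
sample='Apr 07, 2013 1:15:00 AM first
a:15:00 b second'

# Regex literal: "a" and "b" are plain characters here, not the awk variables.
literal=$(printf '%s\n' "$sample" |
    awk -v a='Apr 07, 2013 1' -v b='AM' '$0 ~ /a:[0-9][0-9]:[0-9][0-9] b/')

# Dynamic regex: the string is concatenated from the variables, then matched.
dynamic=$(printf '%s\n' "$sample" |
    awk -v a='Apr 07, 2013 1' -v b='AM' '$0 ~ (a ":[0-9][0-9]:[0-9][0-9] " b)')

echo "literal matched: $literal"
echo "dynamic matched: $dynamic"
```

Only the second form picks out the timestamp line; the first one matches the line that happens to contain the text "a:15:00 b".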

sarah-alikhan31 · April 7, 2013, 5:39pm

awk  -v a="{$dt_1}" -v b="{$dt_2}" -v c="{$dt_3}" -v d="{$dt_4}" '$0~ (a ":[0-9][0-9]:[0-9][0-9] "b){flag=1;next} $0~ (c ":[0-9][0-9]:[0-9][0-9] "d){flag=0} flag' /some/logfile >> /some/dump_logs.txt

I used the above snippet, as suggested...but it still does not work...

Am I missing something ..some space ??

Corona688 · April 7, 2013, 6:20pm

In what way does it not work? What does it do?

sarah-alikhan31 · April 7, 2013, 6:29pm

When I execute the entire script given in my first post (with this updated awk statement), it simply echoes the date patterns.

The 'dump_logs.txt' file is 0 bytes after the script runs.

Yoda · April 7, 2013, 7:21pm

Here is what I tried and it worked:

awk -v DT1="${dt_1}:[0-9][0-9]:[0-9][0-9] ${dt_2}" -v DT2="${dt_3}:[0-9][0-9]:[0-9][0-9] ${dt_4}" ' {
        n = match ( $0, DT1 )
        if ( n )
                flag = 1
        n = match ( $0, DT2 )
        if ( n )
                flag = 0
} flag ' logfile

Input File:

$ cat logfile
line1
Apr 07, 2013  5:00:00 PM
line3
line4
Apr 07, 2013  6:00:00 PM
line6
line7

Output:

$ ./sarah
Apr 07, 2013  5:00:00 PM
line3
line4
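One detail in the command above that is easy to miss: it writes ${dt_1}, while the earlier attempts wrote {$dt_1}. With the brace outside the dollar sign, the shell keeps the braces as literal characters, so they end up inside the pattern handed to awk. A minimal sketch, using a made-up value for dt_1:

```shell
dt_1='Apr 07, 2013  5'   # hypothetical value, as date might produce it

wrong="{$dt_1}"    # braces are ordinary characters around the expansion
right="${dt_1}"    # braces only delimit the variable name

echo "wrong: $wrong"
echo "right: $right"
```

The first echo shows the value wrapped in { and }, which a log line would not contain.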

sarah-alikhan31 · April 7, 2013, 8:33pm

I think the problem lies with the calculation of $dt_1 and $dt_3

dt_1=`date +%h" "%d", "%Y\ %l -d  "1 hour ago"`
 
dt_3=`date +%h" "%d", "%Y\ %l`

The space between %Y and %l is the culprit.
I had added this space to accommodate a 2-digit hour. For a 1-digit hour, if there is no space between %Y and %l, the script works fine.
Otherwise, it does nothing (as has been happening for me until now).

Your script worked because there are 2 spaces between the year and the hour in the log file that you have shown:
2013  5
and so it could match exactly.

If you could suggest something here....that would be much appreciated.
How should I modify my date format calculation to match the exact date in the log file, bearing in mind 1-digit and 2-digit hours?
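The padding behaviour behind this can be checked directly. The sketch below assumes GNU date (the -d option used in this thread is a GNU extension) and uses fixed, made-up times so the result does not depend on when it is run. %l pads a 1-digit hour with a leading space, and the GNU "-" flag (%-l) suppresses that padding.

```shell
# %l is the 12-hour clock hour, space-padded to two characters.
one_digit=$(date -d '2013-04-07 05:00:00' +'%Y %l')
two_digit=$(date -d '2013-04-07 12:00:00' +'%Y %l')

# GNU extension: "-" between % and the letter turns the padding off.
unpadded=$(date -d '2013-04-07 05:00:00' +'%Y %-l')

printf '[%s]\n' "$one_digit" "$two_digit" "$unpadded"
```

Whether the padded or unpadded form is the right one depends on how the log itself writes 1-digit hours.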

Yoda · April 7, 2013, 8:54pm

Try with this modification:

dt_1=$( date +"%h %d, %Y[ ]*%l:[0-9][0-9]:[0-9][0-9] %p" -d "1 hour ago" )
dt_2=$( date +"%h %d, %Y[ ]*%l:[0-9][0-9]:[0-9][0-9] %p" )
awk -v DT1="${dt_1}" -v DT2="${dt_2}" ' {
        n = match ( $0, DT1 )
        if ( n )
                flag = 1
        n = match ( $0, DT2 )
        if ( n )
                flag = 0
} flag ' logfile
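To try this approach without waiting for a particular hour, the patterns can be built from fixed times. This is only a sketch: it assumes GNU date, the two timestamps stand in for "1 hour ago" and "now", and the sample log is the one shown earlier in the thread. LC_ALL=C keeps the month name and AM/PM in English.

```shell
fmt='%h %d, %Y[ ]*%l:[0-9][0-9]:[0-9][0-9] %p'

# Hypothetical fixed times in place of "1 hour ago" and the current time.
start_re=$(LC_ALL=C date -d '2013-04-07 17:30:00' +"$fmt")
end_re=$(LC_ALL=C date -d '2013-04-07 18:30:00' +"$fmt")

log='line1
Apr 07, 2013  5:00:00 PM
line3
line4
Apr 07, 2013  6:00:00 PM
line6'

out=$(printf '%s\n' "$log" | awk -v s="$start_re" -v e="$end_re" '
    $0 ~ s { flag = 1 }   # a timestamp in the earlier hour switches printing on
    $0 ~ e { flag = 0 }   # a timestamp in the current hour switches it off
    flag')

echo "$out"
```

Because the pattern contains [ ]* followed by the padded hour, it tolerates both one and two spaces before a 1-digit hour.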

pravin27 · April 8, 2013, 7:44am

dt_1=`date +%h" "%d", "%Y -d  "1 hour ago"`  #Apr 05, 2013 1   --- if ran at 02 AM
dt_2=`date +%l -d  "1 hour ago"`
dt_3=`date +%p -d  "1 hour ago"`  # AM|PM
echo $dt_1
echo $dt_2
echo $dt_3
dt_4=`date +%h" "%d", "%Y`
dt_5=`date +%l`
dt_6=`date +%p`
echo $dt_4
echo $dt_5
echo $dt_6
awk  -v a="${dt_1}" -v b="${dt_2}" -v c="${dt_3}" -v d="${dt_4}" -v e="${dt_5}" -v f="${dt_6}"  '$0 ~ (a " +" int(b) ":[0-9][0-9]:[0-9][0-9] " c){flag=1;next} $0 ~ (d " +" int(e) ":[0-9][0-9]:[0-9][0-9] " f){flag=0} flag' /some/logfile >> /some/dump_logs.txt

sarah-alikhan31 · April 9, 2013, 11:46am

Thank you Yoda & pravin27

Yoda,

Your guidance helped. The change I made to the scriptlet was that
I omitted the '$' sign from the dt_1 and dt_2 calculations, and it worked.
Could you tell me why you had assigned these variables inside a $ sign?

Also, I wanted to know: what is the most efficient way to run the script I am working on across multiple remote servers, one by one, and get the results returned (as a file) to my local server?
Something like this on my local server:
for server in `cat $serverlist.txt`
do
ssh user@$server < ./sarah.sh #sarah.sh is a local script
done >> result-on-local-server.txt

Yoda · April 9, 2013, 12:13pm

That is command substitution, $( ... ), which is preferred over backticks ` ... `
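One practical reason for the preference is nesting. In the sketch below the path is hypothetical and does not need to exist, since dirname and basename only manipulate the string.

```shell
# Nested substitution with $( ... ): no escaping needed.
new_style=$(basename "$(dirname /var/log/app/current.log)")

# The same with backticks: the inner pair has to be escaped.
old_style=`basename \`dirname /var/log/app/current.log\``

echo "$new_style $old_style"
```

Both forms give the same result; the $( ... ) version is simply easier to read and to quote correctly.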

That use of backticks is risky; I would recommend a while loop instead. Also use the -n option so that ssh works inside the while loop; without it, ssh reads from the loop's standard input and consumes the rest of the server list:

while IFS= read -r server
do
    ssh -n "${user}@${server}" "/path_to_your_script/sarah.sh"
done < serverlist.txt > result-on-local-server.txt
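Why -n matters can be seen without any remote hosts. In this sketch cat is only a stand-in for ssh (an assumption made for illustration, as are the server names): like ssh without -n, it reads whatever is left on the loop's standard input. Redirecting its input from /dev/null, which is what ssh -n does, leaves the list untouched.

```shell
servers='alpha
beta
gamma'

# Stand-in command reads the loop's stdin, so the remaining names are swallowed.
greedy=$(printf '%s\n' "$servers" | while IFS= read -r s; do
    echo "visit $s"
    cat > /dev/null
done)

# With stdin taken from /dev/null, every name is visited.
polite=$(printf '%s\n' "$servers" | while IFS= read -r s; do
    echo "visit $s"
    cat < /dev/null > /dev/null
done)

echo "without redirect:"; echo "$greedy"
echo "with redirect:";    echo "$polite"
```

The first loop stops after one iteration, which is the symptom usually seen when ssh is run in a while read loop without -n.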