awk IF date comparison help

Hey everyone,

I'm trying to create a script using awk and if that will list all of our AWS tapes whose archive date is more than 90 days before today's date, so that I can pass the list to my aws command to remove them.

The fifth column is the creation date in epoch seconds, so I'm trying to compare that with today's date minus 90 days, in seconds.

Without listing all of my tapes, here are a few example lines that aws storagegateway describe-tape-archives spits out:

TAPEARCHIVES     1576094259.759   arn:aws:storagegateway:us-east-2:391499008633:tape/WEEKD02971    WEEKD02971      1575490816.054      2748779069440    ARCHIVED
TAPEARCHIVES     1573055398.804   arn:aws:storagegateway:us-east-2:391499008633:tape/WEEK587EF9    WEEK587EF9      1572283447.825      2748779069440    ARCHIVED

And then here's the command that I'm actually trying to use:

aws  storagegateway describe-tape-archives | awk -v date="$(date +%s)" -v  date1="$(expr $date - 7776000)" '{ if ($5 > $date1) {print}}'

I was also trying this, but I get the same results:

aws storagegateway describe-tape-archives | awk -v date="$(date +%s)" '{ if ($5+7776000 < $date) {print}}'

This should just return the WEEK587EF9 line, but instead it's returning both lines. I'm a little new to awk, so I'm not sure if I have my comparisons messed up, or maybe I should be using another command.

Do you want today's midnight, or the time of the script run, to be the start of the backward time period?

I'd like to end up having this in a cron job that runs daily at midnight.

So I'd like the time of the script run to be the start of the backward time period.

Neither of your two archive dates in the second column is older than 90 days as of today (Mon 6 Jan 19:09:41 CET 2020):

Wed 11 Dec 20:57:39 CET 2019
Wed 6 Nov 16:49:58 CET 2019

*facepalm*

I mistyped that, I'll edit my main post above, I'm actually needing to compare it to the 5th column, which is the tape creation date.

Neither of the dates in the fifth column is older than 90 days, either:

Wed 4 Dec 21:20:16 CET 2019
Mon 28 Oct 18:24:07 CET 2019
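
A quick way to double-check conversions like these yourself. This is only a sketch: it assumes GNU date (the -d @SECONDS form is a GNU extension), prints UTC so the result does not depend on the local timezone, and epoch_to_date is a helper name made up for illustration.

```shell
# Hypothetical helper: print the UTC calendar date for an epoch timestamp.
# ${1%.*} strips the fractional seconds that the aws output carries.
epoch_to_date() {
    date -u -d "@${1%.*}" +%Y-%m-%d
}

# Creation timestamps (fifth column) from the two sample lines.
epoch_to_date 1575490816.054
epoch_to_date 1572283447.825
```

Both dates fall within 90 days of 6 Jan 2020, which is why neither line should match yet.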

Well darn, it's a Monday. Thanks for catching that.

In some awk implementations (it varies, so check man awk), the rand() function is seeded with the current system epoch time, and srand() "returns the value of the previous seed" (cf. man awk). Where that holds, there is no need to supply an external timestamp. Try this:

aws storagegateway describe-tape-archives | awk 'BEGIN { DT = srand() - 86400 * 90}; $5 < DT'

Hello, back at this again.

I know this tape is older than 90 days: the 2nd and 5th columns (archive and creation date) are both more than 90 days in the past.

TAPEARCHIVES 1572970010.744 arn:aws:storagegateway:us-east-2:391499008633:tape/WEEK5B7EFA WEEK5B7EFA 1572283447.819 2748779069440 ARCHIVED 2738042187776

I tried running aws storagegateway describe-tape-archives | awk 'BEGIN { DT = srand() - 86400 * 90}; $5 < DT'

and it doesn't return any tapes. I think the issue is that the epoch dates in the 2nd and 5th columns are strings, and I need to convert them to integers to compare them?

Well, for me that line IS returned, whether $2 or $5 is considered:

cat file | awk 'BEGIN { DT = srand() - 86400 * 90}; $5 < DT' 
TAPEARCHIVES 1572970010.744 arn:aws:storagegateway:us-east-2:391499008633:tape/WEEK5B7EFA WEEK5B7EFA 1572283447.819 2748779069440 ARCHIVED 2738042187776

since both dates are more than 90 days in the past:

$ date -d@1572970010.744
Tue 5 Nov 17:06:50 CET 2019
$ date -d@1572283447.819
Mon 28 Oct 18:24:07 CET 2019

Are you sure that line is the output of aws storagegateway describe-tape-archives?

I reran the command and I'm still unable to get the line to output. For testing, I created a test file that had just that line in it and piped it into the awk command, and I'm still unable to get it to show up.

kruz@ces-lt-it-kg:~$ cat test
TAPEARCHIVES 1572970010.744 arn:aws:storagegateway:us-east-2:391499008633:tape/WEEK5B7EFA WEEK5B7EFA 1572283447.819 2748779069440 ARCHIVED 2738042187776
kruz@ces-lt-it-kg:~$ cat test | awk 'BEGIN { DT = srand() - 86400 * 90}; $5 < DT'
kruz@ces-lt-it-kg:~$

I am running these commands on the Windows Subsystem for Linux, if that makes a difference. Maybe I have a different awk version that behaves differently?

Here:

cat file
TAPEARCHIVES 1572970010.744 arn:aws:storagegateway:us-east-2:391499008633:tape/WEEK5B7EFA WEEK5B7EFA 1572283447.819 2748779069440 ARCHIVED 2738042187776
cat file | awk 'BEGIN { DT = srand() - 86400 * 90}; $5 < DT'
TAPEARCHIVES 1572970010.744 arn:aws:storagegateway:us-east-2:391499008633:tape/WEEK5B7EFA WEEK5B7EFA 1572283447.819 2748779069440 ARCHIVED 2738042187776

You might want to develop some creativity when it comes to debugging your scripts / programs. What would be the output of

awk 'BEGIN {print srand()}'

How does it compare to the values in your file? What would be the difference between $5 and DT?

kruz@ces-lt-it-kg:~$ awk 'BEGIN {print srand()}'
1
kruz@ces-lt-it-kg:~$

So I tried substituting srand() with an external date variable, with the same results.

kruz@ces-lt-it-kg:~$ cat test | awk -v test="$(date +%s)" 'BEGIN { DT = $test - 7776000}; $5 < DT'
kruz@ces-lt-it-kg:~$

Doing the math, DT = 1573594249, which is greater than 1572283447.819 ($5), so it should be outputting the line.

I ran through this outside of awk, and one thing I noticed: when I assigned test="$(date +%s)" and called test, it returned:

kruz@ces-lt-it-kg:~$ test="$(date +%s)"
kruz@ces-lt-it-kg:~$ $test
1581370249: command not found
kruz@ces-lt-it-kg:~$

Not a common awk implementation.

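Good approach. Drop the $ before test inside awk. There, variables are referenced by their bare name, and $test means "the field whose number is the value of test":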

$ awk -v test="$(date +%s)" 'BEGIN { DT = test - 7776000}; $5 < DT' file
TAPEARCHIVES 1572970010.744 arn:aws:storagegateway:us-east-2:391499008633:tape/WEEK5B7EFA WEEK5B7EFA 1572283447.819 2748779069440 ARCHIVED 2738042187776
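
To see the difference between a variable and a field reference in isolation, here is a minimal sketch with made-up input. The variable name n and the three sample words are purely illustrative.

```shell
# n is an awk variable set with -v.
# A bare n yields its value; $n yields the field whose number is stored in n.
out=$(printf 'alpha beta gamma\n' | awk -v n=2 '{ print n, $n }')
echo "$out"
```

With a value like 1581370249, $test asks for a field that does not exist. It evaluates to an empty string, so DT ends up negative and no line can satisfy $5 < DT.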

EDIT: You could even try

awk -v DT="$(date +%s -d"90 days ago")" '$5 < DT' file
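
Putting the pieces together for the daily cron job, a rough sketch might look like the following. Several things here are assumptions and not from the thread: the fixed now value (taken from the date +%s output shown earlier, so the result is reproducible), the variable names, and the delete-tape-archive call, which is only echoed and should be checked against the AWS CLI documentation before use.

```shell
# Assumption: fixed "now" for a reproducible demo; in cron use now=$(date +%s).
now=1581370249
cutoff=$(( now - 90 * 86400 ))   # 90 days = 7776000 seconds

# Two sample lines from the thread stand in for the real
# "aws storagegateway describe-tape-archives" output.
sample='TAPEARCHIVES 1576094259.759 arn:aws:storagegateway:us-east-2:391499008633:tape/WEEKD02971 WEEKD02971 1575490816.054 2748779069440 ARCHIVED
TAPEARCHIVES 1572970010.744 arn:aws:storagegateway:us-east-2:391499008633:tape/WEEK5B7EFA WEEK5B7EFA 1572283447.819 2748779069440 ARCHIVED 2738042187776'

# Keep only the ARN (third column) of tapes created before the cutoff.
old_arns=$(printf '%s\n' "$sample" | awk -v cutoff="$cutoff" '$5 < cutoff { print $3 }')

# Dry run: print the command instead of executing it.
for arn in $old_arns; do
    echo "would run: aws storagegateway delete-tape-archive --tape-arn $arn"
done
```

Passing the cutoff in from the shell avoids depending on how a particular awk implementation handles srand().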