Awk: Replacement using 2 diff files input and comparison

Requirement:
If $5(date field) in ipfile is less than $7(date field) in deact file & $1 of ipfile is present in deactfile then
$1 to be replaced by $2,$3,$4,$5,$6 of deact file
else if $5(date field) in ipfile is greater than $7(date field) in actfile & $1 of ipfile is present in actfile then
$1 to be replaced by $2,$3,$4,$5,$6 of actfile.

IPFILE

8104665534|2|404001742933540|180.215.139.144|20140114100944|20140114101027|43|20140114101045|3756|13054|31|26|59|44700kth|70|1|internet@internet.mtsindia.in|34661
8104665534|2|404001742933541|180.215.139.144|20131114101045|20131115101045|0|20131115101045|0|0|0|0|59|44700kth|68|1|internet@internet.mtsindia.in|34661
8
9136713486|2|405891614985516|180.215.176.110|20140114093453|20140114100922|2069|20140114101001|1658883|196181|114597376|3801088|59|46600g0W|70|1|internet@internet.mtsindia.in|32852
9136713486|2|405891614985516|180.215.176.110|20130114093453|20140114094336|523|20140114101001|9270|1799|359414536|2339687199|59|46600g0W|70|1|internet@internet.mtsindia.in|32852
9136713486|2|405891614985516|180.215.176.110|20130911093509|20140114093509|0|20140114101001|0|416|1157627956|16384|59|46600g0W|70|1|internet@internet.mtsindia.in|32852
9

ACTFILE:

8104665534|1903726104|1|8182|935674|MB|20140106|2
9136713486|1903743569|1|8182|936308|MB|20140101|6

DEACTFILE:

8104665534|1903739454|1|8453|185670|MB|20131214|8
9136713486|1903713493|1|8453|151012|MB|20130215|8

Present Code which only does replacement based on $1 checking and not on $5(date field) of ipfile:

awk -F"|" 'FNR==1 {++counter}
counter==1 {ACTFILE[$1]=$2"|"$3"|"$4"|"$5"|"$6;next}
counter==2 {DEACTFILE[$1]=$2"|"$3"|"$4"|"$5"|"$6;next}
{
if ( $1 in ACTFILE)
print ACTFILE[$1],$8,$9+$10> "ok.txt"
else if ( $1 in DEACTFILE)
print DEACTFILE[$1],$8,$9+$10> "nok.txt"
}
' FS="|" OFS="|" ACTFILE DEACTFILE IPFILE

Please help as i am unable to put the date field checks in the code.

Since you are only comparing dates and one date is in YYmmddHHMMSS format and the other in YYmmdd, you could just slap "000000" at the end of the latter and do a numerical comparison.

1 Like

Thanks scrutinizer, I will use substr for the date field for the numerical comparison, but getting confused in how to place the required checks in awk(existing code shared), can you suggest the condition for the same in code?

You can try something like this, as Scrutinizer already suggested

awk    '
  
       FNR==1{
                    ++counter
             }
  counter==1 {
                ACTFILE[$1]=$2 FS $3 FS $4 FS $5 FS $6 FS $7
                next
             }
  counter==2 {
                DEACTFILE[$1]=$2 FS $3 FS $4 FS $5 FS $6 FS $7
                next
             }
function date_format(vin){
                             # For 20140114100944 
                             year = substr(vin,1,4)
                             mon  = substr(vin,5,2)
                             day  = substr(vin,7,2)
                             hr   = substr(vin,9,2)  ? substr(vin,9,2) : 00
                             min  = substr(vin,11,2) ? substr(vin,11,2): 00
                             sec  = substr(vin,13)   ? substr(vin,13)  : 00
                             return year" "mon" "day" "hr" "min" "sec
                         }
              {
                 timeip = $5 ? mktime(date_format($5)) : NULL
                      if( $1 in DEACTFILE){
                                           n =split(DEACTFILE[$1],A)
                                           dfiletime = mktime(date_format(A[n]))
                                          if(timeip < dfiletime)
                                              print A[1],A[2],A[3],A[4],A[5],$8,$9+$10 >"ok_file"
                
                                          } 
                      if( $1 in ACTFILE)  {
                                           n =split(ACTFILE[$1],A)
                                           actfiletime = mktime(date_format(A[n]))
                                           if(timeip > actfiletime)
                                               print A[1],A[2],A[3],A[4],A[5],$8,$9+$10 >"No_ok_file" 
                                          }
              }
           ' FS="|" OFS="|" act deact ip
1 Like
 awk -F"|" 'FNR==1{++cnt}
 cnt==1{ACTFILE[$1]=$2"|"$3"|"$4"|"$5"|"$6;a[$1]=$7"000000";next}
cnt==2{DEACTFILE[$1]=$2"|"$3"|"$4"|"$5"|"$6;d[$1]=$7"000000";next}
{
if (ACTFILE[$1] && $5 gt a[$1] ) {print ACTFILE[$1],$8,$9+$10} else if (DEACTFILE[$1] && $5 lt d[$1] ) {print DEACTFILE[$1],$8,$9+$10}
}' OFS="|" ACTFILE DEACTFILE IPFILE
1 Like

Thanks Akshay. The code did worked with minor modifications. Just need another advice in case there are multiple entries in DEACT file then how to handle such scenario based on the different input. Please refer the updated deact file & updated ipfile.

DEACTFILE:

8104665534|1903739454|1|8453|185670|MB|20131214|8
9136713486|1903713493|1|8453|151012|MB|20130215|8
9136713486|1903713580|1|8453|151012|MB|20130118|8
9136713486|1903713786|1|8453|151012|MB|20130125|8

IPFILE:

8104665534|2|404001742933540|180.215.139.144|20140114100944|20140114101027|43|20140114101045|3756|13054|31|26|59|44700kth|70|1|internet@internet.mtsindia.in|34661
8104665534|2|404001742933541|180.215.139.144|20131114101045|20131115101045|0|20131115101045|0|0|0|0|59|44700kth|68|1|internet@internet.mtsindia.in|34661
9136713486|2|405891614985516|180.215.176.110|20140114093453|20140114100922|2069|20140114101001|1658883|196181|114597376|3801088|59|46600g0W|70|1|internet@internet.mtsindia.in|32852
9136713486|2|405891614985516|180.215.176.110|20130114093453|20140114094336|523|20140114101001|9270|1799|359414536|2339687199|59|46600g0W|70|1|internet@internet.mtsindia.in|32852
9136713486|2|405891614985516|180.215.176.110|20130911093509|20140114093509|0|20140114101001|0|416|1157627956|16384|59|46600g0W|70|1|internet@internet.mtsindia.in|32852
9136713486|2|405891614987865|180.215.176.110|20130120093453|20140114094336|523|20140114101001|9270|1799|359414536|2339687199|59|46600g0W|70|1|internet@internet.mtsindia.in|32852
 

Modified code:

awk    '
  
       FNR==1{ ++counter }
  counter==1 { ACTFILE[$1]=$2 FS $3 FS $4 FS $5 FS $6 FS $7;next}
  counter==2 { DEACTFILE[$1]=$2 FS $3 FS $4 FS $5 FS $6 FS $7; next}
function date_format(vin)
{
                             # For 20140114100944 
                             year = substr(vin,1,4)
                             mon  = substr(vin,5,2)
                             day  = substr(vin,7,2)
                             hr   = substr(vin,9,2)  ? substr(vin,9,2) : 00
                             min  = substr(vin,11,2) ? substr(vin,11,2): 00
                             sec  = substr(vin,13)   ? substr(vin,13)  : 00
                             return year" "mon" "day" "hr" "min" "sec
}
 
{
   timeip = $5 ? mktime(date_format($5)) : NULL
print timeip
   n =split(ACTFILE[$1],A)
   actfiletime = mktime(date_format(A[n])) 
   n =split(DEACTFILE[$1],B)
   dfiletime = mktime(date_format(B[n]))
 
   if( $1 in ACTFILE && timeip > actfiletime)
                print A[1],A[2],A[3],A[4],A[5],$8,$9+$10 >"AACT_file"
   else if( $1 in DEACTFILE && timeip < dfiletime )  
                print B[1],B[2],B[3],B[4],B[5],$8,$9+$10 >"DACT_file"                                         
  else 
                print $0 > "Reject_file"
}
' FS="|" OFS="|" act deact ip