Hello,
I've got a file with this structure:
33274 171030 02/29/2012 37897 P_GEH 2012-02-29 10:31:26
33275 171049 02/29/2012 38132 P_GEH 2012-02-29 10:35:27
33276 171058 02/29/2012 38515 P_GEH 2012-02-29 10:43:26
33277 170748 02/29/2012 40685 P_KOM 2012-02-29 11:19:27
33278 170053 02/29/2012 41704 P_GEH 2012-02-29 11:35:27
33279 171042 02/29/2012 41983 P_GEH 2012-02-29 11:39:27
33280 170343 02/29/2012 42740 P_KOM 2012-02-29 11:55:27
33281 171030 02/29/2012 44042 P_KOM 2012-02-29 12:15:27
33282 171030 02/29/2012 44053 P_KOM 2012-02-29 12:15:27
33283 170748 02/29/2012 45453 P_GEH 2012-02-29 12:39:27
33284 170281 02/29/2012 45608 P_KOM 2012-02-29 12:43:26
I need to go through the whole file, find entries that have the same values in columns 2 and 5, and then compare their values in column 4. If the difference is less than 15, the line with the greater value should be deleted.
Please help me out with that.
First, store the file line by line in an array.
Then fill a second array, using col5 and col2 combined as the key.
At index '0' of each entry store the col4 value, and at index '1' the index of the line.
When a key comes up again, subtract the col4 values and check the result against your limit.
If a line has to be deleted, you know its line number, so set that entry of the first array to an empty string.
At the end, write the lines of the array to a new file, skipping the entries that contain empty strings.
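The steps above could be sketched in awk roughly like this. It is only a sketch under a few assumptions: the function name drop_close_duplicates and the array names are made up for illustration, the limit of 15 comes from the question, columns are whitespace-separated, each line is compared against the last kept line with the same col5/col2 key, and two lines with equal col4 values are treated as duplicates (the later one is dropped).

```shell
# Hypothetical helper: reads the files given as arguments (or stdin)
# and prints the lines that survive the check.
drop_close_duplicates() {
    awk -v limit=15 '
    {
        line[NR] = $0                     # step 1: keep every line, indexed by line number
        key = $5 SUBSEP $2                # step 2: key built from col5 and col2
        if (key in val) {
            diff = $4 - val[key]          # step 4: subtract the col4 values
            if (diff >= 0 && diff < limit) {
                line[NR] = ""             # step 5: current line is the greater one, blank it
                next
            }
            if (diff < 0 && -diff < limit)
                line[idx[key]] = ""       # step 5: earlier line is the greater one, blank it
        }
        val[key] = $4                     # step 3: remember the col4 value ...
        idx[key] = NR                     # ... and the line index for this key
    }
    END {
        # step 6: print everything except blanked entries
        # (note: blank lines in the input are skipped as well)
        for (i = 1; i <= NR; i++)
            if (line[i] != "") print line[i]
    }' "$@"
}
```

It would be called as drop_close_duplicates infile > outfile. Because all lines are buffered until the END block, the original line order is preserved, at the cost of holding the whole file in memory.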
Hi,
Try this one,
awk '{k=$2"^"$5;if(k in f4){d=$4-f4[k];if(d>=0&&d<15)next;if(d<0&&d>-15)delete a[n[k]]}f4[k]=$4;n[k]=NR;a[NR]=$0}END{for(i=1;i<=NR;i++)if(i in a)print a[i]}' file
Cheers,
Ranga:)
# awk '{a[x++]=$0};END{d=-1;for(i=0;i<x;){f=i;split(a[f],b);i++;split(a[i],c)
if(b[2] FS b[5]==c[2] FS c[5]){
if(c[4]-b[4]<15&&c[4]-b[4]>0)
{delete a[i];d=i}
if(b[4]-c[4]<15&&b[4]-c[4]>0)
{delete a[f];d=f}else if(x!=i)print a[f]
}
else
if(d!=f)print a[f]
}}' infile
33274 171030 02/29/2012 37897 P_GEH 2012-02-29 10:31:26
33275 171049 02/29/2012 38132 P_GEH 2012-02-29 10:35:27
33276 171058 02/29/2012 38515 P_GEH 2012-02-29 10:43:26
33277 170748 02/29/2012 40685 P_KOM 2012-02-29 11:19:27
33278 170053 02/29/2012 41704 P_GEH 2012-02-29 11:35:27
33279 171042 02/29/2012 41983 P_GEH 2012-02-29 11:39:27
33280 170343 02/29/2012 42740 P_KOM 2012-02-29 11:55:27
33281 171030 02/29/2012 44042 P_KOM 2012-02-29 12:15:27
33283 170748 02/29/2012 45453 P_GEH 2012-02-29 12:39:27
33284 170281 02/29/2012 45608 P_KOM 2012-02-29 12:43:26