Removing lines with similar values from a file

Hello,

I have a file with this structure:

33274   171030  02/29/2012      37897   P_GEH   2012-02-29 10:31:26
33275   171049  02/29/2012      38132   P_GEH   2012-02-29 10:35:27
33276   171058  02/29/2012      38515   P_GEH   2012-02-29 10:43:26
33277   170748  02/29/2012      40685   P_KOM   2012-02-29 11:19:27
33278   170053  02/29/2012      41704   P_GEH   2012-02-29 11:35:27
33279   171042  02/29/2012      41983   P_GEH   2012-02-29 11:39:27
33280   170343  02/29/2012      42740   P_KOM   2012-02-29 11:55:27
33281   171030  02/29/2012      44042   P_KOM   2012-02-29 12:15:27
33282   171030  02/29/2012      44053   P_KOM   2012-02-29 12:15:27
33283   170748  02/29/2012      45453   P_GEH   2012-02-29 12:39:27
33284   170281  02/29/2012      45608   P_KOM   2012-02-29 12:43:26

I need to go through the whole file looking for entries that have the same values in columns 2 and 5, and then compare their values in column 4. If the difference is less than 15, the line with the greater value should be deleted.
Please help me out with that.

First, store the file line by line in an array.
Then fill a second array, using col5 and col2 together as the index name.
At index '0' store the value of col4, and at index '1' the index of the current line.
When the same key comes up again, subtract the col4 values and check the result against your restriction.
If a line has to be deleted, you know its line number, so you set that entry of the first array to an empty string.
At the end, write the lines of the array to a new file, skipping the entries that contain empty strings.
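
The steps above could be sketched in awk roughly like this. It is only a sketch under a few assumptions: the function name dedupe_close is made up for illustration, a difference of exactly 0 is treated as a duplicate, and each line is compared only with the most recent line still kept for the same col2/col5 key.

```shell
# Hypothetical helper: reads the file named in $1 and prints the filtered lines.
dedupe_close() {
    awk '
    {
        line[NR] = $0                 # keep every line, indexed by line number
        k = $2 SUBSEP $5              # key built from columns 2 and 5
        if (k in val) {               # an earlier line with this key is still kept
            d = $4 - val[k]
            if (d >= 0 && d < 15) {   # current line has the greater (or equal) value
                line[NR] = ""         # blank out the current line
                next
            }
            if (d < 0 && d > -15)     # earlier line has the greater value
                line[idx[k]] = ""     # blank out the earlier line
        }
        val[k] = $4                   # column 4 of the line kept for this key
        idx[k] = NR                   # its line number
    }
    END {
        for (i = 1; i <= NR; i++)     # print everything that was not blanked
            if (line[i] != "") print line[i]
    }' "$1"
}
```

It would be called as dedupe_close file > newfile. Because the lines are buffered and printed by line number, the original order is kept even when the earlier line of a pair is the one removed.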

Hi,

Try this one (it assumes column 4 is in ascending order, as in your sample):

awk '{k=$2"^"$5;if(k in f4){d=$4-f4[k];if(d>=0&&d<15)next}f4[k]=$4;print}' file

Cheers,
Ranga:)

# awk '{a[x++]=$0};END{for(i=0;i<x;){f=i;split(a[f],b);i++;split(a[i],c)
# line f is compared with the line right after it (i), so only adjacent lines are checked
if(b[2]b[5]==c[2]c[5]){
if(c[4]-b[4]<15&&c[4]-b[4]>0)
{delete a[i];d=i}
if(b[4]-c[4]<15&&b[4]-c[4]>0)
{delete a[f];d=f}else if(x!=i)print a[f]
}
else
if(d!=f||!d)print a[f]
}}' infile
33274   171030  02/29/2012      37897   P_GEH   2012-02-29 10:31:26
33275   171049  02/29/2012      38132   P_GEH   2012-02-29 10:35:27
33276   171058  02/29/2012      38515   P_GEH   2012-02-29 10:43:26
33277   170748  02/29/2012      40685   P_KOM   2012-02-29 11:19:27
33278   170053  02/29/2012      41704   P_GEH   2012-02-29 11:35:27
33279   171042  02/29/2012      41983   P_GEH   2012-02-29 11:39:27
33280   170343  02/29/2012      42740   P_KOM   2012-02-29 11:55:27
33281   171030  02/29/2012      44042   P_KOM   2012-02-29 12:15:27
33283   170748  02/29/2012      45453   P_GEH   2012-02-29 12:39:27
33284   170281  02/29/2012      45608   P_KOM   2012-02-29 12:43:26
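
To double-check a result like the one above, something along these lines could list the pairs that still break the rule. This is a sketch only: the name list_close_pairs is made up, and each line is compared just with the previous line that has the same values in columns 2 and 5.

```shell
# Hypothetical helper: prints each pair of lines that share columns 2 and 5
# and whose column 4 values differ by less than 15.
list_close_pairs() {
    awk '
    {
        k = $2 SUBSEP $5              # key built from columns 2 and 5
        if (k in prev) {
            d = $4 - val[k]
            if (d < 0) d = -d         # absolute difference
            if (d < 15) {
                print prev[k]         # earlier line of the pair
                print $0              # current line of the pair
            }
        }
        prev[k] = $0                  # remember the latest line for this key
        val[k] = $4                   # and its column 4 value
    }' "$1"
}
```

Run against the original file, it should print the 33281 and 33282 lines (44053 - 44042 = 11); run against the filtered output, it should print nothing.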