Extract duplicate rows with conditions

Gents,

Can you help, please?

Input file

5490921425          1          7    1310342 54909214251
5490921425          2          1          1 54909214252
5491120937          1          1          3 54911209371
5491120937          3          1          1 54911209373
5491320785          1          7    1305158 54913207851
5491320785          2          1          1 54913207852
5491521081          1         49    1307593 54915210811
5491521081          2         49    1307593 54915210812
5491521089          1          1          2 54915210891
5491521089          2         49    1307655 54915210892
5508520753          1          1          3 55085207531
5508520753          2          1          3 55085207532
5508521065          1          1          0 55085210651
5508521065          1          1          4 55085210651
5508521089          1          1          1 55085210891
5508521089          2          1          1 55085210892
5508720777          1          1          1 55087207771
5508720777          2          1          3 55087207772
5508721325          1          7    1311208 55087213251
5508721325          2          1          4 55087213252

Output file
Using this code

awk 'X[$1] {print X[$1]} {X[$1]=$0}' file

I got this output

5490921425          1          7    1310342 54909214251
5491120937          1          1          3 54911209371
5491320785          1          7    1305158 54913207851
5491521081          1         49    1307593 54915210811
5491521089          1          1          2 54915210891
5508520753          1          1          3 55085207531
5508521065          1          1          0 55085210651
5508521089          1          1          1 55085210891
5508720777          1          1          1 55087207771
5508721325          1          7    1311208 55087213251

Desired output

Conditions to get the desired output file:
For each set of duplicate rows (same value in column 1), keep the row with:
1.- Maximum value in column 3

5490921425          1          7    1310342 54909214251
5491120937          1          1          3 54911209371
5491320785          1          7    1305158 54913207851
5491521081          1         49    1307593 54915210811
5491521089          2         49    1307655 54915210892
5508520753          1          1          3 55085207531
5508521065          1          1          0 55085210651
5508521089          1          1          1 55085210891
5508720777          1          1          1 55087207771
5508721325          1          7    1311208 55087213251

Thanks

How about

sort -rnk1,1 -k3,3 file | awk 'X[$1] {print X[$1]}   {X[$1]=$0}'

Yes, it works. :)
Thanks a lot

---------- Post updated at 02:20 PM ---------- Previous update was at 10:22 AM ----------

Dear RudiC,

I noticed that it does not work completely.

Using the sort, I got

5490921425          2          1          1 54909214252
5491120937          3          1          1 54911209373
5491320785          2          1          1 54913207852
5491521081          2         49    1307593 54915210812
5491521089          2         49    1307655 54915210892
5508520753          2          1          3 55085207532
5508521065          1          1          4 55085210651
5508521089          2          1          1 55085210892
5508720777          2          1          3 55087207772
5508721325          2          1          4 55087213252

and I should get

5490921425          1          7    1310342 54909214251
5491120937          1          1          3 54911209371
5491320785          1          7    1305158 54913207851
5491521081          1         49    1307593 54915210811
5491521089          2         49    1307655 54915210892
5508520753          1          1          3 55085207531
5508521065          1          1          0 55085210651
5508521089          1          1          1 55085210891
5508720777          1          1          1 55087207771
5508721325          1          7    1311208 55087213251

Many values in column 2 changed; they should stay as in the desired example.
Please help me, thanks.

Not sure I understand, but this seems to get quite close:

sort -nk1,1 -k3,3r -k2,2 file3 | awk 'X[$1] {print X[$1]}   {X[$1]=$0}'
5490921425          1          7    1310342 54909214251
5491120937          1          1          3 54911209371
5491320785          1          7    1305158 54913207851
5491521081          1         49    1307593 54915210811
5491521089          2         49    1307655 54915210892
5508520753          1          1          3 55085207531
5508521065          1          1          0 55085210651
5508521089          1          1          1 55085210891
5508720777          1          1          1 55087207771
5508721325          1          7    1311208 55087213251
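
A note on the sort keys above: as far as I know, a key that carries its own modifier (here `-k3,3r`) no longer inherits the global `-n`, so column 3 is compared as text. That happens to work on this file because the columns are right-aligned, but it may not on differently formatted input. A minimal sketch with the numeric flag spelled out on every key follows; the function name `max_col3_dups` and the demo rows are made up for illustration, and the awk part is a variant that prints each duplicated key only once.

```shell
# Hypothetical helper: for every column-1 key that occurs more than once,
# print the row with the largest column 3 (ties: smallest column 2 first).
max_col3_dups() {
    sort -k1,1n -k3,3nr -k2,2n "$@" |
    awk '
        NR == 1 || $1 != key { key = $1; best = $0; printed = 0; next }  # first (best) row of a new key
        !printed             { print best; printed = 1 }                  # key repeats: emit best row once
    '
}

# Demo on a few space-separated rows (the last key is unique, so it is skipped).
printf '%s\n' \
    '5491521089 1 1 2 54915210891' \
    '5491521089 2 49 1307655 54915210892' \
    '5508520753 1 1 3 55085207531' \
    '5508520753 2 1 3 55085207532' \
    '9999999999 1 5 5 99999999991' | max_col3_dups
# prints:
# 5491521089 2 49 1307655 54915210892
# 5508520753 1 1 3 55085207531
```

With no arguments the function reads standard input; with a file name (for example `max_col3_dups file`) the name is passed to sort.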

RudiC,

Yes, it works now. Thanks.
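
For later readers: if the sort step is not wanted, the same selection can probably be done in a single awk pass. This is only a sketch under the assumptions used in this thread (column 1 is the key, column 3 is numeric, and on a tie the first row seen is kept); the function name `max_col3_dups_awk` and the array names are invented for the example. Keys come out in the order they first appear in the input.

```shell
# Hypothetical helper: single pass, no sort. Keeps the max-column-3 row
# for every column-1 key that occurs more than once.
max_col3_dups_awk() {
    awk '
        !($1 in count) { order[++n] = $1 }            # remember first-seen order of keys
        !($1 in count) || $3 + 0 > max[$1] {          # first row of a key, or a strictly larger column 3
            max[$1] = $3 + 0
            row[$1] = $0
        }
        { count[$1]++ }                               # how many rows share this key
        END {
            for (i = 1; i <= n; i++)
                if (count[order[i]] > 1)              # only keys that are duplicated
                    print row[order[i]]
        }
    ' "$@"
}

# Demo on a few space-separated rows (the last key is unique, so it is skipped).
printf '%s\n' \
    '5491521089 1 1 2 54915210891' \
    '5491521089 2 49 1307655 54915210892' \
    '5508520753 1 1 3 55085207531' \
    '5508520753 2 1 3 55085207532' \
    '9999999999 1 5 5 99999999991' | max_col3_dups_awk
# prints:
# 5491521089 2 49 1307655 54915210892
# 5508520753 1 1 3 55085207531
```

The trade-off is memory: one row per distinct key is held until the end of the input, whereas the sort-based pipeline streams its output.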