Hi,
I am trying to use awk to remove rows whose first 4 fields duplicate those of an earlier row, keeping only the first occurrence. For example, in the following data, lines 6-9 would be removed, leaving one copy of the duplicated row (line 5):
Borgarhraun FH9822 ol24 FH9822_ol24_m20 ol Deformed c
Borgarhraun FH9822 ol24 FH9822_ol24_r21 ol Deformed r
Borgarhraun FH9822 ol25 FH9822_ol25_m22 ol Res. B c
Borgarhraun FH9822 ol25 FH9822_ol25_r23 ol Res. B r
Borgarhraun FH9822 ol24 FH9822_ol24_profCD ol Deformed c
Borgarhraun FH9822 ol24 FH9822_ol24_profCD ol Deformed c
Borgarhraun FH9822 ol24 FH9822_ol24_profCD ol Deformed c
Borgarhraun FH9822 ol24 FH9822_ol24_profCD ol Deformed c
Borgarhraun FH9822 ol24 FH9822_ol24_profCD ol Deformed c
Borgarhraun FH9822 ol35 FH9822_ol35_m24 ol Res. B c
So the output would hopefully look like this:
Borgarhraun FH9822 ol24 FH9822_ol24_m20 ol Deformed c
Borgarhraun FH9822 ol24 FH9822_ol24_r21 ol Deformed r
Borgarhraun FH9822 ol25 FH9822_ol25_m22 ol Res. B c
Borgarhraun FH9822 ol25 FH9822_ol25_r23 ol Res. B r
Borgarhraun FH9822 ol24 FH9822_ol24_profCD ol Deformed c
Borgarhraun FH9822 ol35 FH9822_ol35_m24 ol Res. B c
Can anyone help? Thanks
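
For reference, here is a minimal sketch of one common awk idiom for this kind of de-duplication. It assumes the fields are separated by whitespace and that the first occurrence of each duplicated key is the one to keep. The function name dedup_first4 and the file names in the usage comment are hypothetical, used only for illustration.

```shell
# Hypothetical helper: print each line only the first time the
# combination of its first four fields is seen.
# Assumes whitespace-separated fields (awk's default field splitting).
dedup_first4() {
  # seen[...] is 0 (false) the first time a key appears, so !seen[...]
  # is true and awk prints the line; the ++ then marks the key as seen.
  awk '!seen[$1, $2, $3, $4]++' "$@"
}

# Usage (file names are placeholders):
#   dedup_first4 data.txt > deduped.txt
```

Because the key is built only from fields 1 to 4, the space inside "Res. B" in the later columns should not affect the result. The original line order is preserved, since nothing is sorted.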