Hi Folks -
I'm quite new to awk and haven't come across this kind of problem before. I have a pipe-delimited file in which several records share the same values in the 3rd and 4th fields. A sample is below:
aaaaaa|a12|45|56
abbbbaaa|a12|45|56
bbaabb|b1|51|45
bbbbbabbb|b2|51|45
aaabbbaaaa|a11|45|56
Here, the combination of field 3 and field 4 is the same for several records: 45|56 for the first two rows and the last row, and 51|45 for the third and fourth rows.
Now, the output file is expected to look like this:
aaabbbaaaa|a11|45|56
bbbbbabbb|b2|51|45
That is, among the rows where field 3 and field 4 match, compare the length of the first field and return the row whose first field is the longest. So one row is picked from each set of duplicates, based on the length of the first field.
Could you please help with a one-line awk command to achieve this?
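
For reference, here is a rough sketch of one way this might be done; it is not the only approach. It assumes the file is pipe-delimited exactly as in the sample, and the scratch file and the variable names (input, result, len, row) are made up for illustration. The key is built from field 3 and field 4, and for each key the row with the longest first field seen so far is remembered. With a strict ">" comparison, the first row encountered wins when two first fields have the same length.

```shell
# Sample input from the post, written to a scratch file (the temp file is an assumption).
input=$(mktemp)
cat > "$input" <<'EOF'
aaaaaa|a12|45|56
abbbbaaa|a12|45|56
bbaabb|b1|51|45
bbbbbabbb|b2|51|45
aaabbbaaaa|a11|45|56
EOF

# Rule 1: build the key from field 3 and field 4 for every row.
# Rule 2: remember the row if the key is new or its first field is longer than the stored one.
# END: print one remembered row per key.
result=$(awk -F'|' '{ k = $3 FS $4 } !(k in row) || length($1) > len[k] { len[k] = length($1); row[k] = $0 } END { for (k in row) print row[k] }' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

Note that awk does not guarantee the order in which "for (k in row)" visits the keys, so the two output rows may come out in either order; piping the output through sort is one way to make the order predictable.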