Hi all,
I have been trying to delete duplicates based on a certain pattern but have not been able to make it work. More than one pattern is duplicated in the file, but I want to remove the duplicates of just one pattern and leave the rest untouched. I cannot use awk '!x[$0]++' inputfile.txt, sed '/pattern/d', or sort with uniq, because they would remove the duplicates of every pattern in the file (or, with sed, every matching line). A sample follows:
inputfile.txt
;;
;;
ID 701
NAME 701
FUNC Null
FUNC Null
FUNC Null
CC 27749
PRO A
NO NO:3676
NO NO:3677
NO NO:3723
NO NO:3964
COMMENT Nothing is impossible
@@
ID 702
NAME 702
FUNC Null
FUNC Null
FUNC Null
FUNC Null
PRO A
NO NO:3676
NO NO:3677
COMMENT Need to change
@@
ID 706
NAME 706
FUNC Null
PRO A
NO NO:6301
NO NO:6310
NO NO:6450
NO NO:6647
NO NO:6812
@@
I want to remove the duplicates for the pattern "FUNC" only, so that the output looks like this:
output.txt
;;
;;
ID 701
NAME 701
FUNC Null
CC 27749
PRO A
NO NO:3676
NO NO:3677
NO NO:3723
NO NO:3964
COMMENT Nothing is impossible
@@
ID 702
NAME 702
FUNC Null
PRO A
NO NO:3676
NO NO:3677
COMMENT Need to change
@@
ID 706
NAME 706
FUNC Null
PRO A
NO NO:6301
NO NO:6310
NO NO:6450
NO NO:6647
NO NO:6812
@@
I have thousands of records like this, and I need to deduplicate one pattern at a time, where the pattern may differ from run to run. I also tried specifying the column number, but that affects other duplicated values that I want to keep. I would appreciate any help with this. Thanks.
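For reference, here is a minimal sketch of the behaviour I am after. It assumes that every record ends with a line whose first field is @@, and that the pattern is always the first whitespace-separated field of a line. The function name dedupe_key and its two arguments are hypothetical, made up for this illustration only.

```shell
# Hypothetical helper: dedupe_key KEY FILE
# Prints FILE, dropping repeated lines whose first field equals KEY.
# Duplicates are tracked per record; a line starting with @@ ends a record
# (an assumption based on the sample data).
dedupe_key() {
  awk -v key="$1" '
    # End of record: forget the lines seen so far, keep the separator.
    $1 == "@@" { split("", seen); print; next }
    # Line for the chosen key that was already printed in this record: skip it.
    $1 == key && seen[$0]++ { next }
    # Everything else, including duplicates of other keys, passes through.
    { print }
  ' "$2"
}
```

It would be called as dedupe_key FUNC inputfile.txt > output.txt. Only lines starting with the chosen key are compared, so repeated NO or PRO lines stay as they are, and passing a different key on the next run deduplicates a different pattern.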