Dear all,
How can I remove duplicate column values within each row of a text file, keeping only the first occurrence of each value?
Input:
LG10_PM_map_19_LEnd 1000560 G AA AA AA AA AA GG
LG10_PM_map_19_LEnd 1005621 G GG GG GG AA AA GG
LG10_PM_map_19_LEnd 1011214 A AA AA AA AA GG GG
LG10_PM_map_19_LEnd 1011673 T TT TT TT TT CC CC
LG10_PM_map_19_LEnd 1088961 C TT TT TT TT TT TT
LG10_PM_map_19_LEnd 1089024 G AA AA AA AA AA AA
LG10_PM_map_19_LEnd 1108301 C TT TT TT TT TT CC
LG10_PM_map_19_LEnd 11365128 G AA
LG10_PM_map_19_LEnd 11365170 T CC
LG10_PM_map_19_LEnd 11381744 A GG GG GG GG GG GG
LG10_PM_map_19_LEnd 11381772 T TT TT
LG10_PM_map_19_LEnd 11385851 A AA AA AA AA AA AA
LG10_PM_map_19_LEnd 11386265 A AA AA AA AA AA AA
LG10_PM_map_19_LEnd 1138663 T AA TT AA AA TT TT
Output:
LG10_PM_map_19_LEnd 1000560 G AA GG
LG10_PM_map_19_LEnd 1005621 G GG AA
LG10_PM_map_19_LEnd 1011214 A AA GG
LG10_PM_map_19_LEnd 1011673 T TT CC
LG10_PM_map_19_LEnd 1088961 C TT
LG10_PM_map_19_LEnd 1089024 G AA
LG10_PM_map_19_LEnd 1108301 C TT CC
LG10_PM_map_19_LEnd 11365128 G AA
LG10_PM_map_19_LEnd 11365170 T CC
LG10_PM_map_19_LEnd 11381744 A GG
LG10_PM_map_19_LEnd 11381772 T TT
LG10_PM_map_19_LEnd 11385851 A AA
LG10_PM_map_19_LEnd 11386265 A AA
LG10_PM_map_19_LEnd 1138663 T AA TT
---------- Post updated at 02:27 AM ---------- Previous update was at 01:24 AM ----------
Hi all,
I figured it out with awk:
awk '{ while (++i <= NF) printf "%s", (!a[$i]++ ? $i FS : ""); i = split("", a); print "" }' input > output
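For anyone who finds the one-liner hard to read, here is a spelled-out sketch of the same idea: remember which values have already appeared on the current line, print only first occurrences, and clear the memory before the next line. The function name dedup_fields and the array name seen are my own, chosen for illustration. This variant builds the line in a variable so that no trailing separator is left at the end; treat it as one possible way to write it, not the only one.

```shell
# dedup_fields (hypothetical helper name): print each input line with repeated
# whitespace-separated fields removed, keeping first occurrences in order.
# Reads the files given as arguments, or standard input if there are none.
dedup_fields() {
    awk '{
        out = ""
        for (i = 1; i <= NF; i++) {
            if (!seen[$i]++) {          # true only the first time a value appears on this line
                out = (out == "" ? $i : out OFS $i)
            }
        }
        print out
        split("", seen)                 # portable way to empty the array before the next line
    }' "$@"
}

# Example with the last row of the sample input:
printf '%s\n' 'LG10_PM_map_19_LEnd 1138663 T AA TT AA AA TT TT' | dedup_fields
# prints: LG10_PM_map_19_LEnd 1138663 T AA TT
```

Whole fields are compared, so G and GG count as different values, which matches the expected output above. For a file, it would be called as: dedup_fields input > output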