Hi all
I have a big file arranged in rows and columns. From the 2nd column onwards, each column is followed by its description: the 3rd column is the description of the 2nd column, the 5th column is the description of the 4th column, and so on.
All columns are separated by commas:
CHST3,docetaxel,xyznox,tyurppw,notavailble,docetaxel,xyznox,jfhdkg,notavailable
ESRT4,ghtscjgh,notavailable,Ghjfuti,notavailable,manhfd, kdcvgh,Ghjfuti,not available,manhfd, kdcvgh
I want to remove duplicates. The problem is that it should check whether the entry in column n equals the entry in column n+2; if so, columns n+2 and n+3 should be removed, otherwise they should be kept.
So the expected output is:
CHST3,docetaxel,xyznox,tyurppw,notavailble,jfhdkg,notavailable
ESRT4,ghtscjgh,notavailable,Ghjfuti,notavailable,manhfd,kdcvgh
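
One possible way to get this output is sketched below in Python. It rests on a few assumptions that are not stated in the post: the first column is an ID that is always kept, the remaining columns come in (name, description) pairs, a pair is dropped when its name has already appeared anywhere earlier in the same row (which is what the expected output suggests, rather than only comparing column n with n+2), and stray spaces around entries are stripped. The function names dedupe_pairs and dedupe_line are made up for illustration, and the plain split on commas assumes no entry contains a quoted comma.

```python
def dedupe_pairs(fields):
    """Keep the ID column, then every (name, description) pair whose name is new."""
    row_id = fields[0].strip()
    # Assumption: whitespace around entries is not significant.
    rest = [f.strip() for f in fields[1:]]
    seen = set()
    out = [row_id]
    # Walk the remaining columns two at a time: a name, then its description.
    for i in range(0, len(rest), 2):
        name = rest[i]
        if name in seen:
            continue  # name already kept earlier in this row: drop the pair
        seen.add(name)
        out.extend(rest[i:i + 2])
    return out


def dedupe_line(line):
    """Apply dedupe_pairs to one comma-separated line of text."""
    return ",".join(dedupe_pairs(line.rstrip("\n").split(",")))
```

To process a whole file, each line can be passed through dedupe_line and the result written out, one row at a time, so the file never has to fit in memory.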