Hi all,
I have the following kind of input file:
ESR1 PA156 leflunomide PA450192 leflunomide
CHST3 PA26503 docetaxel Pa4586; thalidomide Pa34958; decetaxel docetaxel docetaxel
I want to remove the duplicates and split each line into columns: anything before or after a PAxxxx entry, and anything separated by a ; sign, should go into its own column, so that I get data in columns like this:
ESR1 PA156 leflunomide PA450192
CHST3 PA26503 docetaxel Pa4586 thalidomide Pa34958
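
One possible way to do this is sketched below in Python. It is only a minimal sketch under a few assumptions that are not stated above: columns are separated by whitespace or by a ; sign, a duplicate means the same value appearing again on the same line, and duplicates are compared case-insensitively. The function name clean_line is made up for illustration. Because the comparison is an exact match, a value spelled differently from an earlier one (such as "decetaxel" in the second sample line) is kept as a separate column.

```python
import re


def clean_line(line):
    """Split one line into columns and drop values already seen on that line."""
    # Assumption: ';' and any run of whitespace both act as column separators.
    tokens = [t for t in re.split(r"[;\s]+", line.strip()) if t]
    seen = set()
    columns = []
    for tok in tokens:
        key = tok.lower()  # assumption: duplicates are matched case-insensitively
        if key not in seen:
            seen.add(key)
            columns.append(tok)  # keep the first spelling, in original order
    return columns


sample = [
    "ESR1 PA156 leflunomide PA450192 leflunomide",
    "CHST3 PA26503 docetaxel Pa4586; thalidomide Pa34958; decetaxel docetaxel docetaxel",
]

for line in sample:
    # Tab-separated output, one column per remaining value.
    print("\t".join(clean_line(line)))
```

To run it on a real file, the sample list could be replaced by a loop over the lines of the opened file. Catching near-duplicates such as misspellings would need a separate fuzzy-matching step, which this sketch does not attempt.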