Hellow,
I have a tab-delimited file with 3 columns :
BINPACKER.13259.1.p2 SSF48239
BINPACKER.13259.1.p2 PF13243
BINPACKER.13259.1.p2 G3DSA:1.50.10.20
BINPACKER.13259.2.p2 SSF48239
BINPACKER.13259.2.p2 PF13243
BINPACKER.13259.2.p2 G3DSA:1.50.10.20
BINPACKER.31705.4.p1 PF00176 GO:0005524
BINPACKER.31705.4.p1 SM00490
BINPACKER.31705.4.p1 SSF52540
BINPACKER.31705.4.p1 G3DSA:3.40.50.300
BINPACKER.31705.4.p1 mobidb-lite
BINPACKER.31705.4.p1 SM00487
BINPACKER.31705.4.p1 PS51194
BINPACKER.31705.4.p1 cd00079
BINPACKER.31705.4.p1 PF00271
BINPACKER.31705.4.p1 PS51192
BINPACKER.31705.4.p1 cd00046
BINPACKER.31705.4.p1 G3DSA:3.40.50.10810
BINPACKER.31705.4.p1 SSF52540
BINPACKER.9719.7.p1 PF00443 GO:0016579|GO:0036459
BINPACKER.9719.7.p1 SSF57850
BINPACKER.9719.7.p1 PS50235
BINPACKER.9719.7.p1 mobidb-lite
BINPACKER.9719.7.p1 PF02148 GO:0008270
BINPACKER.9719.7.p1 SSF54001
BINPACKER.9719.7.p1 mobidb-lite
BINPACKER.9719.7.p1 cd02669 GO:0000245|GO:0006397
BINPACKER.9719.7.p1 PS50271 GO:0008270
BINPACKER.9719.7.p1 SM00290 GO:0008270
BINPACKER.9719.7.p1 mobidb-lite
BINPACKER.9719.7.p1 mobidb-lite
BINPACKER.9719.7.p1 G3DSA:3.30.40.10
BINPACKER.9719.7.p1 G3DSA:3.90.70.10
BINPACKER.937.4.p1 PS51032 GO:0003700|GO:0006355
BINPACKER.937.4.p1 PIRSF038123 GO:0003700
BINPACKER.937.4.p1 cd00018 GO:0003700|GO:0006355
BINPACKER.937.4.p1 SSF54171 GO:0003677
BINPACKER.937.4.p1 G3DSA:3.30.730.10 GO:0003700|GO:0006355
BINPACKER.937.4.p1 PR00367 GO:0003700|GO:0006355
I want to mege the rows based on first column with same ID. In column 2, I want only ID starting with PF and in 3rd column, want to concatenate all GO term seperated with comma. in each case there should be no duplicate eg:
BINPACKER.13259.1.p2 PF13243 NA
BINPACKER.13259.2.p2 PF13243 NA
BINPACKER.31705.4.p1 PF00176,PF00271 GO:0005524
BINPACKER.9719.7.p1 PF00443,PF02148 GO:0016579,GO:0036459,GO:0008270,GO:0000245,GO:0006397
BINPACKER.937.4.p1 NA GO:0003700,GO:0006355,GO:0003677
Thankyou