How to find recurring entries and print their values on one line

Hi all,

I have a file with data like this:

rs4332761    15XB
rs4332761    unk
rs4571228    15XB
rs457263    5XB
rs4606515    10XA
rs4606515    10XB
rs4606515    15XB

I want output like this:

rs4332761    15XB,unk
rs4571228    15XB
rs457263    5XB
rs4606515    10XA,10XB,15XB

I tried simple regex options in Notepad++ and other editors, but could not get exactly this result. If anyone can suggest a small script or piece of code, it would be much appreciated.

Thanks.

See if this AWK helps

awk '{ a[$1] = a[$1] FS $2 } END { for (i in a) print i a[i] }' input

Thank you very much Peasant.
It worked nicely. I just needed to replace the space with a comma in the output.
Thanks.
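
For anyone finding this thread later: the comma replacement can probably be folded into the awk step itself. Below is a minimal sketch, not the code posted above. It assumes whitespace-separated columns with the key in column 1 and the value in column 2, and it keeps keys in the order they first appear, since `for (i in a)` in awk does not guarantee any particular order. The function name `join_by_key` and the array names are made up for illustration.

```shell
# join_by_key FILE (hypothetical helper name, not from the thread)
# Prints each distinct column-1 key once, followed by its column-2
# values joined with commas, in order of first appearance.
join_by_key() {
    awk '
        # Skip blank or incomplete lines.
        NF < 2 { next }
        # First time a key is seen: record its position and start its list.
        !($1 in vals) { order[++n] = $1; vals[$1] = $2; next }
        # Key seen before: append the new value after a comma.
        { vals[$1] = vals[$1] "," $2 }
        # Print keys in first-seen order, using the four-space gap from the sample.
        END { for (k = 1; k <= n; k++) print order[k] "    " vals[order[k]] }
    ' "$1"
}
```

Called as `join_by_key input`, where `input` is the data file from the question, this should give the comma-separated layout asked for in the first post.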