Hello,
I have a tab delim file that looks like this
CHROM POS ID REF ALT ID HGVS_C HGVS_P
1 17319011 rs2076603 G A NM_022089.3,NM_001141973.2,NM_001141974.2 c.1815C>T,c.1800C>T,c.1800C>T p.Pro605Pro,p.Pro600Pro,p.Pro600Pro
1 20960230 rs45530340 C T NM_032409.2,NR_106732.1 c.189C>T,n.59C>T p.Leu63Leu,.
1 20964328 rs2298298 A G NM_032409.2,NR_106732.1,NR_046507.1 c.388-7A>G,n.*4047A>G,n.*4822T>C .,.,.
1 20972048 rs3131713 G A NM_032409.2,NR_046507.1 c.960-5G>A,n.3981+30C>T .,.
1 43395635 rs2229682 C T NM_006516.2 c.588G>A p.Pro196Pro
1 43396414 rs11537641 G A NM_006516.2 c.399C>T p.Cys133Cys
1 43408966 rs1385129 G A NM_006516.2 c.45C>T p.Ala15Ala
I need the output file to look like this where ID column is merged with the last two. The columns do not have a consistent number of values in it, sometimes its just a single value, sometimes many and sometimes it's a blank.
CHROM POS ID REF ALT ID:HGVS_C ID:HGVS_P
1 17319011 rs2076603 G A NM_022089.3:c.1815C>T,NM_001141973.2:c.1800C>T,NM_001141974.2:c.1800C>T NM_022089.3:p.Pro605Pro,NM_001141973.2:p.Pro600Pro,NM_001141974.2:p.Pro600Pro
1 20960230 rs45530340 C T NM_032409.2:c.189C>T,NR_106732.1:n.59C>T NM_032409.2:p.Leu63Leu,.
1 20964328 rs2298298 A G NM_032409.2:c.388-7A>G,NR_106732.1:n.*4047A>G,NR_046507.1:n.*4822T>C
1 20972048 rs3131713 G A NM_032409.2:c.960-5G>A,NR_046507.1:n.3981+30C>T
1 43395635 rs2229682 C T NM_006516.2:c.588G>A NM_006516.2:p.Pro196Pro
1 43396414 rs11537641 G A NM_006516.2:c.399C>T NM_006516.2:p.Cys133Cys
1 43408966 rs1385129 G A NM_006516.2:c.45C>T NM_006516.2:p.Ala15Ala
I tried this
awk '{print $1"\t"$2"\t"$3"\t"$4"\t"$5"\t"$6":"$7"\t"$6":"$8}' input.txt > output.txt
It works for single values (ie last 3-4 rows) but I am not able to merge multiple values together (first few rows).
Any help/suggestions are appreciated.
thank you