Transpose Second column only

Hi Folks,

My input file is like this

cat input

abcd:efgh:jklm 123,456,67,78,89,90
hi:kil:op 76,78,12,3456,
unix:linux:shell:bash 111,111

My expected output

abcd:efgh:jklm hi:kil:op unix:linux:shell:bash 
123 76 111
456 78 111
67 12
78 3456
89
90

My trial

awk '{ for (i = 1; i <= NF; i++) f = f " " $i ; if (NF > n) n = NF } END { for (i = 1; i <= n; i++) sub(/^  */, "", f) ;for (i = 1; i <= n; i++) print f }' input

Try

awk     'NF>n           {n=NF}
                        {for (i=1;i<=NF;i++) TMP[NR,i]=$i}
         END            {for (i=1;i<=n;i++)
                           {for (j=1; j<=NR; j++) printf "%s ", TMP [j,i]
                            printf "\n"
                           }
                        }
        ' FS="[ ,]" file
abcd:efgh:jklm hi:kil:op unix:linux:shell:bash 
123 76 111 
456 78 111 
67 12  
78 3456  
89   
90   
1 Like

Thanks Rudic. This code works perfect on the sample input.

But worked partially on my actual file.

The records are not being published to the output properly. I see varying number of records under the columns. For ex: instead of 2 rows under the third column in my sample output, I see only two.

Any pointers?

Give us an accurate input file.

Also, if you have a problem it would be very helpful to:

Post the input lines that are not giving a correct output.
Post the output that it is not correct.

Here is the actual input files first two lines

GO:0044699:single-organism_process:1.94262300468481e-19:7.12366338351597e-23:437:635:12657:25412:0.00%:0.00	DAF2,5830411N06RIK,TBX21,C1QL3,PFN2,GNAQ,EBI3,LRRK1,GPR133,ELAVL3,TNFRSF8,DTL,SRC,TUB,MEGF8,ALOX8,PRKAR1B,CYBB,DYNLT3,CXCR3,TTN,MTAP2,PDE3B,SRXN1,CYP2S1,CXCL2,ADAP1,ADSSL1,ST3GAL6,SGK3,FUCA2,MAGED1,PON3,DAB2IP,RAP1GAP,ARAP3,SLC22A15,MBOAT1,GUCY1B3,IRF5,PALLD,SLC5A5,ATP6V0D2,ACOT7,FGD6,FGR,SEMA6D,MYO6,COLQ,GAS2L1,IL17RB,ITGB3,CCL5,BHLHE40,HMGN3,BCAT1,NAPRT1,SAMSN1,SLC43A3,NELF,APP,TMEM38B,HPN,DST,DPT,TGFBR3,ALS2CL,TOP2A,CHSY1,SMPDL3B,PCYT1B,ACVRL1,PKDCC,CACNA1I,CYP4V3,LIMA1,CELA1,BCL2A1D,SLC12A8,KLF2,CD81,CC2D2A,ICA1,DUS2L,TRIM2,MGMT,SOSTDC1,GJB2,BCL2L2,MAN1C1,RGL1,GPM6B,GPR132,PDLIM7,IGFBP4,INPPL1,LTB4R1,MYO1D,ARHGAP29,CSDA,LEF1,RHOD,PRR5L,TNFSF14,CCDC40,PRDM5,AMPD1,NLRC4,CD9,ITGA7,DECR1,MRVI1,H2-OA,REPIN1,ANXA4,TNFRSF1B,GSTT3,ITGB5,AREG,NPAS4,CNGA1,SELENBP1,EPB4.1L1,ACPP,MYO1E,SLC25A13,MAPK8IP1,CD160,CD55,PRF1,ST8SIA1,GUCY2E,NAIP1,KLRG1,PLCB4,FAM101B,DDR1,SYNPO,SEMA4A,RIAN,GGT7,NAIP2,ITGAE,PIK3AP1,STXBP1,BSPRY,MIAT,NR1D1,KIF11,TNFSF10,HSPB1,CYFIP1,HSF4,GUCY1A3,LITAF,WISP1,ANXA2,OAS2,NLRP1A,ADORA2A,RENBP,GPR35,SEC16B,SATB1,SOCS5,TIAM1,PELI2,NTNG2,PLSCR1,CD80,PLXNB2,CAPN3,LAMC1,TXNIP,FNDC3B,CEP290,AMIGO2,ACOT11,SEMA7A,TRIB3,ALCAM,SCIN,PENK,KCNMB4,CASP1,RORC,REC8,ACTN1,IFT57,KCNA2,ITGB8,RORA,TIGIT,NEO1,OSBPL3,CASP3,FIGNL1,PTPRJ,AKR1E1,CACNA1D,MELK,DGKG,PNKD,TUBB2A,BCL2A1B,CLIP3,DCXR,ATP1A3,CABLES1,TXK,IL18RAP,ZAN,VIPR1,SHANK1,ILDR1,BATF,ATCAY,IKZF3,CTNND1,BMP7,NR3C2,GPR56,SRGAP3,XDH,SIGLEC15,SEMA4C,TNFRSF9,NDRG1,NHEDC2,AIFM2,PLEKHA8,HDAC9,DNASE1L3,AVPR2,TMEM231,PYGM,4930506M07RIK,CHST11,MKI67,ENO2,TMEM132A,PLOD2,RAPGEF4,PDCD1,CNN3,NRP1,PMAIP1,SLC24A3,CXCL10,FAM109B,TEX15,PTPN3,AXL,PPAP2A,CYP2D22,DSCAM,TMEM2,EXPH5,PLEK,NCF1,PVRL1,TSHZ2,RAPGEF5,KIF5C,GPR55,TMEM67,GNA14,ZCCHC18,PEX11A,TNFSF11,LAG3,PDE4A,ALDOC,FBXW8,PLAGL1,SPNB1,PCTP,ST14,NOXA1,SLC15A3,JDP2,ST3GAL5,B3GALT5,PCGF2,GDPD5,PLAUR,SLC30A4,HOXB1,RAPH1,PLXNA3,RASSF4,PEAR1,CD8A,SLC39A4,RGMB,BICD1,HIF3A,ITGB1,RYK,UGCG,SCAMP1,PLA2G4F,PTAFR,ATP1B1,IL1RL1,FAM20A,SERPINC1,ADCK3,GCNT1,EEA1,CCRL2,KIF13A,IL13RA1,EPCAM,MID2,WEE1,RNF128,DUSP16,ACSBG1,LTBP4,KCNK6,WDR35,IFT122,SLC9A7,MMP25,PDCD1LG2,RASGEF1A,FABP5,CASP4,CD22,LAMA5,HAP1,DAPL1,PADI3,ALS2CR4,HPSE,SPOCK2,SIX5,CD38,ABCB1A,HIVEP3,NAIP7,RAB39B,NRN1,PRNP,DOCK4,RGS16,TMEM176A,DGAT2,CD79B,BC026585,NACC2,IFT81,GRHL1,CACNA1S,PLCL1,DAPK1,IFNG,ARHGEF5,DIAP3,HGF,PLXNB1,SCCPDH,PGLYRP1,AHR,APLP1,MAD1L1,AFP,TWSG1,LYN,WLS,NRBP2,NLGN2,TRIB2,EGR2,KLF7,MYL6B,Foxp3,SLC36A4,ATP9A,FUT10,LBP,RABL5,SLC16A3,NLRP1B,ABHD4,CCR8,CYSLTR2,BAIAP3,ENPP5,TRPM1,SWAP70,CHDH,ART3,ERGIC1,HDAC11,T2,BCL2A1A,MATN2,CSF1,SCG5,IL6RA,PASK,TBC1D4,CAMK2A,SLC2A6,C3,GGT5,PTPRS,INSRR,NRP2,F13A1,IL9R,RLN3,CD40LG,FCRL1,GRM6,TICAM2,CD83,AMOT,SERPINA3G,MAPRE3,SLC22A2,PIP5K1B,PLEKHG3,PGCP,SLC22A17,TRIM16,NPNT,AHI1,TMEM146,RAD51L1,RYR2,ICAM5,RNF43,RUNX2,ASB2
GO:0044763:single-organism_cellular_process:2.15608710343398e-19:7.90644335692695e-23:400:635:11146:25412:0.00%:0.00	5830411N06RIK,TBX21,C1QL3,PFN2,GNAQ,EBI3,LRRK1,GPR133,ELAVL3,TNFRSF8,DTL,SRC,TUB,MEGF8,ALOX8,CYBB,DYNLT3,CXCR3,TTN,MTAP2,PDE3B,CXCL2,ADAP1,ADSSL1,ST3GAL6,SGK3,MAGED1,PON3,DAB2IP,RAP1GAP,ARAP3,SLC22A15,MBOAT1,GUCY1B3,IRF5,PALLD,SLC5A5,ATP6V0D2,ACOT7,FGD6,FGR,SEMA6D,MYO6,COLQ,GAS2L1,IL17RB,ITGB3,CCL5,HMGN3,BCAT1,NAPRT1,SAMSN1,SLC43A3,NELF,APP,TMEM38B,HPN,DST,DPT,TGFBR3,ALS2CL,TOP2A,CHSY1,SMPDL3B,PCYT1B,ACVRL1,PKDCC,CACNA1I,CYP4V3,LIMA1,CELA1,BCL2A1D,SLC12A8,KLF2,CD81,CC2D2A,ICA1,DUS2L,TRIM2,MGMT,SOSTDC1,GJB2,BCL2L2,RGL1,GPM6B,GPR132,PDLIM7,IGFBP4,INPPL1,LTB4R1,MYO1D,ARHGAP29,CSDA,LEF1,RHOD,PRR5L,TNFSF14,CCDC40,PRDM5,AMPD1,NLRC4,CD9,ITGA7,DECR1,MRVI1,H2-OA,REPIN1,TNFRSF1B,GSTT3,ITGB5,AREG,NPAS4,CNGA1,SELENBP1,EPB4.1L1,ACPP,MYO1E,SLC25A13,MAPK8IP1,CD160,PRF1,ST8SIA1,GUCY2E,NAIP1,KLRG1,PLCB4,FAM101B,DDR1,SYNPO,SEMA4A,GGT7,NAIP2,ITGAE,PIK3AP1,STXBP1,MIAT,NR1D1,KIF11,TNFSF10,HSPB1,CYFIP1,HSF4,GUCY1A3,LITAF,WISP1,ANXA2,OAS2,NLRP1A,ADORA2A,RENBP,GPR35,SEC16B,SATB1,SOCS5,TIAM1,PELI2,NTNG2,PLSCR1,CD80,PLXNB2,CAPN3,LAMC1,TXNIP,FNDC3B,CEP290,AMIGO2,ACOT11,SEMA7A,TRIB3,ALCAM,SCIN,PENK,KCNMB4,CASP1,RORC,REC8,ACTN1,IFT57,KCNA2,ITGB8,RORA,TIGIT,NEO1,CASP3,FIGNL1,PTPRJ,CACNA1D,MELK,DGKG,PNKD,TUBB2A,BCL2A1B,CLIP3,DCXR,ATP1A3,CABLES1,TXK,IL18RAP,ZAN,VIPR1,SHANK1,ILDR1,BATF,ATCAY,IKZF3,CTNND1,BMP7,NR3C2,GPR56,SRGAP3,XDH,SIGLEC15,SEMA4C,TNFRSF9,NDRG1,NHEDC2,AIFM2,HDAC9,DNASE1L3,AVPR2,TMEM231,PYGM,4930506M07RIK,CHST11,MKI67,ENO2,TMEM132A,RAPGEF4,PDCD1,CNN3,NRP1,PMAIP1,SLC24A3,CXCL10,TEX15,PTPN3,AXL,PPAP2A,CYP2D22,DSCAM,EXPH5,PLEK,NCF1,PVRL1,RAPGEF5,KIF5C,GPR55,TMEM67,GNA14,ZCCHC18,PEX11A,TNFSF11,LAG3,PDE4A,ALDOC,FBXW8,PLAGL1,SPNB1,ST14,PCTP,JDP2,ST3GAL5,B3GALT5,PCGF2,GDPD5,PLAUR,SLC30A4,RAPH1,PLXNA3,RASSF4,PEAR1,SLC39A4,CD8A,RGMB,BICD1,HIF3A,ITGB1,RYK,UGCG,PLA2G4F,SCAMP1,PTAFR,ATP1B1,IL1RL1,ADCK3,GCNT1,EEA1,CCRL2,KIF13A,IL13RA1,EPCAM,MID2,WEE1,DUSP16,ACSBG1,LTBP4,KCNK6,WDR35,IFT122,SLC9A7,PDCD1LG2,RASGEF1A,FABP5,CASP4,CD22,LAMA5,HAP1,DAPL1,PADI3,ALS2CR4,HPSE,SPOCK2,SIX5,CD38,ABCB1A,HIVEP3,NAIP7,RAB39B,NRN1,PRNP,DOCK4,RGS16,TMEM176A,DGAT2,CD79B,NACC2,IFT81,CACNA1S,PLCL1,DAPK1,IFNG,ARHGEF5,DIAP3,HGF,PLXNB1,PGLYRP1,AHR,APLP1,MAD1L1,AFP,TWSG1,WLS,LYN,NRBP2,NLGN2,TRIB2,EGR2,KLF7,MYL6B,Foxp3,FUT10,LBP,RABL5,SLC16A3,NLRP1B,CCR8,CYSLTR2,BAIAP3,ENPP5,TRPM1,SWAP70,CHDH,ART3,HDAC11,BCL2A1A,MATN2,CSF1,SCG5,IL6RA,PASK,TBC1D4,CAMK2A,C3,GGT5,INSRR,NRP2,IL9R,RLN3,CD40LG,FCRL1,GRM6,TICAM2,CD83,AMOT,SERPINA3G,MAPRE3,SLC22A2,PIP5K1B,PLEKHG3,PGCP,SLC22A17,TRIM16,NPNT,AHI1,RAD51L1,TMEM146,RYR2,RNF43,RUNX2,ASB2

---------- Post updated at 02:07 PM ---------- Previous update was at 12:44 PM ----------

I figured out the issue.

Here is the change that cleared my issue

awk     'NF>n           {n=NF}
                        {for (i=1;i<=NF;i++) TMP[NR,i]=$i}
         END            {for (i=1;i<=n;i++)
                           {for (j=1; j<=NR; j++) printf "%s\t", TMP [j,i]
                            printf "\n"
                           }
                        }
        ' FS="[ ,]" input

The issue is that you have tabs embedded, as well, instead of only spaces

Change RudiC's code from FS="[ ,]" to FS="[ \t,]"

Perhaps an alternative

awk -F"[ \t,]" '{for (i=1; i<=NF; i++){a != "" ? a= a (" "$i):a=$i}} END {for (k in a) print a[k]}'

Check and see if any of that will help