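awk '
# Record each new site name ($1) in Ln[], keeping first-seen order.
{for (i=1; i<=LnCnt; i++) if ($1 == Ln[i]) break; if (i > LnCnt) Ln[++LnCnt]=$1}
# Record each new member ($3..$NF) in Hd[] and store the value ($2) for that site/member pair.
{for (k=3; k<=NF; k++) {for (j=1; j<=HdCnt; j++) if ($k == Hd[j]) break
if (j > HdCnt) Hd[++HdCnt]=$k
Mx[$1,$k] = $2}
}
# Print the header row, then one row per site.
END { printf "%10s", ""
for (j=1; j<=HdCnt; j++) printf "%3s", Hd[j]
printf "\n";
for (i=1; i<=LnCnt; i++) {printf "%10s", Ln[i];
for (j=1; j<=HdCnt; j++) printf "%3s", Mx[Ln[i], Hd[j]];
printf "\n"
}
}
' FS="[ :,]" file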
            o  p  q  r  s  t  y  u  v  w  x  z
     site1  A  A  A  A  A  A  C  C  T  T  -  -
     site2  -  A  G  A  G  A  C  C  A  A  -  A
     site3  A  -  A  -  A  A  C  A  T  T  T  A
gawk -F'[ :,]' '{PROCINFO["sorted_in"]="@ind_str_asc"; A[$1]=1;for (i=3;i<=NF;i++) {C[$i]=$i;B[$1,$i]=$2}}END{printf "SITE\t";for ( jj in C ) printf jj" ";print "" ; for ( ii in A ) {printf ii"\t";for ( jj in C ) {printf B[ii,jj]" "}; print ""}}' infile
Thank you both!
Amazing scripts!
disedorgue, your script is what I was thinking of using, but I need more time to digest it.
The real data is more challenging: some members (i.e. columns) of some sites have missing data, shown as "-" in my previous post, but they are not listed in the infile. Those cells end up empty, which causes confusion.
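One possible way to deal with the empty cells, as a minimal sketch rather than a tested solution for the real data: check whether a site/member pair was ever stored, and print "-" when it was not. The input layout (`site:value member,member,...`), the file name `sample.txt`, and the function name `pivot` are assumptions made for illustration; only the field separator `[ :,]` comes from the scripts above.

```shell
# Hypothetical sample input; the "site:value member,..." layout is assumed.
cat > sample.txt <<'EOF'
site1:A o,p
site1:C y
site2:G p
site2:C y
EOF

# pivot is an illustrative wrapper name, not part of the original scripts.
pivot() {
    awk -F'[ :,]' '
    {
        # Remember sites and members in first-seen order.
        if (!($1 in SeenLn)) { SeenLn[$1] = 1; Ln[++LnCnt] = $1 }
        for (k = 3; k <= NF; k++) {
            if (!($k in SeenHd)) { SeenHd[$k] = 1; Hd[++HdCnt] = $k }
            Mx[$1, $k] = $2
        }
    }
    END {
        printf "%10s", ""
        for (j = 1; j <= HdCnt; j++) printf "%3s", Hd[j]
        printf "\n"
        for (i = 1; i <= LnCnt; i++) {
            printf "%10s", Ln[i]
            for (j = 1; j <= HdCnt; j++) {
                # Use "-" when no value was recorded for this site/member pair.
                cell = ((Ln[i], Hd[j]) in Mx) ? Mx[Ln[i], Hd[j]] : "-"
                printf "%3s", cell
            }
            printf "\n"
        }
    }' "$@"
}

pivot sample.txt
```

Here site2 never lists member o, so its cell is printed as "-" instead of being left blank. The same `(ii,jj) in B` test should carry over to the gawk one-liner if the sorted output is preferred.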
yifangt,
The Thanks button is still there. Is it possible that you just didn't notice it because disedorgue's code was presented as a single long line and you didn't scroll far enough to the right to see it?
Yes, I was aware of the long single line. Something weird happened with this thread:
1) When I first posted, nothing showed up, so I re-posted and ended up with 3 duplicates. Embarrassing!
2) I looked for the Thanks button, but it was not there.
3) After I refreshed the page, everything appeared, but that was not always the case. I had thought the site was under maintenance (forgetting that the admins of the site are all experts!). It was probably something on my side, e.g. browser, cookies, etc.
Thank you anyway!