Thanks, vgersh, for your post. It's working fine, but it is not displaying the new column headings, which are the values from the second column of the input file.
I'd be grateful if you could explain this part of the code: " a[$1]=($1 in a)?a[$1] OFS $3:$3 ". Thanks again.
Adjust 'w=15' (the width of each column) to your liking/needs.
nawk -v w=15 '{a[$1]=($1 in a)?a[$1] OFS sprintf("%-*s",w,$3):sprintf("%-*s",w,$3);h[$2]}END{for(i in h) printf("\t%-*s", w,i);print "";for(i in a) print i,a[i]}' OFS='\t' myFile
This checks whether the first field ($1) is already present as a key in array 'a'; if yes, it appends the third column to the stored value; if not, it assigns the third column.
It's a ternary operator, just like in C, combined with assignment:
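Spelled out as a plain if/else, the ternary assignment does roughly the following. This is only a sketch: the function name collect_by_key is made up for illustration, the third sample line is invented, and it uses plain awk with -v OFS instead of nawk with a trailing OFS assignment.

```shell
# Hypothetical helper: same logic as a[$1]=($1 in a)?a[$1] OFS $3:$3
collect_by_key() {
    awk -v OFS='\t' '
    {
        if ($1 in a)                 # key seen before: append column 3
            a[$1] = a[$1] OFS $3
        else                         # first time this key shows up
            a[$1] = $3
    }
    END { for (k in a) print k, a[k] }   # for-in order is unspecified
    ' "$@"
}

# Sample input (partly made up); sort is only there to get a stable order
printf '%s\n' 'Abc-01 pc1 -0.69' 'Abc-01 E2cR 0.459666666666667' 'Xyz-01 pc1 1.5' |
    collect_by_key | sort
```

The one-line ternary and this if/else should behave the same; the ternary is just more compact.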
Thanks a lot, mirni, for such a nice explanation. If I have understood correctly, it first checks whether $1 is there or not, as a way of checking when it reaches the end of the file. But I am not clear on the purpose of assigning the third column to $1.
It's also not clear to me which part of the code actually checks when the value of $1 changes.
I'd be really thankful if you or vgersh could explain this.
This is a filter, so it processes one line at a time. Let's take the first line of input:
Abc-01 pc1 -0.69
$1 is not in array 'a' yet, so we make an assignment:
a[$1]=$3; i.e. a["Abc-01"]=-0.69
Next line of input:
Abc-01 E2cR 0.459666666666667
Is $1 in array a? Yes, we do have a["Abc-01"] defined. Append 3rd column. Now we have
a["Abc-01"]='-0.69\t0.459666666666667'
The next line of input will make a["Abc-01"] grow into '-0.69\t0.459666666666667\t1.2265625', which is the desired output.
Next line will set array element a["Xyz-01"], etc.
You are constructing array 'a' so that its elements are your desired output; each array element stores one line of output.
After all lines have been processed, array 'a' is populated and you are ready to print it; a simple loop will print what's stored in 'a' (although 'for (i in a)' visits the keys in an unspecified order, not necessarily input order!).