I need to read each column, get the row values from the column next to it that have same name in the first column and compute the average of their values and output in a new file. They are usually triplicates of the same name, i need to get all three of its values and compute the average of these . The output should look like this in a new file:
I print only one of the triplicates and next to it the average of the three values that each of the triplicates had.
Is it possible to be done in awk? I am trying with perl.
With PERL:
I thought of converting the columns into rows, read each line and create a hash having as KEY each Sample name and adding the value of each triplicate every time i find the same name to its VALUE. At the end I will divide the value/3 and output the key->value in the output file.
To keep the records organised and in order I will read the input file again and print key->value from the hash for each line.
Is there an easier way to do this? Because with hashes I will lose the ordering of the columns in the file and it will get a bit messy I'm afraid.
Thank you so much pamu.
It works just great with this file.
Is there any clear reason that you can see that it does not work properly with columns that have spaces in their names?
for example if i try this input:
Sample Cq Sample Cq Sample Cq Sample Cq Sample Cq
1 WB IN 23.45 1 WB IN 23.45 1 CVS IN 23.96 1 CVS IN 23.14 S1 31.37
1 WB IN 23.53 1 WB IN 23.53 1 CVS IN 23.81 1 CVS IN 23.24 S1 31.49
1 WB IP 24.55 1 WB IN 24.55 1 CVS IN 23.86 1 CVS IN 23.24 S1 31.74
1 CVS IN 23.62 1 CVS IN 23.62 1 CVS IP 22.12 1 CVS IP 38.3 S10 31.13
1 CVS IN 23.46 I CVS IN 21.74 1 CVS IP 22.24 1 CVS IP 21.4 S10 31.1
1 CVS IN 21.74 1 CVS IN 23.33 1 CVS IP 22.22 1 CVS IP 21.36 S10 31.29