Calculate the average at each position of the contents of a huge file

My input:
>AAA_100
10 20 50 60 10 100 15 10
>AAA_100
20 20 50 60 20 100 15 10
>AAA_100
10 20 50 60 40 100 15 10
>AAA_100
40 20 50 60 10 100 15 10
.
.
.
My output:
20 20 50 60 20 100 15 10

Given a long file like this, I want to calculate the average of each position inside the contents, one position at a time.
Does anybody have any idea how to do it?
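To make the expected result concrete, here is a small self-contained run of the idea on the sample above: sum each column over the numeric lines, then divide by the number of lines. The file name `input.txt` and this particular one-liner are mine, not from any of the replies:

```shell
# Recreate the sample data from the question (file name is my choice).
cat > input.txt <<'EOF'
>AAA_100
10 20 50 60 10 100 15 10
>AAA_100
20 20 50 60 20 100 15 10
>AAA_100
10 20 50 60 40 100 15 10
>AAA_100
40 20 50 60 10 100 15 10
EOF

# Sum each column of the numeric lines into a[i], count the lines,
# then print each column sum divided by the line count.
awk '/^[0-9]/ { for (i = 1; i <= NF; i++) a[i] += $i; c++ }
     END      { for (i = 1; i <= NF; i++) printf "%s%s", a[i]/c, (i < NF ? " " : "\n") }' input.txt
# prints: 20 20 50 60 20 100 15 10
```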
Thanks for all of the suggestions.

Put this in a file called avg.awk (the name is not important):

BEGIN{
        count = 0
}

/AAA/{
        # Read the data line that follows the ">AAA" header
        getline
        for(i=1; i<=8; i++)
                a[i] += $i
        count++
}

END{
        for(i=1; i<=8; i++)
                b[i] = a[i]/count
        print b[1], b[2], b[3], b[4], b[5], b[6], b[7], b[8]
}

Then use this command line:

awk -f avg.awk input_file

where input_file is the path to the data file...

awk '/^[0-9]/{for(i=0;++i<=NF;) a[i]=a[i]?(a[i]+$i):$i;c++}END{for(i=0;++i<=NF;) printf ("%s%s",(i==1)?"":FS,a[i]/c)}'  file

Will work for any number of fields.

Thanks a lot, danmero.
Your code works well for my huge file.
It is fantastic!


Hi danmero,
Can I ask you one more question about your code?
Am I right that your code doesn't work if the last line of the file is empty?
I only found this problem when I tried it on a file whose last line is empty.
Thanks again.

Another awk solution:

awk '!(NR%2){a="";for(i=1;i<=NF;i++){t[i]+=$i; a=a FS t[i]/(NR/2)}}END{print substr(a,2)}' file

Correct. Inside END, NF keeps the field count of the last record read, so an empty last line leaves NF at 0 and the loop prints nothing. For a varying number of fields or empty lines, the following code should work:

awk '/^[0-9]/{for(i=0;++i<=NF;) a[i]=a[i]?(a[i]+$i):$i;c++;max=(max>NF)?max:NF}END{for(i=0;++i<=max;) printf ("%s%s",(i==1)?"":FS,a[i]/max... 
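To see the empty-last-line problem in action, here is a small reproduction of my own (the file name `tail_empty.txt` is mine; the NF-in-END behaviour is what POSIX awk specifies):

```shell
# Two data lines followed by an empty line (the case reported above).
printf '10 20\n30 40\n\n' > tail_empty.txt

# In END, NF holds the field count of the LAST record read; the empty
# final line sets it to 0, so this loop never runs and nothing prints:
awk '/^[0-9]/{for(i=0;++i<=NF;) a[i]=a[i]?(a[i]+$i):$i;c++}END{for(i=0;++i<=NF;) printf ("%s%s",(i==1)?"":FS,a[i]/c)}' tail_empty.txt

# Remembering the maximum field count avoids the problem:
awk '/^[0-9]/{for(i=0;++i<=NF;) a[i]=a[i]?(a[i]+$i):$i;c++;max=(max>NF)?max:NF}END{for(i=0;++i<=max;) printf ("%s%s",(i==1)?"":FS,a[i]/c)}' tail_empty.txt
# prints: 20 30
```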
sed '/^>/d' a.txt | awk '{
        for(i=1;i<=NF;i++)
                _[i]+=$i
        }
        END{
        for(i=1;i<=NF;i++)
                printf("%s ",_[i]/NR)
        }'

summer_cherry