AWK novice - calculate the average

Hi,
I have the following data in a file for example:

P1 XXXXXXX.1 YYYYYYY.1 ZZZ.1
P1 XXXXXXX.2 YYYYYYY.2 ZZZ.2
P1 XXXXXXX.3 YYYYYYY.3 ZZZ.3
P1 XXXXXXX.4 YYYYYYY.4 ZZZ.4
P1 XXXXXXX.5 YYYYYYY.5 ZZZ.5
P1 XXXXXXX.6 YYYYYYY.6 ZZZ.6
P1 XXXXXXX.7 YYYYYYY.7 ZZZ.7
P1 XXXXXXX.8 YYYYYYY.8 ZZZ.8
P2 XXXXXXX.1 YYYYYYY.8 ZZZ.1
P2 XXXXXXX.2 YYYYYYY.7 ZZZ.2
P2 XXXXXXX.3 YYYYYYY.2 ZZZ.3
P2 XXXXXXX.4 YYYYYYY.9 ZZZ.4
P2 XXXXXXX.5 YYYYYYY.0 ZZZ.5
P2 XXXXXXX.6 YYYYYYY.7 ZZZ.6
P2 XXXXXXX.7 YYYYYYY.2 ZZZ.7
P2 XXXXXXX.8 YYYYYYY.9 ZZZ.8
P3 XXXXXXX.1 YYYYYYY.8 ZZZ.1
P3 XXXXXXX.2 YYYYYYY.7 ZZZ.2
P3 XXXXXXX.3 YYYYYYY.2 ZZZ.3
P3 XXXXXXX.4 YYYYYYY.9 ZZZ.4
P3 XXXXXXX.5 YYYYYYY.0 ZZZ.5
P4 XXXXXXX.1 YYYYYYY.8 ZZZ.1
P4 XXXXXXX.2 YYYYYYY.7 ZZZ.2
P4 XXXXXXX.3 YYYYYYY.2 ZZZ.3
P4 XXXXXXX.4 YYYYYYY.9 ZZZ.4
P4 XXXXXXX.5 YYYYYYY.0 ZZZ.5
P4 XXXXXXX.6 YYYYYYY.7 ZZZ.6
P4 XXXXXXX.7 YYYYYYY.2 ZZZ.7
P4 XXXXXXX.8 YYYYYYY.9 ZZZ.8

Can you help me with an awk script in order to be able to calculate the coordinate's average for each point? Each point has maximum 8 measurements but can have less. Also the points can be duplicated and some coordinates might have zero values.
If it's possible I would like on the final results to know beside the X Y and Z average, the number of measurements used and if the zero values were used in calculating the average.

Thanks

Post your desire output.

1 Like

The output should be

P1 XXXXXXX.Average YYYYYYY.Average ZZZ.Average "number of measurements for P1" "number of measurements with zero value for P1" 
P2 XXXXXXX.Average YYYYYYY.Average ZZZ.Average "number of measurements for P2" "number of measurements with zero value for P2" 

Many Thanks

Try this,

 awk -F"[. ]" 'NR>1 && $1 != p {print p,s"."a/k,x"."b/k,y"."c/k,"number of measurements for",p,"-",k," and number of measurements with zero value for",p," - " z; k=0;a=0;b=0;c=0;z=0}
{if (/\.0/) {++z};a+=$3;b+=$5;c+=$7;++k;p=$1;s=$2;x=$4;y=$6} END {print p,s"."a/k,x"."b/k,y"."c/k,"number of measurements for",p,"-",k," and number of measurements with zero value for",p," - " z}' inputfile
1 Like
awk '{n[$1]++;a[$1]+=$2;b[$1]+=$3;c[$1]+=$4;for (i=2;i<=4;i++){if ($i==0)z[$1]++}}END{for (i in a){print i,a/n,b/n,c/n,"num of 0 val for "i":",(z[$1])?z:0}}' file
1 Like

I can't get result from your command:

awk '{n[$1]++;a[$1]+=$2;b[$1]+=$3;c[$1]+=$4;for (i=2;i<=4;i++){if ($i==0)z[$1]++}}END{for (i in a){print i,a/n,b/n,c/n,"num of 0 val for "i":",(z[$1])?z:0}}' infile
P2 0 0 0 num of 0 val for P2: 0
P3 0 0 0 num of 0 val for P3: 0
P4 0 0 0 num of 0 val for P4: 0
P1 0 0 0 num of 0 val for P1: 0

I update from your code, it should be right now.

awk -F"[ .]" '
 {n[$1]++;a[$1]+=$3;b[$1]+=$5;c[$1]+=$7;for (i=3;i<=7;i=i+2){if ($i==0)z[$1]++}}
 END{for (i in a){print i,$2,a/n,$4,b/n,$6,c/n,n,(z)?z:"0" |"sort -n"}}
 ' infile

P1 XXXXXXX 4.5 YYYYYYY 4.5 ZZZ 4.5 8 0
P2 XXXXXXX 4.5 YYYYYYY 5.5 ZZZ 4.5 8 1
P3 XXXXXXX 3 YYYYYYY 5.2 ZZZ 3 5 1
P4 XXXXXXX 4.5 YYYYYYY 5.5 ZZZ 4.5 8 1
1 Like

Thank you all for all the info's sent.