AWK : Add Fields of lines with matching field

DerSeb · January 16, 2011, 12:20pm

Dear All,

I would like to add up the values of a field for all lines that match in a certain field. Then I would like to divide the sum by the number of lines that share that field, and print the result on every one of those lines.

Input:

Test1 5
Test1 10
Test2 2
Test2 5
Test2 13
Test3 4

Output:

Test1 7.5
Test1 7.5
Test2 6.667
Test2 6.667
Test2 6.667
Test3 4

Any help is much appreciated!

m.d.ludwig · January 16, 2011, 12:29pm

The awk script:

{ l[NR] = $1; n[$1]++; s[$1] += $2; }
END {
  for (i in n) { a[i] = s[i] / n[i]; }
  for (i = 1; i <= NR; i++) { print l[i], a[l[i]]; }
}

Results in:

Test1 7.5
Test1 7.5
Test2 6.66667
Test2 6.66667
Test2 6.66667
Test3 4

You can adjust the print to get the numeric precision you need.
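
As an illustration of adjusting the precision, here is a minimal sketch that swaps print for printf with a fixed three-decimal format. The wrapper name average_by_key is only a label chosen for this example, and the script assumes two whitespace-separated fields per line.

```shell
# average_by_key: hypothetical wrapper; takes the input file as its only argument
average_by_key() {
  awk '
    # remember the key of every line, and count and sum per key
    { l[NR] = $1; n[$1]++; s[$1] += $2 }
    END {
      for (i in n) a[i] = s[i] / n[i]        # mean per key
      for (i = 1; i <= NR; i++)
        printf "%s %.3f\n", l[i], a[l[i]]    # three decimals on every line
    }
  ' "$1"
}
```

With the sample input this prints 7.500 for Test1, 6.667 for Test2 and 4.000 for Test3. Because the keys are stored per line number, the input does not need to be sorted.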

pravin27 · January 16, 2011, 12:51pm

Try this,

awk '{if(! a[$1]) {a[$1]=$2;j=0;b[$1]=++j}else{a[$1]=a[$1]+$2;b[$1]=++j}} END{for (i in a) {for(l=1;l<=b[i];l++){print i,a[i]/b[i]}}}' inputfile

m.d.ludwig · January 16, 2011, 1:09pm

DerSeb -- will the input data be ordered? Or is something like:

Test1 5
Test2 13
Test2 2
Test3 4
Test1 10
Test2 5

possible?

Scrutinizer · January 16, 2011, 2:22pm

awk 'NR==FNR{A[$1]++;B[$1]+=$2;next}{$2=B[$1]/A[$1]}1' infile infile
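
Spelled out, the one-liner reads the file twice: while NR==FNR (the first read) it only collects a count and a sum per key, and on the second read it overwrites field 2 with the mean and prints the line. A sketch of the same idea with longer names follows; two_pass_average, count and sum are labels chosen for this example only.

```shell
# two_pass_average: hypothetical wrapper; the file name is passed to awk twice
two_pass_average() {
  awk '
    # first read of the file: NR and FNR are still equal
    NR == FNR { count[$1]++; sum[$1] += $2; next }
    # second read: replace the value with the mean of its key and print
    { $2 = sum[$1] / count[$1]; print }
  ' "$1" "$1"
}
```

Since the totals are complete before any line is printed, this keeps the original line order and should also work when the input is not sorted.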

DerSeb · January 16, 2011, 4:41pm

Wow, you were all really fast.

Yes, the file is sorted.

Thx all, the scripts work great!

Chubler_XL · January 16, 2011, 8:41pm

As the file is already sorted you can do it in 1 pass:

awk 'function p(){for(I=C;I;I--)print R" "T/C} $1!=R{C=T=p()} {R=$1;C++;T+=$2} END{p()}' infile
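
For readers who find the compact form hard to follow, here is a sketch of the same one-pass idea with longer names. It keeps a running count and total for the current key and prints the group whenever the key changes. The names one_pass_average, emit_group, key, count and total are labels chosen for this example, and the approach relies on the input being sorted by the first field.

```shell
# one_pass_average: hypothetical wrapper; expects input sorted by field 1
one_pass_average() {
  awk '
    # print the mean once for every line of the group just finished
    function emit_group(   i) {
      for (i = count; i > 0; i--) print key, total / count
    }
    # key changed: flush the previous group and reset the accumulators
    $1 != key { emit_group(); count = 0; total = 0 }
    # every line: record the key and add to the running count and total
    { key = $1; count++; total += $2 }
    # end of input: flush the last group
    END { emit_group() }
  ' "$1"
}
```

On the first line the count is still zero, so the initial flush prints nothing. Only one group is held in memory at a time, which is the main advantage over the array-based versions.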