After six and a half years as a member and with more than 130 posts, some of them on average handling/calculation, one would presume you have at least some idea of an approach. So - any attempts from your side?
Using the example that RudiC linked to - I know how to get the average of fields 3 and 4, but I'm not sure how to include field 14 in the calculation. The script below is not clean but appears to work.
awk '$2==2016 {SUM[$1]+=$14; CNT[$1]++} $2==2017 {SUM[$1]+=$3+$4; CNT[$1]+=2} END {for (s in SUM) print s, SUM[s]/CNT[s]}'
works for a 3 value computation ((2016 field 14 + 2017 field 3 + field 4)/3)
Output
001001 42.07
but once you add more values, the calculation is not correct.
For instance
awk '$2>=2015 {SUM[$1]+=$14; CNT[$1]++} $2<=2018 {SUM[$1]+=$3+$4; CNT[$1]+=2} END {for (s in SUM) print s, SUM[s]/CNT[s]}'
Output
001001 35.2234
I could not replicate the end result in Excel. Perhaps it's the ordering/sorting? There are additional lines above 2015, so that could be an issue. They should not be ignored.
How can that be "not correct"? You didn't specify what to do for field 2 values other than 2016 and 2017, so "ignore" was assumed. With your NEW sample data, the proposal given yields
001001 42.07
exactly what was requested.
With your modified code, several fields will be counted more than once, falsifying the average.
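To illustrate the double counting: awk tests every pattern-action pair against every record, so a row whose field 2 lies between 2015 and 2018 satisfies both `$2>=2015` and `$2<=2018` and is accumulated twice. A minimal sketch, assuming made-up three-field rows (id, year, value) rather than the real data, that simply counts how many patterns fire per year:

```shell
# Hypothetical rows (id, year, value); not the data from the thread.
out=$(printf '%s\n' '001001 2014 10' '001001 2016 20' '001001 2019 30' |
awk '
  $2 >= 2015 { hits[$2]++ }   # first pattern-action pair
  $2 <= 2018 { hits[$2]++ }   # second pair, tested independently of the first
  END { for (y = 2014; y <= 2019; y++) if (y in hits) print y, hits[y] }
')
printf '%s\n' "$out"
```

The 2016 row is hit twice, while the rows outside the range are each still hit once, which is why rows before 2015 and after 2018 also leak into the sums. Joining both bounds in a single pattern with `&&` would make each row match at most once.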
You are correct. The data is sorted by field 1, then field 2.
Can the code be modified to work across multiple fields, for instance for values between 2015 and 2018? Much like the script you helped me with here. It seems like you would just divide by the number of fields examined.
For simplicity's sake, I originally wanted to average the output for 2016 and 2017, i.e. (40.15 + 42.04 + 44.02)/3.
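The three-value average can be checked with the two-pattern approach from earlier in the thread. This is only a sketch: the two rows are made up, and placing 40.15 in field 14 of the 2016 row and 42.04 and 44.02 in fields 3 and 4 of the 2017 row is an assumption, since the thread does not say which value sits where.

```shell
# Hypothetical 14-field rows; only fields 1, 2, 3, 4 and 14 matter here.
out=$(printf '%s\n' \
  '001001 2016 0 0 0 0 0 0 0 0 0 0 0 40.15' \
  '001001 2017 42.04 44.02 0 0 0 0 0 0 0 0 0 0' |
awk '
  $2 == 2016 { SUM[$1] += $14;     CNT[$1]++ }     # one value from the 2016 row
  $2 == 2017 { SUM[$1] += $3 + $4; CNT[$1] += 2 }  # two values from the 2017 row
  END { for (s in SUM) printf "%s %.2f\n", s, SUM[s] / CNT[s] }
')
printf '%s\n' "$out"
```

Because the two patterns are mutually exclusive, each row contributes exactly once and the count ends up at 3.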
I thought I would be able to simply expand it out to cover additional lines. There are hundreds prior to 2018. Say for instance for 2015 through 2018 the output would be something like:
How about describing the problem correctly and entirely from the beginning? Could have saved you and me quite some time.
For your new specification try