Is there an awk script that can easily perform the following operation?
I have a data file in the following format:
1944-12,5.6
1945-01,9.8
1945-02,6.7
1945-03,9.3
1945-04,5.9
1945-05,0.7
1945-06,0.0
1945-07,0.0
1945-08,0.0
1945-09,0.0
1945-10,0.2
1945-11,10.5
1945-12,22.3
1946-01,35.2
1946-02,13.4
I need to find the average of the values for months -01, -02, and -03 of each year.
For instance, the average of
1945-01,9.8
1945-02,6.7
1945-03,9.3
would output
1945-03, 8.6
Any help would be appreciated.
Thanks!
CarloM
2
I'm not clear on whether your identifying column is '1945-05', '1945', or just '05'?
You could do something like:
awk -F, '{totals[$1]+=$2; counts[$1]++} END {for (i in totals) print i, totals[i]/counts[i]}' file
(this groups by the full key, e.g. 1945-05)
Or:
awk -F"-|," '{totals[$2]+=$3; counts[$2]++} END {for (i in totals) print i, totals[i]/counts[i]}' file
(this groups by month, e.g. 05; use $1 instead of $2 to group by year, e.g. 1945)
zaxxon
3
I guess the year (1944 etc.) is the important identifier? This uses -01, -02, and -03 only to filter the relevant lines:
awk -F"[,-]" '/-0[123],/ {a[$1]+=$NF; c[$1]++} END{for(e in a)print e", "a[e]/c[e]}' infile
1945, 8.6
1946, 24.3
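If the output should match the format in the question (the year followed by -03, with one decimal place), a minimal sketch along the same lines might look like this. It assumes the values are never negative, since '-' also acts as a field separator here. The variable names data and result are only illustrative. The pipe through sort is there because awk does not guarantee the order of for (key in array).

```shell
# Sample rows taken from the question (illustrative variable name).
data='1944-12,5.6
1945-01,9.8
1945-02,6.7
1945-03,9.3
1945-04,5.9
1945-12,22.3
1946-01,35.2
1946-02,13.4'

# Split on "," or "-": $1 = year, $2 = month, $3 = value.
# Assumption: values are non-negative, otherwise "-" would split them too.
result=$(printf '%s\n' "$data" | awk -F'[,-]' '
    $2 ~ /^0[123]$/ { sum[$1] += $3; n[$1]++ }   # keep only months 01-03
    END {
        for (y in sum)
            printf "%s-03, %.1f\n", y, sum[y] / n[y]
    }
' | sort)

echo "$result"
```

With the sample rows above this should print 1945-03, 8.6 and 1946-03, 24.3. Note that 1946 has no -03 row in the sample, so its figure is the average of January and February only.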
Thanks for the help!
zaxxon's script worked great.