AWK: how to get average based on certain column

shell123 · March 24, 2011, 4:55am

Hi,

I'm new to shell programming, can anyone help me on this? I want to do following operations -

Average salary for each country
Total salary for each city

and data that looks like -

salary country city
10000 zzz BN
25000 zzz BN
30000 zzz BN
10000 yyy ZN
15000 yyy ZN

My below code only gives average and sum of salaries in total for all records.

average) echo "Average salary for each country"
awk '{ s += $1; } END { print "sum: ", s, " average: ", s/NR, "no of records: ", NR }' $FILENAME
;;
totalsal) echo "Total salary for each city"
awk '{ s += $1; } END { print "sum of salaries : ", s }' $FILENAME
;;

output -

sum: 90000 average: 18000 no of records: 5
sum of salaries : 90000

i want this to be done based on column value i.e. -

for country as zzz -
sum: 65000 average: 21666.66 no of records: 3

for country as yyy -
sum: 25000 average: 12500 no of records: 2

Similarly for total salaries based on each City.

Is there any method in awk to make the search based on the value of certain column?

ctsgnb · March 24, 2011, 5:48am

awk 'NR>1{A[$2]+=$1;I[$2]++}END{for(i in A) if (A) print "Country",i,"sum",A,"Average",A/I,"records",I}' infile

# cat tst
salary country city
10000 zzz BN
25000 zzz BN
30000 zzz BN
10000 yyy ZN
15000 yyy ZN

# awk 'NR>1{A[$2]+=$1;I[$2]++}END{for(i in A) if (A) print "Country",i,"sum",A,"Average",A/I,"records",I}' tst
Country yyy sum 25000 Average 12500 records 2
Country zzz sum 65000 Average 21666.7 records 3
#

bartus11 · March 24, 2011, 5:49am

awk '{a[$2]+=$1;b[$2]++}END{print "country\tsum\tavg\tnum";for (i in a){print i"\t"a"\t"a/b"\t"b}}' file

shell123 · March 24, 2011, 6:16am

Thanks