Is it possible to print the average of the 2nd column, grouped by the key in the 1st column?
input
a1X 4
a1X 6
a2_1 10
a2_1 20
a2_1 30
a2_1 30
a2_1 10
output
a1X 5
a2_1 20
Try this:
awk '
{
    sum[$1] += $2;
    count[$1]++;
}
END {
    for( x in sum )
        printf( "%s %.2f\n", x, sum[x]/count[x] );
}
' input-file
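For reference, here is a sketch of the same approach run against the sample data from the question. The `group_avg` wrapper name and the trailing `sort` are additions for illustration, not part of the answer above; `sort` is there only because awk does not guarantee the order in which `for (x in sum)` visits the keys.

```shell
# group_avg: hypothetical wrapper name; reads "key value" lines from stdin
group_avg() {
    awk '
    {
        sum[$1] += $2      # running total per key
        count[$1]++        # number of rows per key
    }
    END {
        for (x in sum)
            printf("%s %.2f\n", x, sum[x] / count[x])
    }
    ' | sort               # for-in order is unspecified, so sort for stable output
}

# Sample data from the question
printf 'a1X 4\na1X 6\na2_1 10\na2_1 20\na2_1 30\na2_1 30\na2_1 10\n' | group_avg
```

With the sample input this prints `a1X 5.00` and `a2_1 20.00`. Changing `%.2f` to `%g` should drop the trailing zeros if output like `a1X 5` is preferred.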
Thanks, agama. Is it possible to print the max value?
a1X 6
a2_1 30
Anything is possible, well, almost:
awk '
{
    sum[$1] += $2;
    count[$1]++;
    if( max[$1] < $2 )
        max[$1] = $2;
}
END {
    for( x in sum )
        printf( "%s ave=%.2f max=%d\n", x, sum[x]/count[x], max[x] );
}
'
Prints both average and max.
Replace the printf() with this if you only want max:
printf( "%s %d\n", x, max[x] );
I am getting 0's as output. I forgot to tell you that I may have negative numbers. Sorry.
Ex:
G1 -1.093384748
G1 -0.737460373
TB1 1.130494838
TB1 1.180494838
Which is your OS?
--ahamed
Mac OS X
Can you paste the inputfile and the exact output you are getting?
--ahamed
Try this...
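awk ' {
    val = $2 + 0                  # force a numeric value
    sum[$1] += val;
    count[$1]++;
    # seed max with the first value seen for this key, so that
    # negative numbers are not compared against an unset (0) element
    if( !($1 in max) || max[$1] < val )
        max[$1] = val
}
END {
    for( x in sum )
        printf( "%s ave=%.2f max=%.2f\n", x, sum[x]/count[x], max[x] );
}
' input_file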
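The zeros reported earlier come from comparing against an array element that has never been assigned: an unset awk element acts as 0 in a numeric comparison, so `max[$1] < $2` is never true when every value is negative, and the unset element then prints as 0. A small sketch of the difference, using the G1 lines from above; the function names `naive_max` and `seeded_max` are made up for illustration.

```shell
# naive_max: compares against an unset element, which acts as 0
naive_max() {
    awk '{ if (max[$1] < $2) max[$1] = $2 }
         END { for (x in max) printf("%s %.2f\n", x, max[x]) }' | sort
}

# seeded_max: takes the first value seen for a key as the starting maximum
seeded_max() {
    awk '{ val = $2 + 0
           if (!($1 in max) || val > max[$1]) max[$1] = val }
         END { for (x in max) printf("%s %.2f\n", x, max[x]) }' | sort
}

printf 'G1 -1.093384748\nG1 -0.737460373\n' | naive_max
printf 'G1 -1.093384748\nG1 -0.737460373\n' | seeded_max
```

The first call prints `G1 0.00` and the second prints `G1 -0.74`. Testing `$1 in max` rather than `!max[$1]` also keeps a legitimate maximum of 0 from being treated as "not set yet".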
--ahamed
Yes, it is working great. Thanks! One more thing: is it possible to modify the script to select the highest value if it is positive and the lowest if it is negative?
Highest and lowest for each group?
--ahamed
input
a 1
a 2
b -1
b -2
output
a 2
b -2
Try this...
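awk ' {
    val = $2 + 0                  # force a numeric value
    sum[$1] += val;
    count[$1]++;
    if( !($1 in max) )                     # first value seen for this key
        max[$1] = val
    else if( val > 0 && max[$1] < val )    # positive: keep the highest
        max[$1] = val
    else if( val <= 0 && max[$1] > val )   # otherwise: keep the lowest
        max[$1] = val
}
END {
    for( x in sum )
        printf( "%s ave=%.2f max_min=%.2f\n", x, sum[x]/count[x], max[x] );
}
' input_file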
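If a group can mix positive and negative values, the result of a sign-based test depends on the order of the input lines. One possible alternative, assuming that "highest if positive, lowest if negative" is meant as "the value farthest from zero", is sketched below. That reading is an assumption, and the names `extreme_by_magnitude`, `abs` and `ext` are made up for illustration.

```shell
# extreme_by_magnitude: hypothetical name; keeps the value farthest from zero per key
extreme_by_magnitude() {
    awk '
    # awk has no built-in abs(), so define one
    function abs(v) { return v < 0 ? -v : v }
    {
        val = $2 + 0                       # force a numeric value
        if (!($1 in ext) || abs(val) > abs(ext[$1]))
            ext[$1] = val                  # first value, or a larger magnitude
    }
    END {
        for (x in ext)
            printf("%s %.2f\n", x, ext[x])
    }
    ' | sort                               # for-in order is unspecified
}

# Sample data from the post above
printf 'a 1\na 2\nb -1\nb -2\n' | extreme_by_magnitude
```

For the sample input this prints `a 2.00` and `b -2.00`, matching the requested output apart from the decimals. For groups whose values all share one sign it should pick the same value as the sign-based test.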
--ahamed