kayak
1
I have a file that looks like this:
820 890 530
1650 1600 1800
1850 1900 2270
1640 2300 1670
2080 2200 2350
1150 1630 2210
I would like to output the mean and standard deviation of each row, so that my final output looks like this:
820 890 530 746.667 155.849
1650 1600 1800 1683.33 84.9837
1850 1900 2270 2006.67 187.32
1640 2300 1670 1870 304.302
2080 2200 2350 2210 110.454
1150 1630 2210 1663.33 433.385
The mean is the ordinary average, and the standard deviation is:
sqrt((sum((x-mean)**2))/N)
In my case N=3. Can someone help with awk?
PikK45
2
Calculating the mean is the easier part:
awk '{sum=0; for (i=1; i<=NF; i++) {sum=sum+$i;} m=sum/NF; print $0, m; }' file
Yoda
3
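awk ' BEGIN {
    N = 3    # number of values per row, as given in the question
} {
    rec[NR] = $0    # keep the original row for output
    for ( i = 1; i <= NF; i++ )
    {
        sum[NR] += $i           # running sum of the row
        sumsq[NR] += $i * $i    # running sum of squares of the row
    }
} END {
    # population SD, using sum((x-mean)^2)/N = sumsq/N - mean^2
    for ( i = 1; i <= NR; i++ )
        print rec[i]" "sum[i]/N" "sqrt(sumsq[i]/N - (sum[i]/N)^2)
} ' file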
PikK45
4
Why can't we use NF instead of N here?
Yoda
5
Yes, we can use NF or 3. Does that really matter?
I defined and used the variable N because the requester specified it in his formula.
PikK45
6
I just wanted to use NF so that the same logic works for any number of columns.
And your awk code works perfectly. I have started learning from you.
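
For reference, the NF idea discussed in this thread could be sketched roughly as below. This is only a minimal sketch, not anyone's posted solution: the wrapper name row_stats is hypothetical, it assumes every field on a row is numeric, and it computes the population standard deviation (dividing by the number of fields) as in the formula from the first post. It works row by row, so nothing has to be stored until END.

```shell
# row_stats (hypothetical helper name): append the mean and the population
# standard deviation to each row, using NF as the divisor so that rows
# with any number of columns are handled.
row_stats() {
    awk '{
        if (NF == 0) { print; next }    # pass blank lines through unchanged
        sum = 0; sumsq = 0
        for (i = 1; i <= NF; i++) {
            sum   += $i                 # sum of the values in this row
            sumsq += $i * $i            # sum of their squares
        }
        mean = sum / NF
        var  = sumsq / NF - mean * mean # same identity as the END-block version
        if (var < 0) var = 0            # guard against tiny negative rounding error
        print $0, mean, sqrt(var)
    }' "$@"
}
```

It can be called as `row_stats file`, or fed from a pipe, for example `printf '1 2 3 4\n' | row_stats`. Numbers are printed with awk's default output format, which is why the results show six significant digits.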