Output mean and standard deviation of a row

kayak · March 26, 2013, 10:05am

I have a file that looks like this:

820 890 530
1650 1600 1800
1850 1900 2270
1640 2300 1670
2080 2200 2350
1150 1630 2210

I would like to output the mean and standard deviation of each row, so that my final output looks like this:

820 890 530 746.667 155.849
1650 1600 1800 1683.33 84.9837
1850 1900 2270 2006.67 187.32
1640 2300 1670 1870 304.302
2080 2200 2350 2210 110.454
1150 1630 2210 1663.33 433.385

The mean is the ordinary average, and the standard deviation is:

sqrt((sum((x-mean)**2))/N)

In my case N=3. Can someone help with awk?

PikK45 · March 26, 2013, 11:05am

Calculating the mean is the easier part:

awk '{sum=0; for (i=1; i<=NF; i++) {sum=sum+$i;} m=sum/NF; print $0, m; }' file

Yoda · March 26, 2013, 11:41am

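awk ' BEGIN {
        N = 3                           # values per row, as given in the question
} {
        rec[NR] = $0                    # keep the original row for output
        for ( i = 1; i <= NF; i++ )
        {
                sum[NR] += $i           # sum of the values in this row
                sumsq[NR] += $i * $i    # sum of the squared values in this row
        }
} END {
        # ^ is the POSIX exponent operator; ** is not accepted by every awk
        for ( i = 1; i <= NR; i++ )
                print rec[i]" "sum[i]/N" "sqrt(sumsq[i]/N - (sum[i]/N)^2)
} ' file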
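
A note for anyone wondering why the script never subtracts the mean from each value: the expression sumsq/N - (sum/N) squared should be algebraically the same as the formula in the original question. A short derivation, writing \bar{x} for the row mean (the symbol is only for illustration, it does not appear in the thread):

```latex
\begin{aligned}
\frac{1}{N}\sum_{i=1}^{N}\left(x_i-\bar{x}\right)^2
  &= \frac{1}{N}\sum_{i=1}^{N}x_i^2
     \;-\; 2\bar{x}\cdot\frac{1}{N}\sum_{i=1}^{N}x_i
     \;+\; \bar{x}^2 \\
  &= \frac{1}{N}\sum_{i=1}^{N}x_i^2 \;-\; \bar{x}^2,
  \qquad\text{since } \bar{x}=\frac{1}{N}\sum_{i=1}^{N}x_i .
\end{aligned}
```

For the first row (820 890 530) this gives 1745400/3 - (2240/3)^2 = 581800 - 557511.11 = 24288.89, and the square root of that is about 155.849, which matches the requested output.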

PikK45 · March 26, 2013, 11:53am

Why can't we use NF instead of N here??

Yoda · March 26, 2013, 11:56am

Yes, we can use NF or 3. Does that really matter?

I defined & used variable N because the requester specified it in his formula.

PikK45 · March 26, 2013, 12:00pm

I just wanted to use NF so that the same logic works for any number of columns.

And your awk code works perfectly. I've started learning from you.
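
To make the NF idea concrete, here is a minimal sketch of a per-row version that needs no arrays and no fixed N. The function name row_stats is hypothetical, and the guard against a slightly negative variance is an added assumption about floating-point rounding, not something from the posts above.

```shell
# Hypothetical helper: append the mean and the population standard
# deviation (divide by NF, as in the question) to every input row.
row_stats() {
    awk '{
        if (NF == 0) { print; next }     # pass empty lines through unchanged
        sum = 0; sumsq = 0
        for (i = 1; i <= NF; i++) {
            sum   += $i                  # sum of the values in this row
            sumsq += $i * $i             # sum of the squared values
        }
        mean = sum / NF
        var  = sumsq / NF - mean * mean  # mean of squares minus squared mean
        if (var < 0) var = 0             # assumed guard against rounding error
        print $0, mean, sqrt(var)
    }' "$@"
}

# Usage: read from a file argument or from standard input.
printf '820 890 530\n1650 1600 1800\n' | row_stats
```

Because each row is finished before the next one is read, nothing has to be stored until END, and rows with different numbers of columns are handled the same way. With the two sample rows above it should print the same values as the first two lines of the requested output.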