Hi,
I have a file with 201 columns (the first holds the row name, the other 200 hold values).
I want to print the average of each column.
The file looks like this (as an example this one only has 7 value columns):
1 2 3 4 5 6 7
abr 5 6 7 1 2 4 5
hhr 2 1 3 4 2 1 2
iip 1 3 1 1 5 3 2
I want to just print the average of each value column while ignoring the first column and first row, i.e.:
2.6667 3.3333 3.6667 2.0000 3.0000 2.6667 3.0000
thanks
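(Checking the arithmetic on the sample by hand: each value column is averaged over the 3 data rows. A quick illustration, using the first two value columns:)

```shell
# Column 2 of the sample holds 5, 2, 1; column 3 holds 6, 1, 3.
awk 'BEGIN {
    printf "%.4f %.4f\n", (5+2+1)/3, (6+1+3)/3   # first two column averages
}'
# prints: 2.6667 3.3333
```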
$
$
$ cat data.txt
1 2 3 4 5 6 7
abr 5 6 7 1 2 4 5
hhr 2 1 3 4 2 1 2
iip 1 3 1 1 5 3 2
$
$
$ perl -lane 'if ($. > 1) {
$n++;
foreach $i (1..$#F) { push @{$x{$i}}, $F[$i] }}
END {
foreach $k (sort {$a <=> $b} keys %x) {   # numeric sort; a plain string sort would put 10 before 2 with 200 columns
$sum += $_ foreach @{$x{$k}};
$str .= sprintf("%.4f ", $sum/$n);
$sum = 0;
}
print $str}
' data.txt
2.6667 3.3333 3.6667 2.0000 3.0000 2.6667 3.0000
$
$
$
tyler_durden
sulti
3
In awk:
lines=`cat data.txt | wc -l` #corrected, thanks to itkamaraj
lines=$((lines-1))
for i in $(seq 2 201); do
awk -v c=$i -v l=$lines 'NR>1 {s+=$c}; END {printf ("%f ", s/l)}' data.txt
done
echo
and maybe even simpler, without calculating the line count beforehand:
for i in $(seq 2 201); do
awk -v c=$i 'NR>1 {s+=$c;l+=1}; END {printf ("%f ", s/l)}' data.txt
done
echo
??????????
lines=`echo data.txt | wc -l`
sulti
5
Sorry, my bad, I hadn't had my morning coffee.
I meant
lines=`cat data.txt | wc -l`
but then I added a simpler solution anyway.
And one more thing: you don't want to use the cat command here.
This is known as a useless use of cat: http://partmaps.org/era/unix/award.html
wc -l < data.txt
or
wc -l data.txt
sulti
7
I have to disagree
$ wc -l test
4 test
$ cat test | wc -l
4
That's why I used cat.
$ a=`wc -l test`
$ [ $a -gt 0 ] && echo Works
bash: [: too many arguments
edit:
Ok, I noticed that wc -l < data.txt
actually would work. But wc -l data.txt
would not.
how about
grep -c . data.txt # but this will omit empty lines
nawk 'END{print NR}' data.txt
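(For a side-by-side check of the counting alternatives — my own illustration, not from the posts; it uses plain awk where the line above says nawk, which is just the name that awk goes by on some systems:)

```shell
# Recreate the sample file from the thread.
printf '1 2 3 4 5 6 7\nabr 5 6 7 1 2 4 5\nhhr 2 1 3 4 2 1 2\niip 1 3 1 1 5 3 2\n' > data.txt

wc -l < data.txt                  # 4 -- bare count, safe in arithmetic tests
grep -c . data.txt                # 4 -- same here, but would skip empty lines
awk 'END { print NR }' data.txt   # 4 -- counts every record read
```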
sulti
9
Ok, this gives me another idea. Instead of counting lines I could use NR:
for i in $(seq 2 201); do
awk -v c=$i 'NR>1 {s+=$c}; END {printf ("%f ", s/(NR-1))}' data.txt
done
echo
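(All of the loop variants above invoke awk once per column, re-reading the file 200 times. As a sketch — my own variant, not posted in the thread — the same averages can be computed in a single pass by summing into an array indexed by column number:)

```shell
# Recreate the sample file from the thread.
printf '1 2 3 4 5 6 7\nabr 5 6 7 1 2 4 5\nhhr 2 1 3 4 2 1 2\niip 1 3 1 1 5 3 2\n' > data.txt

awk 'NR > 1 {
         nf = NF                         # remember the column count
         for (i = 2; i <= NF; i++)       # skip column 1 (the row name)
             s[i] += $i
     }
     END {
         for (i = 2; i <= nf; i++)
             printf "%.4f ", s[i] / (NR - 1)   # NR-1 data rows
         print ""
     }' data.txt
```

With the sample data this prints 2.6667 3.3333 3.6667 2.0000 3.0000 2.6667 3.0000, matching the perl output above, and it scales to the 201-column file without re-reading it.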