Averaging segments

kylle345 · September 15, 2009, 11:22am

Hi,

I have a file that I want to average. Specifically, I want to average each group of three consecutive columns in every row.

Here is an example of my file

2 2 2 3 3 3 1 1 1 5 5 5

Here's what I want it to look like after averaging each group of three columns:

2 3 1 5

thanks

vgersh99 · September 15, 2009, 11:56am

What have you tried so far and where exactly are you stuck?

jim_mcnamara · September 15, 2009, 11:58am

awk '{ one   = ($1 + $2 + $3) / 3
       two   = ($4 + $5 + $6) / 3
       three = ($7 + $8 + $9) / 3
       print one, two, three }' inputfile

kylle345 · September 15, 2009, 12:05pm

Hey, yeah, that works, and I had something similar, but the rows in the file are very long (almost 1000 characters). So I was wondering if there is a faster way of doing it (rather than typing $1, $2, etc.).

thanks

clx · September 15, 2009, 12:05pm

If the number of fields is not fixed, try:

awk 'NF%3 == 0{for(i=1;i<=NF;i++) { avg=($i+$(i+1)+$(i+2))/3; print avg;i=i+2 }}' file

kylle345 · September 15, 2009, 12:23pm

yeah for some reason that does not work.

clx · September 15, 2009, 12:26pm

If the number of fields in a line is not a multiple of 3, that line is skipped entirely.
If you still want such lines processed, remove the NF%3 == 0 part of the command.

kylle345 · September 15, 2009, 2:38pm

Yeah, it works now, but everything gets printed into one row. How do I split it up into columns?

---------- Post updated at 02:38 PM ---------- Previous update was at 02:26 PM ----------

Hey maybe I should clarify myself.

So basically there are many rows, and I want to compute the bin averages for multiple rows.

ripat · September 15, 2009, 4:42pm

Hi,

How about this:

awk '{for(i=3;i<=NF;i+=3) printf "%s ", ($(i-2)+$(i-1)+$i)/3}' file

kylle345 · September 15, 2009, 8:42pm

Hey, that works. Now I need to scale up to averaging over a range of 40 numbers.

 awk '{for(i=40;i<=NF;i+=40) printf "%s ", ($(i-39)+$(i-38)$+(i-37)+$(i-36)+$(i-35)+$(i-34)+$(i-33)+$(i-32)+$(i-31)+$(i-30)+$(i-29)+$(i-28)+$(i-27)+$(i-26)+$(i-25)+$(i-24)+$(i-23)+$(i-22)+$(i-21)+$(i-20)+$(i-19)+$(i-18)+$(i-17)+$(i-16)+$(i-15)+$(i-14)+$(i-13)+$(i-12)+$(i-11)+$(i-10)+$(i-9)+$(i-8)+$(i-7)+$(i-6)+$(i-5)+$(i-4)+$(i-3)+$(i-2)+$(i-1)+$i)/40}' file1.txt > file2.txt

I put this in, but there is something wrong. Hope someone can help. Basically, if I have a file with forty 1's, the average comes out as 5.95 instead of 1.

thanks

ripat · September 16, 2009, 1:34am

For a longer range, it is easier to iterate through the fields to calculate the sum. Also check whether you really have *exactly* 40 1's in your test file.

awk -v r=40 '{for(i=r;i<=NF;i+=r){for(j=0;j<r;j++){sum+=$(i-j)}printf "%s ", sum/r;sum=0}}' file

---------- Post updated at 07:34 AM ---------- Previous update was at 07:21 AM ----------

Or in a more readable way if you prefer (I do!)

awk -v r=40 '{
    for(i=r;i<=NF;i+=r){
        for(j=0;j<r;j++){
            sum+=$(i-j)
        }
        printf "%s ", sum/r
        sum=0
    }
}' file
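
A possible explanation for the 5.95, in case it helps: the long expression contains `$(i-38)$+(i-37)` where `+$(i-37)` was presumably intended. Assuming an awk that accepts this (some implementations may reject it as a syntax error), the stray `$` splits the sum into two numbers that are then concatenated as strings: 1+1 = 2 on the left and thirty-eight 1's = 38 on the right, giving "238", and 238/40 = 5.95. The sketch below only reproduces that arithmetic with the concatenation written out explicitly; the variable names are made up for illustration.

```shell
result=$(awk 'BEGIN {
    left = 1 + 1                    # $(i-39) + $(i-38)
    right = 0
    for (k = 1; k <= 38; k++)       # the remaining 38 fields, all equal to 1
        right += 1
    joined = left "" right          # string concatenation: "2" "38" -> "238"
    print joined / 40               # 238 / 40
}')
echo "$result"    # 5.95
```

Summing in a loop, as in the commands above, avoids this kind of typo altogether.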

kylle345 · September 16, 2009, 12:04pm

Hey, thanks, it works,

but for some reason it does not treat each row independently.

eg.

 1 1 1 1 1 1 1 1 1 1 1 1 ... n=40
 1 1 1 1 1 1 1 1 1 1 1 1 ... n=40
 1 1 1 1 1 1 1 1 1 1 1 1 ... n=40

The output will look like this

1 1 1

rather than

1
1
1
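
The rows are most likely being averaged independently already; the commands above just never print a newline, because `printf "%s "` only emits a space after each value. One way to get one output line per input row is to end the line after the outer loop. A minimal sketch, assuming space-separated numeric fields; the sample data and the shell variable names are made up, and r=3 is used only to keep the example short:

```shell
# Made-up sample: two rows of six fields, averaged in bins of r=3.
input=$(printf '1 1 1 1 1 1\n2 2 2 4 4 4\n')

out=$(printf '%s\n' "$input" | awk -v r=3 '{
    for (i = r; i <= NF; i += r) {
        sum = 0
        for (j = 0; j < r; j++)
            sum += $(i - j)
        printf "%s%s", (i > r ? " " : ""), sum / r   # space only between values
    }
    print ""                                          # newline after each input row
}')
echo "$out"
```

This prints `1 1` and `2 4` on separate lines. For the 40-column case, pass `-v r=40` and give awk the real file name instead of the pipe. Trailing fields that do not fill a complete bin are ignored.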