I'm kind of stuck on this one. I have 7 files with 30,000 lines per file, like this:
050 0.023 0.504336
050 0.024 0.529521
050 0.025 0.538908
050 0.026 0.537035
I want to find the mean, line by line, of the third column across the files named like this:
Stat-f-1.dat .... Stat-f-7.dat
Stat-s-1.dat .... Stat-s-7.dat
To calculate the mean of the third column in awk:
awk '{x+=$3;next}END{print x/NR}' data
Now, do you want the mean of the values in each file separately?
Or all Stat-f* files together, then all Stat-s* files together?
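To make the single-file case concrete, here is a minimal sketch that runs the same idea on the four sample rows from the question. The temporary file and the `mean` variable are only illustrative, and the `if (NR)` guard is an addition so that empty input does not divide by zero.

```shell
# Hypothetical sample file holding the four rows quoted in the question.
data=$(mktemp)
printf '050 0.023 0.504336\n050 0.024 0.529521\n050 0.025 0.538908\n050 0.026 0.537035\n' > "$data"

# Sum column 3 over every line, then divide by the number of lines read (NR).
# The NR guard skips the division when the input is empty.
mean=$(awk '{ x += $3 } END { if (NR) printf "%.5f\n", x / NR }' "$data")

echo "$mean"    # 0.52745
rm "$data"
```

This gives one number for the whole file, which is why the follow-up question about per-file versus per-line averages matters.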
I want the average line by line: the average of the 1st line across all files printed to a new file, then the 2nd line, and so on...
not tested....
awk '{x[FNR]+=$3;next}END{for(i=1;i in x;i++) print x[i]/ARGC}' allMyFilesWildWildCard
File 1
001 0.046 0.667267
001 0.047 0.672028
001 0.048 0.656025
001 0.049 0.660557
002 0.000 0.669553
002 0.001 0.594648
002 0.002 0.586738
002 0.003 0.593728
002 0.004 0.593658
File 2
001 0.046 0.654565
001 0.047 0.665057
001 0.048 0.660074
001 0.049 0.670424
002 0.000 0.669462
002 0.001 0.594793
002 0.002 0.589329
002 0.003 0.593949
002 0.004 0.592371
There are seven (7) files
Desired Output
001 0.046 Average of the nth line from the two files
001 0.047 Average of the nth line from the two files
001 0.048 Average of the nth line from the two files
001 0.049 Average of the nth line from the two files
002 0.000 Average of the nth line from the two files
002 0.001 Average of the nth line from the two files
002 0.002 Average of the nth line from the two files
002 0.003 Average of the nth line from the two files
002 0.004 Average of the nth line from the two files
Thanks for your help!! Can you try on those files?
The filenames are like Stat-zz.dat, zz goes from 00 to 07
I'll let you do the honors and let us know how it goes!
Good luck!
mirni
May 26, 2011, 6:13pm
@vgersh99 : you need to divide by (ARGC-1)
@AriasFco : try this:
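awk '{
avg[$1,$2]+=$3              # sum column 3 per (column 1, column 2) pair
key[FNR]=$1 SUBSEP $2       # remember which pair appears on each line number
}
END{
# ARGC-2 is the file count: ARGC also counts awk itself and the SUBSEP assignment
for(j=1; j in key; j++)
print key[j] "\t" avg[key[j]]/(ARGC-2)
}' SUBSEP='\t' Stat-f-*.dat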
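On the divisor: ARGC counts the command name itself as well as every operand, and a variable assignment such as SUBSEP='\t' is an operand too. A quick way to check this, sketched with made-up file names (a.dat, b.dat, c.dat) that never need to exist because a BEGIN-only program reads no input:

```shell
# BEGIN runs before any input is opened, so the file names are never read.
plain=$(awk 'BEGIN { print ARGC }' a.dat b.dat c.dat)

# A command-line variable assignment is stored in ARGV and counted in ARGC too.
with_assign=$(awk 'BEGIN { print ARGC }' SUBSEP='\t' a.dat b.dat c.dat)

echo "$plain $with_assign"    # 4 5
```

So with three files, ARGC-1 gives the file count when only file names are passed, and ARGC-2 gives it when one assignment is passed as well.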
ok, a little change:
awk '{x[FNR]+=$3;next}END{for(i=1;i in x;i++) print x[i]/(ARGC-1)}' Stat-0[0-7].dat
yep, just noted that myself - thanks for the catch!
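For the desired output layout (columns 1 and 2 followed by the average), a rough sketch along the same lines is below. It assumes every file has the same number of lines and the same first two columns on each line. It counts files with FNR == 1 instead of relying on ARGC, which is a swapped-in alternative and not what was posted above. The file names, the `label` array and the %.7f format are illustrative choices, and the data are the first two rows of File 1 and File 2 from the question.

```shell
# Two small sample files built from rows quoted in this thread
# (the directory and file names are hypothetical).
dir=$(mktemp -d)
printf '001 0.046 0.667267\n001 0.047 0.672028\n' > "$dir/Stat-01.dat"
printf '001 0.046 0.654565\n001 0.047 0.665057\n' > "$dir/Stat-02.dat"

result=$(awk '
FNR == 1 { nfiles++ }               # a new input file has just started
{
    sum[FNR] += $3                  # running total of column 3 per line number
    if (nfiles == 1)                # take columns 1 and 2 from the first file only
        label[FNR] = $1 " " $2
}
END {
    for (i = 1; i in sum; i++)      # walk line numbers in order
        printf "%s %.7f\n", label[i], sum[i] / nfiles
}' "$dir"/Stat-0[0-9].dat)

printf '%s\n' "$result"
rm -r "$dir"
```

With these two files it prints `001 0.046 0.6609160` and `001 0.047 0.6685425`. Redirect the awk output to a file to get the averages in a new file, as asked.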