Calculating the average of scores

Hi,
I have 2 files:
file1

aac 23 25
aac 87 90
aac 33 67

file2

23 0.9
24 0.8
25 0.4
........
67 0.55
........

I want to get output like this:

aac 23 25 0.7  i.e. (0.9+0.8+0.4)/3

and likewise for every line in file1.

How can I do that? Please help.

If there are no missing values in file2, you could try:

awk 'NR==FNR{A[$1]=$2; next}{t=0; for(i=$2; i<=$3; i++)t+=A[i]; print $0, t/($3-$2+1)}' file2 file1
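
If file2 might have gaps, a variant of the same one-liner could average only the positions that are actually listed. This is a minimal sketch, not a tested solution for the full data set: it assumes the sample file names file1 and file2 from the question, and the counter n, the "NA" placeholder, and the output name out.txt are my own additions.

```shell
# Sample data copied from the question (file2 shortened to three positions).
printf 'aac 23 25\n' > file1
printf '23 0.9\n24 0.8\n25 0.4\n' > file2

# First pass (NR==FNR) loads file2 into A; second pass walks each range in file1.
# The (i in A) test skips positions missing from file2 instead of counting them as 0.
awk 'NR==FNR { A[$1] = $2; next }
     {
         t = 0; n = 0
         for (i = $2; i <= $3; i++)
             if (i in A) { t += A[i]; n++ }
         print $0, (n ? t / n : "NA")
     }' file2 file1 > out.txt

cat out.txt
```

For the sample above this prints aac 23 25 0.7, the value asked for in the question. A range with no matching positions at all gets NA rather than a division by zero.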

Unfortunately, when I run this the process gets killed. I tried it with a small dataset and it gives the desired output.

Are you saying it works correctly for a small file, but does not finish (crashes) with a large file?

Yes, exactly. Dividing the file would be more of a hassle because my data set is so large.

I'm not suggesting you divide the file. I'm asking if it crashes on the large file. If yes, how long does it run before crashing? And is there any error message? Copy and paste what's going on so we can see. :smiley:

Yes, it crashes on large files. It runs for around 15 to 20 minutes. It just says killed, nothing more.

One thing to try is adding fflush() right after the print, as follows: print $0, t/($3-$2+1); fflush()}' file2 file1. If you are saving the output, then when it crashes you can see how far it got, or even whether it reached file1 at all. There is a good chance the output will provide a clue. If your awk does not support fflush, it will quickly let you know.
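
Spelled out in full, the flushing idea might look like the sketch below. It assumes an awk that provides fflush() (gawk and mawk do), reuses the sample data from the question, and writes to progress.txt, a file name I made up for illustration.

```shell
# Sample data copied from the question (file2 shortened to three positions).
printf 'aac 23 25\n' > file1
printf '23 0.9\n24 0.8\n25 0.4\n' > file2

awk 'NR==FNR { A[$1] = $2; next }
     {
         t = 0
         for (i = $2; i <= $3; i++) t += A[i]
         print $0, t / ($3 - $2 + 1)
         fflush()   # write each result out immediately instead of buffering it
     }' file2 file1 > progress.txt

# After a crash, the last line shows the last range that was completed.
tail -n 1 progress.txt
```

If progress.txt stays empty on the real data, the job was most likely killed while still loading file2, before any line of file1 was processed.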

Since it is taking a huge amount of time and memory, I tried wrapping it in Perl to run it on my server. The script is as follows:

use strict;
use Data::Dumper;
use Carp;
use File::Basename;

my $path = "/home/jpsl/";
my $file1 = "2";
my $file2 = "1";

    open PIPE, "| qsub" or die $!;
    print PIPE <<EOF;
#!/bin/sh
#PBS -N Perl
#PBS -l select=4:ncpus=4
#PBS -k oe

awk 'NR==FNR{A[$1]=$2; next}{t=0; for(i=$2; i<=$3; i++)t+=A[i]; print$0, t/($3-$2+1)}' $file1 $file2


EOF

It returns an error like this:
awk: NR==FNR{A[]=; next}{t=0; for(i=; i<=; i++)t+=A[i]; printtrail.pl, t/(-+1)}
awk:           ^ syntax error
awk: fatal: invalid subscript expression

Running the awk command within perl is not going to speed anything up or use less memory. Is there some reason not to put it in a shell script? Sorry, I don't know why you are getting the error messages. I don't normally use perl. Are you sure you are invoking the awk external command correctly from within perl? Is your perl script called trail.pl by any chance? :wink:

I wanted to run it on the server through PBS, so I wrapped it in Perl. Yes, my Perl script is trail.pl.

Aha!! Look at your error message. perl is translating print$0 into printtrail.pl, because $0 in perl holds the script name. All those $0, $1, etc. are being interpolated by perl inside the heredoc before awk ever sees them. There is something wrong with the way you are invoking awk from within perl.
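
One possible way around this, sketched in plain shell rather than perl: write the job script through a heredoc whose delimiter is quoted, so nothing expands $0, $1, ... before awk sees them. The same quoting rule exists in perl (print PIPE <<'EOF';), at the cost of no longer interpolating $file1 and $file2. The file name job.sh is hypothetical, the PBS directives are copied from the script above, and the input names file2 and file1 are written out literally as an assumption.

```shell
# Quoting the delimiter ('EOF') turns off expansion inside the heredoc,
# so the dollar-sign fields are written literally and only awk interprets them.
cat > job.sh <<'EOF'
#!/bin/sh
#PBS -N Perl
#PBS -l select=4:ncpus=4
#PBS -k oe

awk 'NR==FNR{A[$1]=$2; next}{t=0; for(i=$2; i<=$3; i++)t+=A[i]; print $0, t/($3-$2+1)}' file2 file1
EOF

# Check that the awk program survived intact before submitting.
grep -n 'awk' job.sh

# Submit with: qsub job.sh
```

Submitting a ready-made file with qsub job.sh also makes it easy to inspect exactly what the scheduler will run.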