awk - reading input file twice

Hello,

I've been trying to come up with a solution for the following problem: I have an input file with two columns. I want to print the first column unchanged and the second column divided by its last value. Example input:

1 9
2 10
3 11
4 12
5 13
6 14

Desired output:

1 9/14
2 10/14
3 11/14
4 12/14
5 13/14
6 14/14

So I don't really know how to read the file once in order to get the last value of the second column and then read it once again in order to print both columns, the second one divided by this last value.

Thank you for the help!

Try this,

awk -v lst_col="$(awk 'END {print $2}' inputfile)" '{print $1 FS $2 "/" lst_col}' inputfile

awk 'NR==FNR{s=$2;next}{print $1, $2/s}' file file
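
For anyone following along, here is a minimal sketch of the two-pass idea run against the example input from the question. The temporary file and the variable names `input` and `result` are made up for illustration. The file is named twice on the command line, so `NR==FNR` is true only while awk reads the first copy. The division is kept as text here so the result matches the desired output literally.

```shell
# Hypothetical sample file holding the example input from the question.
input=$(mktemp)
printf '%s\n' '1 9' '2 10' '3 11' '4 12' '5 13' '6 14' > "$input"

# Pass 1 (NR==FNR): remember column 2; after the last line, s holds its final value.
# Pass 2: print column 1, then column 2 over s as text.
result=$(awk 'NR==FNR { s = $2; next } { print $1, $2 "/" s }' "$input" "$input")

printf '%s\n' "$result"
rm -f "$input"
```

Replacing `$2 "/" s` with `$2/s` prints the numeric quotient instead.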

I'm not sure how to pass the file as "inputfile", because my input comes through a pipe from a previous awk. I'm chaining a whole bunch of awks and this would be the last one:

... | sort | awk '{print $1, p += $2;}' | (this awk) > $file.result

... | sort |
awk '{a[++c]=$1; b[c]=$2}END{for(i=1;i<=c;i++)print a[i], b[i]/b[c]}'

Edit: should be:

... | sort | 
awk '{a[++c]=$1; b[c]+=$2}END{for(i=1;i<=c;i++)print a[i], b[i]/b[c]}'
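
Since a pipe can only be read once, the rows have to be buffered until the last value is known. Here is a small sketch of that buffering, assuming the example input from the question stands in for the real `... | sort` output. The names `key`, `val`, `n` and `result` are illustrative only.

```shell
# printf stands in for the real "... | sort" pipeline.
result=$(printf '%s\n' '1 9' '2 10' '3 11' '4 12' '5 13' '6 14' |
  awk '
    # Buffer every row: n counts rows, key[] holds column 1, val[] column 2.
    { key[++n] = $1; val[n] = $2 }
    # At end of input val[n] is the last value, so replay the buffer against it.
    END {
      for (i = 1; i <= n; i++) {
        print key[i], val[i] "/" val[n]
      }
    }')

printf '%s\n' "$result"
```

Both arrays grow with the input, so the whole stream is held in memory.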

I'm a bit confused... Is this only for the last awk? or does it include the previous one as well?

awk '{print $1, p += $2}' 

After the sort command, this command should suffice:

... | sort | 
awk '{a[++c]=$1; b[c]+=$2}END{for(i=1;i<=c;i++)print a[i], b[i]/b[c]}'

It doesn't work the way it did with the other awk before the pipe: it prints the first and second columns unchanged.

Can you post the output after the sort command?
Please use code tags.

I'm actually working with a huge amount of data; the output looks something like this:

0      18767
0.000999928      87183
0.00100017      38975
0.00199986      80390
0.00200009      124310
0.00299978      8133
0.00300002      84539
0.00399995      40803
0.00400019      11242
0.00499988      21893
0.00500011      23798
0.0059998      8121
0.00600004      37201
0.00699997      24423
0.00700021      3676
0.0079999      10935
0.00800014      8433
0.00899982      3501

I'm attaching a file with the whole output, if needed.

This is my output based on the snippet of the output after the last pipe:

$ cat file
0      18767
0.000999928      87183
0.00100017      38975
0.00199986      80390
0.00200009      124310
0.00299978      8133
0.00300002      84539
0.00399995      40803
0.00400019      11242
0.00499988      21893
0.00500011      23798
0.0059998      8121
0.00600004      37201
0.00699997      24423
0.00700021      3676
0.0079999      10935
0.00800014      8433
0.00899982      3501
$
$ cat file | awk '{a[++c]=$1; b[c]+=$2}END{for(i=1;i<=c;i++)print a[i], b[i]/b[c]}'
0 5.36047			# 18767/3501
0.000999928 24.9023		# 87183/3501
0.00100017 11.1325
0.00199986 22.962
0.00200009 35.507
0.00299978 2.32305
0.00300002 24.1471
0.00399995 11.6547
0.00400019 3.21108
0.00499988 6.25336
0.00500011 6.79749
0.0059998 2.31962
0.00600004 10.6258
0.00699997 6.97601
0.00700021 1.04999
0.0079999 3.12339
0.00800014 2.40874
0.00899982 1
$

Hmm, yes, but the point is that the original output of the awk that got removed,

awk '{print $1, p += $2}'

was as follows:

0 18767
0.000999928 105950
0.00100017 144925
0.00199986 225315
0.00200009 349625
0.00299978 357758
0.00300002 442297
0.00399995 483100
0.00400019 494342
0.00499988 516235
0.00500011 540033
0.0059998 548154

The new awk does not do this sum, and each value in the second column needs to be a running total.

The only remaining step would be to divide each value by the last value in the second column (the final total), and I can't seem to achieve that.

The second column of the output should then range from some small value up to 1, since the last value is divided by itself.
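
One way to picture the goal is to keep the running-total awk in the pipeline and add a buffering stage after it. This is a sketch with made-up counts (`a 1`, `b 1`, `c 2`) rather than the real data, and the names `key`, `cum`, `n` and `result` are illustrative.

```shell
# Hypothetical counts; the real input would come from "... | sort".
result=$(printf '%s\n' 'a 1' 'b 1' 'c 2' |
  # Stage 1: replace column 2 with its running total (1, 2, 4).
  awk '{ print $1, (p += $2) }' |
  # Stage 2: buffer the totals, then divide each by the final one.
  awk '
    { key[++n] = $1; cum[n] = $2 }
    END {
      for (i = 1; i <= n; i++) {
        print key[i], cum[i] / cum[n]
      }
    }')

printf '%s\n' "$result"
```

With these counts the second column comes out as 0.25, 0.5 and 1, so the last row is always the total divided by itself.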

##--get unique tags
for i in ` cat testfile.txt | awk  '{print $1}'|sort -u`
do
grep $i testfile.txt >temp.txt
cat temp.txt | sort -n |tail -1  >>finaldata.txt
done

Try this one:

awk '{a[++c]=$1; b[c]=b[c-1]+$2;t+=$2}END{for(i=1;i<=c;i++)print a[i],b[i]/t}' file

It's perfect now, thanks Franklin52!