awk to update file with sum of matching fields in another file

In the awk below I am trying to add a penalty to a score to each matching $1 in file2 based on the sum of $3+$4 (variable TL) from file1 . Then the $4 value in file1 is divided by TL and multiplied by 100 (this valvue is variable S). Finally, $2 in file2 - S gives the updated $2 result in file2 . Since math is not my strong suit there probaly is a better way of doing this, but this is what I could think off, however it returnsa syntax error (FILENAME=file2 FNR=1) fatal: attempt to use array TL' in a scalar context Thank you :).

file1 space delimited

ACP5 4 1058 0
ACTB5 10 1708 79
ORAI1 2 952 0
TBX1 9 1932 300

file2 tab delimited

ACP5     100.00
ACTB     100.00
ORAI1    94.01
TBX1     77.23

desired output tab-delimited the --- is an example calculation and not part of the output

ACP5    100.00
ACTB    89.59   ---- $3+$4=1787 this is TL (comes from file1), $4/TL*100 is 4.42, $2 in file2 is 100 - 4.42 = 95.58 ----
ORAI1   94.01
TBX1    63.79

awk

awk '
FNR==NR{  # process each line
  TL[$1]=($3+$4);next} ($1 in TL)  # from file1 store sum of $3 and $4 in TL
    {S=(P[$4]/TL)*100;printf("%s\t %.2f\n",$1,  $2-S)  # store $4/TL from file1 in S and subtract S from $2 in file2, output two decimal places
      }1' OFS="\t" file1 FS="\t" file2  # update and define input

Few issues with your code:

  1. P[] is not assigned.
  2. Poorly formatted line with comment between condition and { action }
  3. T[] referenced without an index
  4. P[] referenced with wrong field for index
  5. next required to avoid output twice for 1 record

try:

awk '
FNR==NR{  # process each line
  P[$1]=$4;TL[$1]=($3+$4);next} 
# from file1 store sum of $3 and $4 in TL
($1 in TL) {
S=(P[$1]/TL[$1])*100;printf("%s\t%.2f\n",$1,  $2-S);next  # store $4/TL from file1 in S and subtract S from $2 in file2, output two decimal places
      }1' OFS="\t" file1 FS="\t" file2  # update and define input

or

awk '
# from file1 store sum of $3 and $4 in A and B
FNR==NR{
  A[$1]=$3; B[$1]=$4; next
}
($1 in A) {
  # Recalculate field #2 to two decimal places from the file1 fields stored above
  $2=sprintf("%.2f", $2 - B[$1] / (A[$1] + B[$1]) * 100)
}1' OFS='\t' file1 FS='\t' file2
1 Like

Thank you very much :).