awk merge matching columns

I know I'm not the first one asking this but my code still does not work:
File 1:

gi|1283| tRNAscan exon 87020 88058 . - . transcript_id "Parent=tRNA-Tyr5.r01";
gi|3283| tRNAscan exon 97020 97058 . + . transcript_id "Parent=tRNA-Tyr6.r01";
gi|4283| rRNAscan exon 197020 197058 . - . transcript_id "Parent=rRNA-Tyr1.r01";
gi|5283| mRNAscan exon 295020 298059 . + . transcript_id "Parent=mRNA-Tyr2.r01";

This file is tab-separated.
File 2:

"Parent=tRNA-Tyr6.r01"; 12
"Parent=mRNA-Tyr2.r01"; 0

This file is also tab-separated.
Desired output:

"Parent=tRNA-Tyr6.r01"; 12 -
"Parent=mRNA-Tyr2.r01"; 0 +

I want to merge these two files on column $10 of file 1 (e.g. "Parent=tRNA-Tyr6.r01";) and column $1 of file 2 (the same string), appending column $7 from file 1 (-/+) to each matching line of file 2.
My solution would go like this:

awk 'FNR==NR{a[$10]=$7;next} ($1 in a) {print $1,"2,a[$1]}' file2 file1 > Output

Can anyone help me out?
Best regards,
Mo

Hi Mo, your desired output does not match what you have specified:

Parent=tRNA-Tyr6.r01

has a + in file1, but your desired output shows a -.
You have also included the last field from file2 in the desired output, which is not in your specification.

Anyway, if it's any help, the following code

awk 'NR==FNR{a[$1];next} $10 in a {print $10,$7}' file2 file1

will give this output:

"Parent=tRNA-Tyr6.r01"; +
"Parent=mRNA-Tyr2.r01"; +
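
As for why the original command printed nothing useful: a minimal sketch, assuming the `"2` in it was a typo for `$2` and that every field in the sample files is separated by a tab. The first block in the awk program fills the array from whichever file is named first, so with `file2 file1` it indexes by `$10` of file2, which is empty. Swapping the file order is enough to make it work. The temporary directory and the variable names `dir`, `broken`, and `fixed` are only for illustration.

```shell
# Rebuild the sample files from the question (assumed fully tab-separated).
dir=$(mktemp -d)
printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
  'gi|1283|' tRNAscan exon 87020 88058 . - . transcript_id '"Parent=tRNA-Tyr5.r01";' \
  'gi|3283|' tRNAscan exon 97020 97058 . + . transcript_id '"Parent=tRNA-Tyr6.r01";' \
  'gi|4283|' rRNAscan exon 197020 197058 . - . transcript_id '"Parent=rRNA-Tyr1.r01";' \
  'gi|5283|' mRNAscan exon 295020 298059 . + . transcript_id '"Parent=mRNA-Tyr2.r01";' \
  > "$dir/file1"
printf '%s\t%s\n' \
  '"Parent=tRNA-Tyr6.r01";' 12 \
  '"Parent=mRNA-Tyr2.r01";' 0 \
  > "$dir/file2"

# Original order (file2 first): a[] is keyed by $10 of file2, which is empty,
# so no $1 of file1 is ever found in a[] and nothing is printed.
broken=$(awk 'FNR==NR{a[$10]=$7;next} ($1 in a) {print $1,$2,a[$1]}' "$dir/file2" "$dir/file1")

# Same program, file1 first: a[] is keyed by the Parent string, as intended.
fixed=$(awk 'FNR==NR{a[$10]=$7;next} ($1 in a) {print $1,$2,a[$1]}' "$dir/file1" "$dir/file2")

echo "file2 file1: [$broken]"
echo "file1 file2:"
echo "$fixed"
```

With the order swapped, the lines come out in file2 order rather than file1 order.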

Thank you for the fast reply!

This was just a copy/paste error!

Is there a simple way to add the 2nd column of file 2 to the output file?

If you did want the last field from file2 you could do the following:

awk 'NR==FNR{a[$1]=$2;next} $10 in a {print $10, a[$10],$7}' file2 file1

This would give the following output:

"Parent=tRNA-Tyr6.r01"; 12 +
"Parent=mRNA-Tyr2.r01"; 0 +
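
Since both inputs are tab-separated, you may want the output tab-separated as well. A small sketch of the same command with `OFS` set to a tab, run against a rebuilt copy of the sample data (assuming every field is separated by a tab; the temporary directory and the names `dir` and `result` are just for illustration):

```shell
# Rebuild the sample files from the question (assumed fully tab-separated).
dir=$(mktemp -d)
printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
  'gi|1283|' tRNAscan exon 87020 88058 . - . transcript_id '"Parent=tRNA-Tyr5.r01";' \
  'gi|3283|' tRNAscan exon 97020 97058 . + . transcript_id '"Parent=tRNA-Tyr6.r01";' \
  'gi|4283|' rRNAscan exon 197020 197058 . - . transcript_id '"Parent=rRNA-Tyr1.r01";' \
  'gi|5283|' mRNAscan exon 295020 298059 . + . transcript_id '"Parent=mRNA-Tyr2.r01";' \
  > "$dir/file1"
printf '%s\t%s\n' \
  '"Parent=tRNA-Tyr6.r01";' 12 \
  '"Parent=mRNA-Tyr2.r01";' 0 \
  > "$dir/file2"

# Pass 1 (file2): remember the count for each Parent string.
# Pass 2 (file1): for lines whose $10 was seen, print key, count, strand.
# OFS="\t" makes the commas in print emit tabs instead of spaces.
result=$(awk 'BEGIN{OFS="\t"}
              NR==FNR {a[$1]=$2; next}
              $10 in a {print $10, a[$10], $7}' "$dir/file2" "$dir/file1")
echo "$result"
```

The lines appear in file1 order, because file1 is the file being scanned when the printing happens.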



TY very much

Maybe the other way round:

awk 'NR == FNR {T[$10] = $7; next} {print $0, T[$1]}' file1 file2

which gives

"Parent=tRNA-Tyr6.r01"; 12 +
"Parent=mRNA-Tyr2.r01"; 0 +
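
One difference worth knowing: this version prints every line of file2, even when its key never occurs in file1, in which case the strand is simply empty. A sketch of that behaviour and of a guarded variant using `$1 in T`, assuming fully tab-separated input. The third file2 line (`tRNA-Ala1`) is a made-up entry added only to show the unmatched case, and `dir`, `all`, and `matched` are illustrative names.

```shell
# Rebuild file1 from the question (assumed fully tab-separated).
dir=$(mktemp -d)
printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
  'gi|1283|' tRNAscan exon 87020 88058 . - . transcript_id '"Parent=tRNA-Tyr5.r01";' \
  'gi|3283|' tRNAscan exon 97020 97058 . + . transcript_id '"Parent=tRNA-Tyr6.r01";' \
  'gi|4283|' rRNAscan exon 197020 197058 . - . transcript_id '"Parent=rRNA-Tyr1.r01";' \
  'gi|5283|' mRNAscan exon 295020 298059 . + . transcript_id '"Parent=mRNA-Tyr2.r01";' \
  > "$dir/file1"
# file2 plus one hypothetical key that does not exist in file1.
printf '%s\t%s\n' \
  '"Parent=tRNA-Tyr6.r01";' 12 \
  '"Parent=mRNA-Tyr2.r01";' 0 \
  '"Parent=tRNA-Ala1.r01";' 7 \
  > "$dir/file2"

# Unguarded: every file2 line is printed; the unmatched one gets an empty strand.
all=$(awk 'NR==FNR {T[$10]=$7; next} {print $0, T[$1]}' "$dir/file1" "$dir/file2")

# Guarded: only file2 lines whose key was seen in file1 are printed.
matched=$(awk 'NR==FNR {T[$10]=$7; next} $1 in T {print $0, T[$1]}' "$dir/file1" "$dir/file2")

echo "unguarded:"; echo "$all"
echo "guarded:";   echo "$matched"
```

Which one you want depends on whether unmatched lines of file2 should be kept or dropped.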