awk to match field between two files and use conditions on match

I am trying to look for $2 of file1 (skipping the header) in $2 of file2 (skipping the header) and if they match and the value in $10 is > 30 and $11 is > 49 , then print the line from file1 to a output file. If no match is foung the line is not printed. Both the input and output are tab-delimited . I think either the below awks are close, but after it runs no output results and I am not sure what is wrong. Thank you :slight_smile:

awk 1

awk -F'\t' -v OFS='\t' ' NR==FNR{a[$2]; next} 
($2 in a) && $10>30 && $11>49 ' file1 file2

awk 2

awk 'BEGIN{FS=OFS="\t"} NR==FNR{a[$2]; next} 
($2 in a) && $10>30 && $11>49  ' file1 file2

file1

Missing in IDP but found in Reference:
2   166848646   G   A   exonic  SCN1A   68  13  16;20   0;0 17;15   0;0 0;0 0;0     c.[5139C>T]+[=] 52.94

file2

Chrom    Position    Gene Sym    Target ID    Type    Zygosity    Genotype    Ref    Variant    Var Freq    Qual    Coverage    Ref Cov    Var Cov
chr2    166245425   SCN2A   AMPL5155065355  SNP Het C/T C   T   54  100   50    23  27
chr2    166848646   SCN1A   AMPL1543060606  SNP Het        G/A   G  A   52.9411764706   100 68  32  36

desired output

2   166848646   G   A   exonic  SCN1A   68  13  16;20   0;0 17;15   0;0 0;0 0;0     c.[5139C>T]+[=] 52.94

Hello cmccabe,

Could you please try following and let me know if this helps you.

awk 'FNR==NR{A[$2]=$0;next} ($2 in A){if($10>30 && $11>49){print A[$2]}}'  Input_file1  Input_file2

So in above code you could put -F"\t" and OFS="\t" if your Input_files are tab delimited(They doesn't seems to be in your post). Also if you want to take output into a output file then change print A[$2] >> "output_file" . Let us know how it goes then.

Thanks,
R. Singh

1 Like

Either of the commands in post#1 works for me, given all spaces are removed from $2 in both files.

1 Like

Works great.... thank you very much :).