The awk below is supposed to filter $8 of example.txt using each line in gene.txt. I think it does, but why is it renumbering the 1,2,3 in $1 to 28,29,394? I have attached the data as it is large: example.txt is the file to be searched, gene.txt has the lines to match, and filtered.txt is the current output. The desired output is just that, but with $1 (R_Index) sequentially numbered. Thank you :).
awk 'NR==FNR{for (i=1;i<=NF;i++) a[$i];next} FNR==1 || ($8 in a)' gene.txt example.txt | awk '{split($2,a,"-"); print a[1] "\t" $0}' | cut -f2- > filtered.txt
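For reference, here is a minimal sketch of the output I am after. As far as I can tell, nothing in the pipeline above rewrites $1, so the kept rows probably just carry their original R_Index values. The sketch assumes example.txt is tab-delimited with a header on line 1; the tiny input files and their column names (c2 through c7, Gene) are made up for illustration and are not my real data.

```shell
# Hypothetical miniature inputs (invented for illustration; tab-separated, gene name in $8)
printf 'R_Index\tc2\tc3\tc4\tc5\tc6\tc7\tGene\n' >  example.txt
printf '1\ta\tb\tc\td\te\tf\tAAA\n'              >> example.txt
printf '2\ta\tb\tc\td\te\tf\tBBB\n'              >> example.txt
printf '3\ta\tb\tc\td\te\tf\tCCC\n'              >> example.txt
printf '4\ta\tb\tc\td\te\tf\tBBB\n'              >> example.txt
printf 'BBB\nCCC\n' > gene.txt

# gene.txt is read with default whitespace splitting; FS and OFS are switched
# to a tab only for example.txt (assignment placed between the file names).
awk 'NR == FNR { for (i = 1; i <= NF; i++) a[$i]; next }  # collect gene names
     FNR == 1  { print; next }                            # pass the header through
     $8 in a   { $1 = ++n; print }                        # renumber each kept row
' gene.txt FS='\t' OFS='\t' example.txt > filtered.txt
```

With the made-up files, rows 2, 3 and 4 are kept and their $1 becomes 1, 2 and 3. Assigning to $1 makes awk rebuild the line with OFS, which is why OFS is set to a tab as well.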