Hi all,
I have 2 files, the first one containing a list of ids and the second one is a master file. I want to search each id from the first file from the 5th col in the second file. The 5th column in master file has values separated by ';', if not a single value is present.
Each id must occur just once as a sub-string of $5 in the master file (which is a 20GB file).
I realize that I can use
grep -m 1 -f idfile masterfile
but it takes awfully long and it will give me all other strings that contain the id.
I can also do
awk -F"\t" ' { if ($5 ~ /one_value/) print $0} ' masterfile
but I do not know how to pass a file instead of just a single value in awk.Also it does not solve the problem of returning all strings that contain the id instead of just 1 that I want.
Can there be a smarter and faster way to do this? use join maybe?
idfile
1
2
3
masterfile
a b c d 12;34;1 e
d e f g 67;2;567 h
i j k l 3;123 m
n o p q 321;231 r
s t u v 1223 x
output
1 a b c d 12;34;1 e
2 d e f g 67;2;567 h
3 i j k l 3;123 m