I would like to loop through all entries in column 1 of file 1 and, if the string matches any entry in column 7 of file 2, print out that line of file 2.
I have tried:
awk 'NR == FNR { a[$1]++ } NR != FNR { for (e in a) for (i=1;i<NF;i++) if (e ~ $i) print $0 }' file1.txt file2.txt
but this doesn't seem to work.
My understanding is that NR will only equal FNR while the first file is being read, so this populates the 'a' array. Then, when NR != FNR (i.e. when the second file is being read), there is a loop that tries to match every element of 'a' and, if one matches, prints out the line. I can't see how to make this specific to column 7 in the second file.
I'm a complete beginner so any help would be really appreciated!
Thanks.
Welcome to the forum.
Thanks for (partly) using CODE tags, but please do so consistently.
Did you consider the links at the bottom left of this page? They usually offer a good starting point...
Howsoever, try
Thanks RudiC. It's not quite working and I think I might know why. I've realised column 7 in file 2 is actually wrapped in double quotes (""), so I'm guessing the strings won't exactly match? e.g.
Would this prevent the strings from matching? If so, how can I remove the quotes before performing the match? I thought about opening file 2, removing them with sed somehow and piping the result into the command that you suggested, but I can't see how to do this as awk takes file2 as an argument at the end of the command.
It's very similar to RudiC's solution, with the exception of the field separator (FS) being a double quote when file2 is processed.
It works with the files you've posted so far. What files are you passing through, and what do you get as output?
Apologies, this is my first post. I thought it would be easier to explain with an amended file but I've learnt this confuses things. My file 2 looks like this.
Given file1 and file2 posted above, the code should be:
awk 'FNR==NR {f1[$1];next} $2 in f1' file1 FS='"' file2
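The field separator (FS) for file2 is " . Because FS='"' appears on the command line after file1 and before file2, it only takes effect once awk starts reading file2; file1 is still split on whitespace. When file2 is processed ( $2 in f1 ), given FS='"' , $2 becomes the FIRST quoted string withOUT the quotes.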
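To make the FS behaviour easier to see, here is a minimal sketch that prints what $2 holds for each line of file2 and whether it is a key from file1. The sample data is made up for illustration (it is not the poster's real file1/file2), and the files are written to a temporary directory.

```shell
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)" || exit 1

# Hypothetical sample data, invented for this demonstration.
printf '%s\n' 'geneA' > file1
printf '%s\n' 'x "geneA" y "other"' 'x "geneB" y "geneA"' > file2

# FS='"' sits between the two file names, so awk applies it only when it
# reaches file2. file1 is still split on whitespace.
out=$(awk '
  FNR == NR { f1[$1]; next }    # first file: store column 1 as array keys
  { printf "$2=[%s] match=%s\n", $2, (($2 in f1) ? "yes" : "no") }
' file1 FS='"' file2)

printf '%s\n' "$out"
```

With this sample data the first line of file2 reports $2=[geneA] match=yes and the second reports $2=[geneB] match=no, even though geneA appears later on that second line: only the first quoted string is compared.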
Not sure if my explanation is consumable tho
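On the earlier question about stripping the quotes with sed first: that step can probably be skipped, because awk can remove them itself. Below is a sketch of that alternative, assuming file2 is whitespace-separated and the quoted value in column 7 contains no spaces. The file names and contents are hypothetical sample data, and the array name keys and variable k are just illustrative choices.

```shell
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)" || exit 1

# Hypothetical sample data: column 7 of file2.txt is wrapped in double quotes.
printf '%s\n' 'geneA' 'geneC' > file1.txt
printf '%s\n' \
  'r1 a b c d e "geneA"' \
  'r2 a b c d e "geneB"' \
  'r3 a b c d e "geneC"' > file2.txt

result=$(awk '
  NR == FNR { keys[$1]; next }   # first file: remember column 1
  {
    k = $7                       # work on a copy so the printed line keeps its quotes
    gsub(/"/, "", k)             # strip the double quotes from the copy
    if (k in keys) print         # exact string lookup, no regex matching
  }
' file1.txt file2.txt)

printf '%s\n' "$result"
```

This keeps the default field separator, so column 7 stays column 7, and it prints the r1 and r3 lines unchanged. The lookup with "in" is an exact comparison, unlike the ~ operator in the original attempt, which treats the right-hand side as a regular expression.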