Print lines matching value(s) in other file using awk

SBC · April 26, 2010, 6:21pm

Hi,

I have two comma-separated files. I would like to find exact matches of the field 1 values of File1 in field 2 of File2 and, where a value matches, print the matching lines from File2. I have achieved this using cut, paste and egrep -f, but I would like to use awk, as it is more efficient and would avoid generating multiple intermediate files. I would really appreciate help with this. Thanks!

File1

123,1,123,...
12,1,456,...
1234,12,798,...

File2 

11,1,"ABC"
111,12,"RTF"
1,1234,"ESC"

Output

111,12,"RTF"
1,1234,"ESC"

vgersh99 · April 26, 2010, 6:42pm

nawk -F, 'FNR==NR{f1[$1];next}$2 in f1' OFS=, file1 file2
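For anyone reading along, here is a minimal, self-contained sketch of how the one-liner above behaves on the sample data from the first post. It assumes a plain `awk` is available in place of `nawk`, drops the literal `...` tails from File1, and uses a scratch directory and the variable name `out`, which are made up for the demo.

```shell
# Hypothetical scratch directory and file names, created only for this demo.
tmp=$(mktemp -d)
printf '%s\n' '123,1,123' '12,1,456' '1234,12,798' > "$tmp/file1"
printf '%s\n' '11,1,"ABC"' '111,12,"RTF"' '1,1234,"ESC"' > "$tmp/file2"

# While reading the first file (FNR==NR), store each field-1 value as an array key
# and skip to the next line. For the second file, the bare pattern `$2 in f1`
# prints every line whose field 2 is one of the stored keys.
out=$(awk -F, 'FNR==NR { f1[$1]; next } $2 in f1' "$tmp/file1" "$tmp/file2")
printf '%s\n' "$out"

rm -rf "$tmp"
```

The keys are compared as whole strings, so `1` in File2 does not match `12` or `123` from File1. OFS is left out here because the lines are printed unmodified.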

SBC · April 27, 2010, 3:07am

Thanks vgersh. It worked! I had been trying similar code for a couple of days, with the help of information I found in various search results. I was able to store the values in an array but could not reference them from the second file. I understand my mistake now. Thanks a lot!

---------- Post updated 04-27-10 at 12:07 AM ---------- Previous update was 04-26-10 at 03:58 PM ----------

Hi Vgersh,

If I would like to print the matched lines of file2 side by side with the corresponding lines of file1, comma separated, what change would help? I tried the following, but it only prints the matched file2 entries.

Desired Output
12,1,456,111,12,"RTF"
1234,12,798,1,1234,"ESC"

awk -F, 'FNR==NR{f1[$1]=$1;next}$2 in f1{print f1[$1],$0}' OFS="," file1 file2

I only get the matched file2 entries in this case, as the array lookup prints a blank value for the second file's lines.

Thanks for the help in advance!

ahmad.diab · April 27, 2010, 3:53am

awk -F, 'FNR==NR{f1[$1]=$0;next}$2 in f1{print f1[$2],$0}' OFS="," file1 file2
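A short sketch of why this version works where the earlier attempt printed blanks: the attempt stored only `$1` as the value and then looked the array up with file2's field 1, which is never one of the stored keys. Storing `$0` and looking up with `$2` (the same field tested by `in`) returns the whole file1 line. The scratch directory and the variable name `joined` below are made up for the demo, and the `...` tails of File1 are dropped.

```shell
# Hypothetical scratch directory and file names, created only for this demo.
tmp=$(mktemp -d)
printf '%s\n' '123,1,123' '12,1,456' '1234,12,798' > "$tmp/file1"
printf '%s\n' '11,1,"ABC"' '111,12,"RTF"' '1,1234,"ESC"' > "$tmp/file2"

joined=$(awk -F, -v OFS=, '
  FNR==NR  { f1[$1] = $0; next }   # file1: keep the whole line, keyed by its field 1
  $2 in f1 { print f1[$2], $0 }    # file2: look up with field 2, the key that was tested
' "$tmp/file1" "$tmp/file2")
printf '%s\n' "$joined"

rm -rf "$tmp"
```

If several file1 lines share the same field 1, only the last one is kept, since each assignment overwrites the previous value for that key.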

SBC · April 28, 2010, 12:54pm

Thanks Ahmad and Vgersh for the response. It's really helpful.

SBC · May 3, 2010, 1:08am

Hi,

I'm working on my script and trying to develop my understanding of associative arrays. The following, for my own reference, is based on Vgersh's script and prints the records of file2 whose field 2 is not found in field 1 of file1:

nawk -F, 'FNR==NR{f1[$1];next} !($2 in f1)' OFS=, file1 file2
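As a hedged illustration of the direction of this test: the negated command above prints lines of file2, because file2 is the second file read. To list the file1 lines that have no partner in file2 instead, one option is to swap the roles of the two files and fields. The sketch below assumes plain `awk`, the sample data from the first post without the `...` tails, and made-up variable names `only2` and `only1`.

```shell
# Hypothetical scratch directory and file names, created only for this demo.
tmp=$(mktemp -d)
printf '%s\n' '123,1,123' '12,1,456' '1234,12,798' > "$tmp/file1"
printf '%s\n' '11,1,"ABC"' '111,12,"RTF"' '1,1234,"ESC"' > "$tmp/file2"

# Lines of file2 whose field 2 does not appear as field 1 in file1.
only2=$(awk -F, 'FNR==NR { f1[$1]; next } !($2 in f1)' "$tmp/file1" "$tmp/file2")

# Reverse direction: lines of file1 whose field 1 does not appear as field 2 in file2.
only1=$(awk -F, 'FNR==NR { f2[$2]; next } !($1 in f2)' "$tmp/file2" "$tmp/file1")

printf 'file2 only: %s\n' "$only2"
printf 'file1 only: %s\n' "$only1"

rm -rf "$tmp"
```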

---------- Post updated at 10:08 PM ---------- Previous update was at 09:58 PM ----------

Hi,

I have got the results using the above awk commands. Now I would like to compare fields to look for a specific set of characters. For the entry in field 4, look for that entry in field 7 on the same line; if it is found, print "Record Match", otherwise print "Difference in record". Please note that field 7 contains a longer string, so the entry in field 4 will be only part of the complete string in field 7. Furthermore, the string has no fixed start position in field 7.

Sample Input file:

12,1,456,RTF,111,12,PROG-RTF 12
1234,12,798, ESC,1,1234,ENTY ESC 345
456,1,886,ABC,434,567,YTRU-POYH 765

Sample Output File:

12,1,456,RTF,111,12,PROG-RTF 12 || Record Match
1234,12,798, ESC,1,1234,ENTY ESC 345 || Record Match
456,1,886,ABC,434,567,YTRU-POYH 765 || Difference in record

Thanks in advance for your help!

malcomex999 · May 3, 2010, 2:07am

Try...

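awk -F, '{
  # Slide a window of length($4) along $7 and compare each substring with $4
  for(i=0;++i<=length($7);)
    if($4==substr($7,i,length($4))){print $0" || Record Match";next}
  # No position matched
  print $0" || Difference in record"
}' infile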

Franklin52 · May 3, 2010, 3:56am

Another approach:

awk -F, '
match($7,$4){print $0 " || Record Match";next}
{print $0 " || Difference in record"}' file
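One point worth noting about the `match` version: `match($7,$4)` treats field 4 as a regular expression, so characters such as `.` or `*` in field 4 would be interpreted rather than compared literally. A possible alternative, sketched here under the assumption that a plain substring test is what is wanted, is the standard `index()` function. The scratch directory and the variable name `checked` are made up for the demo.

```shell
# Hypothetical scratch directory and file name, created only for this demo.
tmp=$(mktemp -d)
printf '%s\n' \
  '12,1,456,RTF,111,12,PROG-RTF 12' \
  '1234,12,798, ESC,1,1234,ENTY ESC 345' \
  '456,1,886,ABC,434,567,YTRU-POYH 765' > "$tmp/infile"

checked=$(awk -F, '{
  # index(s, t) returns the 1-based position of t inside s, or 0 when t is absent
  print $0 " || " (index($7, $4) ? "Record Match" : "Difference in record")
}' "$tmp/infile")
printf '%s\n' "$checked"

rm -rf "$tmp"
```

The leading space in field 4 of the second sample line is part of the search string here, and it still matches because field 7 contains " ESC" with the space.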