filter record from a file reading another file

Hi,
I want to filter record from a file if the records in the second column matches the data in another file. I tried the below awk command but it filters the records in the filter file. I want the opposite, to include only the records in the filter file.

I tried this:

awk -F'|' 'NR==FNR{_[$1];next}!($2 in _)' filter.txt record.txt

Record.txt:

00000000048717|0003|20100408|0|001|000000.084|000000.093|000000.107|
00000000048717|0004|20100408|0|001|000000.084|000000.093|000000.107|
00000000048717|0005|20100408|0|001|000000.084|000000.093|000000.107|
00000000048717|0006|20100408|0|001|000000.084|000000.093|000000.107|

Filter file:

0003

Output should be:

00000000048717|0003|000000.084|

Note that the output needs only first, second and sixth colum from input record file.

Any help will be much appreciated.

awk -F'|' 'NR==FNR{_[$1];next}$2 in _ {print $1,$2,$6}' OFS='|' filter.txt record.txt
awk -F\| 'NR==FNR{f[$1];next}($2 in f){print $1"|"$2"|"$6}' filter.txt record.txt

---------- Post updated at 10:56 AM ---------- Previous update was at 10:31 AM ----------

vgersh99 I've learned something new today ;), I'm talking about

print $1,$2,$6

Thanks guys :slight_smile:

---------- Post updated at 12:08 PM ---------- Previous update was at 12:06 PM ----------

How about if I had to check for 2 columns in the Record file and I have two values in filter file.
for eg: Filter file:

0003|20100408

I should get the output only if the second column in the record file is 0003 and third column is 20100408.

awk -F'|' 'NR==FNR{_[$1,$2];next}($2 SUBSEP $3) in _ {print $1,$2,$6}' OFS='|' filter.txt record.txt

Then you must modify like this

awk -F\| 'NR==FNR{f[$1,$2];next}($2SUBSEP$3 in f){print $1"|"$2"|"$6}' filter.txt record.txt

I never made the try with $2,$3 instead $2SUBSEP$3 but you can make the try :wink:

This worked. Thanks to all of you who helped me :slight_smile:

Hi Kcoder,

Please explain this command process:

awk -F\| 'NR==FNR{f[$1,$2];next}($2SUBSEP$3 in f){print $1"|"$2"|"$6}' filter.txt record.txt

I found a probelm with this. Can somebody help please?

When the match is done, if the filter records position is less than the record files position this will not work.

For eg: Consider the below files
Record file:

0003@00000000000190@20100401@201004012010040120100401@003@
0003@00000000000200@20100329@201003292010032920100329@003@
0003@00000000000550@20100401@201004012010040120100401@048@
0003@00000000000910@20100401@201004012010040120100401@017@
0004@00000000001401@20100328@201003282010032820100328@300@
0004@00000000001401@20100401@201004012010040120100401@010@
0004@00000000001431@20100401@201004012010040120100401@010@
0005@00000000001441@20100329@201003302010033020100330@300@
0005@00000000001830@20100328@201004012010040120100401@300@
0005@00000000002080@20100329@201003292010032920100329@003@

Filter File:

0004@300@20100328@20100329
0005@300@20100328@20100329
awk -F\@ 'NR==FNR{f[$1,$3,$5];next}($1SUBSEP$3SUBSEP$2 in f) {print $0}' $Record $Filter

This will not work since we are comparing 1st position to 1st, 3rd to 3rd and 5th to 2nd.

But if the filter file was like below this code would have worked.

 awk -F\@ 'NR==FNR{f[$1,$3,$5];next}($1SUBSEP$3SUBSEP$5 in f) {print $0}' $Record $Filter 
0004@300@20100328@20100329@300
0005@300@20100328@20100329@300

Could somebody help me to figure out how we can make the first process work?