Simple script to find common strings in two files

Hi,
I want to write a simple script.
I have two files:

file1:

BCSpeciality
Backend
CB
CBAPQualDisp
CBCimsVFTRCK
CBDSNQualDisp
CBDefault
CBDisney
CBFaxMCGen
CBMCGeneral
CBMCQualDisp

file2:

CSpeciality
Backend
CB
CBAPQualDisp
CBCimsVFTRCK
CBDSNQualDisp
CBDefault
CBDisney
CBFaxMCGen
CBMCGeneral
CBMCQualDisp
CBPLQualDisp
CBQualNonCID
CBRColl
CBRecon
CBRepr2
CBRisk

If a line is present in both files, it should be written to a third file; if it is not in both files, it should be ignored.

$ ruby -ne 'BEGIN{a=File.read("file1").split(/\n+/)}; print $_ if a.include?($_.chomp)' file2
grep -f file1 file2

Note the argument order for grep: file2 should come first.

Yes, thank you for pointing that out.

Here is code that does not depend on the order of the files.

awk 'NR==FNR{a[$1]++;next} a[$1] ' file1 file2

And that would have to be:

grep -Fxf file2 file1

otherwise BCSpeciality would get matched, for example.
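
To make the BCSpeciality point concrete, here is a small sketch using trimmed copies of the two files from the question. The scratch directory and the variable names regex_hits and exact_hits are only for illustration.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Trimmed versions of the two files from the question.
printf '%s\n' BCSpeciality Backend CB CBDefault > file1
printf '%s\n' CSpeciality Backend CB CBDefault CBRisk > file2

# Without -F and -x, each line of file2 is a regular expression that may
# match anywhere in a line, so "CSpeciality" matches "BCSpeciality".
regex_hits=$(grep -f file2 file1)

# Fixed strings (-F) that must match the whole line (-x).
exact_hits=$(grep -Fxf file2 file1)

printf 'grep -f:\n%s\n\ngrep -Fxf:\n%s\n' "$regex_hits" "$exact_hits"
```

With this data the first command returns every line of file1, BCSpeciality included, while the second returns only Backend, CB and CBDefault.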

The order does not matter if you use:

awk 'NR==FNR{A[$0];next}$0 in A' file1 file2

S.

--
kurumi, I get:

-e:1: undefined local variable or method `a' for main:Object (NameError)

---------- Post updated at 07:16 ---------- Previous update was at 06:57 ----------

rdcwayx,
by using $1 instead of $0, awk compares the first word of each line instead of the whole line. It could well be that this is what the OP actually intended (in fact that seems likely), so your awk would be better suited, and then

grep -wFf file2 file1

would be needed, and my awk would become:

awk 'NR==FNR{A[$1];next}$1 in A' file1 file2
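
A rough sketch of the difference between the $0 and $1 variants, assuming hypothetical input where some lines carry a second word (the sample data in the question has only one word per line, so there both variants give the same result):

```shell
# Work in a scratch directory so nothing in the current directory is touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Hypothetical data: the "CB" lines differ in their second word.
printf '%s\n' 'CB first' 'Backend' > file1
printf '%s\n' 'CB second' 'Backend' 'CBRisk' > file2

# Keyed on the whole line: only identical lines are printed.
by_line=$(awk 'NR==FNR{A[$0];next} $0 in A' file1 file2)

# Keyed on the first field: lines of file2 whose first word occurs as a
# first word in file1 are printed.
by_word=$(awk 'NR==FNR{A[$1];next} $1 in A' file1 file2)

printf 'by line:\n%s\n\nby first word:\n%s\n' "$by_line" "$by_word"
```

Here the whole-line version prints only Backend, while the first-field version also prints "CB second".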

For this task, the order is irrelevant; a successful match must only occur when the line is in both files.

The problem here is the use of regular expressions for what is a fixed-string job. The -f option without -F uses basic regular expressions. If they aren't wrapped with "^" and "$", they allow substring matches to occur. That's incorrect for this case. Matches must be whole lines.

To make matters worse, if a line contains a regular expression special character (such as a "."), it may match a character that is not its literal self. Properly escaping a file to protect against this is error-prone.
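
As an illustration of the special-character problem, here is a minimal sketch with made-up data (the names "CB.Risk" and "CBxRisk" and the file names patterns and data are hypothetical, not from the OP's files). Both commands use -x, so the only difference is whether the dot is treated as a regular expression.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Hypothetical data: one pattern containing a dot, two candidate lines.
printf '%s\n' 'CB.Risk' > patterns
printf '%s\n' 'CB.Risk' 'CBxRisk' > data

# As a basic regular expression, "." matches any single character,
# so both lines are reported even though -x requires a whole-line match.
regex_hits=$(grep -xf patterns data)

# With -F the dot is just a literal dot, so only the first line matches.
fixed_hits=$(grep -Fxf patterns data)

printf 'regex:\n%s\n\nfixed:\n%s\n' "$regex_hits" "$fixed_hits"
```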

The correct solution is to avoid regular expressions and instead use fixed strings (-F) that must match an entire line (-x).

grep -Fxf file1 file2

or

grep -Fxf file2 file1

They are interchangeable.
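
A quick way to check this, sketched with trimmed copies of the files from the question and writing the result to a third file as the OP asked (the output names common_a.txt and common_b.txt are made up). One caveat that goes beyond what is stated above: the output follows the order of the file being searched and keeps its duplicate lines, so the two commands are only guaranteed to agree as sets of lines.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Trimmed versions of the two files from the question.
printf '%s\n' BCSpeciality Backend CB CBDefault > file1
printf '%s\n' CSpeciality Backend CB CBDefault CBRisk > file2

# Same options, opposite file order; each result goes to a third file.
grep -Fxf file1 file2 > common_a.txt
grep -Fxf file2 file1 > common_b.txt

# cmp -s is silent and only sets the exit status.
if cmp -s common_a.txt common_b.txt; then
    same=yes
else
    same=no
fi

echo "identical output: $same"
cat common_a.txt
```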

Regards,
Alister


@scrutinizer, I have no problem. I am using Ruby 1.9.1.

$ ruby -ne 'BEGIN{a=File.read("file1").split(/\n+/)}; print $_ if a.include?($_.chomp)' file2
CB
CBAPQualDisp
CBCimsVFTRCK
CBDSNQualDisp
CBDefault
CBDisney
CBFaxMCGen
CBMCGeneral
CBMCQualDisp

OK, I am using 1.8.7p249

---------- Post updated at 07:42 ---------- Previous update was at 07:27 ----------

True, but it is perhaps good to note that the order would be relevant if the OP meant to match words instead of lines and used the -w option.
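
To show why the order matters with -w, here is a sketch with hypothetical data in which file2 has an extra word on one line. Note that -w is not part of the POSIX grep specification, although common implementations such as GNU grep support it.

```shell
# Work in a scratch directory so nothing in the current directory is touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Hypothetical data: file1 holds single words, file2 has a longer line.
printf '%s\n' 'CB' 'Backend' > file1
printf '%s\n' 'CB extra' 'Backend' 'CBRisk' > file2

# Words from file1 looked up in file2: "CB" is found as a whole word in
# "CB extra", but not inside "CBRisk".
words_in_file2=$(grep -wFf file1 file2)

# Lines from file2 looked up in file1: the pattern "CB extra" does not
# occur in the line "CB", so only "Backend" is reported.
words_in_file1=$(grep -wFf file2 file1)

printf 'file1 -> file2:\n%s\n\nfile2 -> file1:\n%s\n' "$words_in_file2" "$words_in_file1"
```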