Comparing 2 files

Hi,
I have two files in the following format.
File 1

S00002583|NORFO|0002.20|MR|015542324A||BR|STD|201206|015542324A||E
S00004144|MIDDL|0014.90|MR|017120472D||VR|STD|201206|017120472D||E
S00005307|PLYMO|0002.20|MR|026187410A||P|STD|201206|026187410A||E
S00006040|SUFFO|0002.20|MR|012227071A||R|STD|201206|012227071A||E
S00017646|ESSEX|0009.60|MR|019036684A||BR|STD|201206|019036684A||E
S00019378|MIDDL|0013.40|MR|032072554A||VR|STD|201206|032072554A||E
S00025106|SUFFO|0009.60|MR|030201632A||R|STD|201206|030201632A||E
S00026461|MIDDL|-0006.10|MR|028447503A||R3|STD|201106|028447503A||E
S00026462|MIDDL|-0006.10|MR|028447502A||R3|STD|201107|028447502A||E
S00029844|ESSEX|0002.20|MR|030228184A||VR|STD|201206|030228184A||E
S00030343|ESSEX|0002.20|MR|033185658B||R|STD|201206|033185658B||E
S00037588|ESSEX|0007.50|MR|016241003A||R|STD|201206|016241003A||E
S00046838|MIDDL|0002.20|MR|025220396A||RK|STD|201206|025220396A||E
S00046948|PLYMO|0002.20|MR|012228991A||P|STD|201206|012228991A||E
S00047201|ESSEX|0002.20|MR|030228313A||R|STD|201206|030228313A||E
S00047205|MIDDL|-0007.30|MR|033220633A||P|STD|200907|033220633A||E
S00047785|MIDDL|0014.90|MR|026243695A||R|STD|201206|026243695A||E
S00048005|MIDDL|-0017.10|MR|022078063A||VR|STD|201205|022078063A||E
S00050356|ESSEX|0009.60|MR|200223079A||BR|STD|201206|200223079A||E
S00050497|ESSEX|0002.20|MR|024229648A||VR|STD|201206|024229648A||E
S00051590|NORFO|0009.60|MR|016242468A||R|STD|201206|016242468A||E

File 2

S00001006|0|20120731|32|MR|201207|E
S00001023|0|20090731|0|MR|200907|E
S00001028|0|20110131|0|MR|201101|E
S00001034|0|20110131|0|MR|201101|E
S00001042|0|20090431|0|MR|200904|E
S00001044|0|20100331|0|MR|201003|E
S00001046|0|20110731|0|MR|201107|E
S00001054|0|20121031|654.1|MR|201210|E
S00001058|0|20121031|625.8|MR|201210|E
S00001149|0|20121031|409.8|MR|201210|E
S00001153|0|20121031|654.1|MR|201210|E
S00001156|0|20121031|654.1|MR|201210|E
S00001167|0|20121031|654.1|MR|201210|E
S00001173|0|20060331|117|MR|200603|E
S00001181|0|20080431|-21.5|MR|200804|E
S00001182|0|20070431|404|MR|200704|E
S00001184|0|20110631|159.9|MR|201106|E
S00001196|0|20080831|-22|MR|200808|E
S00001231|0|20111131|759.4|MR|201111|E
S00029844|0|20090731|0|MR|200907|E
S00030343|0|20110131|0|MR|201101|E
S00037588|0|20110131|0|MR|201101|E
S00046838|0|20090431|0|MR|200904|E
S00046948|0|20100331|0|MR|201003|E
S00047201|0|20110731|0|MR|201107|E
S00047205|0|20121031|654.1|MR|201210|E
S00047785|0|20121031|625.8|MR|201210|E
S00048005|0|20121031|409.8|MR|201210|E
S00050356|0|20121031|654.1|MR|201210|E
S00050497|0|20121031|654.1|MR|201210|E
S00051590|0|20121031|654.1|MR|201210|E
S00053315|0|20060331|117|MR|200603|E
S00054151|0|20080431|-21.5|MR|200804|E
S00060160|0|20070431|404|MR|200704|E
S00046948|0|20110631|159.9|MR|201106|E
S00047201|0|20080831|-22|MR|200808|E
S00037588|0|20111131|759.4|MR|201111|E

I need to compare the first column of both files and print only the matching lines from file 2.

I tried the following code, but it doesn't print anything useful.

awk -F "|" '{A[$1,$1]=1;next} A[$1,$1]' FILE2 FILE1 > tst2

Desired output:

S00046948|0|20110631|159.9|MR|201106|E
S00047201|0|20080831|-22|MR|200808|E
S00037588|0|20111131|759.4|MR|201111|E

Any help is appreciated.

Gotta re-load unix on this PC, so this is all theory...

cut -d"|" -f1 <file1 >file1key
grep -f file1key <file2
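
One caveat with the plain key list: grep will match a key anywhere on the line, not only in the first column. Below is a rough sketch that anchors each key to the start of the line and to the following delimiter. The sample data and the temporary directory are made up for illustration; only the file names file1, file2 and file1key come from the commands above.

```shell
# Hypothetical sample data in a scratch directory (not the poster's real files).
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf '%s\n' 'S00046948|PLYMO|0002.20' 'S00047201|ESSEX|0002.20' > file1
printf '%s\n' 'S00001006|0|20120731|32|MR|201207|E' \
              'S00046948|0|20110631|159.9|MR|201106|E' \
              'X|S00047201|key appears outside column 1' > file2

# Turn each key into a pattern of the form ^KEY| so it can only match column 1.
cut -d'|' -f1 file1 | sed 's/.*/^&|/' > file1key

# In a basic regular expression "|" is an ordinary character, so no escaping is needed.
matched=$(grep -f file1key file2)
printf '%s\n' "$matched"
```

With this sample only the S00046948 line is printed; the third line is skipped because its key is not in the first column.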
join -t"|" -1 1 -2 1 -o 1.1 1.2 1.3 1.4 1.5 1.6 1.7 file2 file1
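
Note that join expects both inputs to be sorted on the join field, and file 2 as posted is not (its last three lines are out of order). A possible sketch with an explicit sort step follows. The sample rows and the *.sorted file names are placeholders I made up, and the output comes back in key order rather than in the original file 2 order.

```shell
# Hypothetical sample data in a scratch directory (not the poster's real files).
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf '%s\n' 'S00047201|ESSEX|0002.20' 'S00046948|PLYMO|0002.20' > file1
printf '%s\n' 'S00047201|0|20080831|-22|MR|200808|E' \
              'S00001006|0|20120731|32|MR|201207|E' \
              'S00046948|0|20110631|159.9|MR|201106|E' > file2

# Sort both files on the first "|"-separated field, using the same locale for sort and join.
LC_ALL=C sort -t'|' -k1,1 file1 > file1.sorted
LC_ALL=C sort -t'|' -k1,1 file2 > file2.sorted

# Print only the seven fields of file 2 for keys present in both files.
joined=$(LC_ALL=C join -t'|' -o 1.1,1.2,1.3,1.4,1.5,1.6,1.7 file2.sorted file1.sorted)
printf '%s\n' "$joined"
```

If the first column of file 1 ever contains repeated keys, join will emit one output line per pairing, so the awk or grep approaches may be a better fit in that case.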

I have done this before, and here is what I would use:


awk -F\| 'NR==FNR{a[$1]++;next} (a[$1])' FILE1 FILE2 > FILE3.txt
cut -d"|" -f1 file1 > Newfile_1
grep -F -f Newfile_1 file2 > Newfile
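
For anyone wondering why the attempt in the question prints nothing: its first block has no condition, so it runs for every line of both files, and the `next` means the second rule is never reached. The NR==FNR test is what restricts the first block to the first file named on the command line. Here is a commented sketch of that idiom; the sample data is invented and the array name `seen` is my own choice.

```shell
# Hypothetical sample data in a scratch directory (not the poster's real files).
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf '%s\n' 'S00046948|PLYMO|0002.20' 'S00047201|ESSEX|0002.20' > file1
printf '%s\n' 'S00001006|0|20120731|32|MR|201207|E' \
              'S00046948|0|20110631|159.9|MR|201106|E' \
              'S00047201|0|20080831|-22|MR|200808|E' > file2

matched=$(awk -F'|' '
    NR == FNR { seen[$1] = 1; next }   # first file only: remember each key
    $1 in seen                         # second file: print lines whose key was stored
' file1 file2)
printf '%s\n' "$matched"
```

Unlike the join approach, this needs no sorting and keeps the lines in their file 2 order. One thing to watch: if file1 is empty, NR==FNR is also true while reading file2, so nothing is printed.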