Compare two files with different numbers of records and output only the extra records from file1

Hi Friends,

I have 2 files.

File 1

|nag|HYd|1|Che
|esw|Gun|2|hyd
|pra|bhe|3|hyd
|omu|hei|4|bnsj
|uer|oeri|5|uery

File 2

|nag|HYd|1|Che
|esw|Gun|2|hyd
|uer|oi|3|uery

output :

|pra|bhe|3|hyd
|omu|hei|4|bnsj

file2 may contain some differences, but the output should contain only the lines that are present in file1 and not present in file2.

Please help, friends. I am very new to shell scripting.

I used

comm -23

but I didn't get the exact output.

Please help :(

fgrep -v -f file2 file1
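
A note on how this command matches, as a minimal sketch: `fgrep` (the same as `grep -F`) treats each line of file2 as a fixed string and, without `-x`, matches it anywhere inside a line of file1. Adding `-x` requires the whole line to match. The two-line sample data and the temporary directory below are hypothetical and only serve to show the difference.

```shell
# Hypothetical sample data, kept in a temporary directory.
tmp=$(mktemp -d)

cat > "$tmp/file1" <<'EOF'
|hyd|che|1|2
|hyd|che|1|25
EOF

cat > "$tmp/file2" <<'EOF'
|hyd|che|1|2
EOF

# Substring match: "|hyd|che|1|2" also occurs inside "|hyd|che|1|25",
# so both lines of file1 are filtered out and nothing is printed.
loose=$(grep -F -v -f "$tmp/file2" "$tmp/file1" || true)

# Whole-line match (-x): only the identical line is filtered out.
strict=$(grep -F -x -v -f "$tmp/file2" "$tmp/file1" || true)

printf 'without -x: [%s]\n' "$loose"
printf 'with -x:    [%s]\n' "$strict"

rm -rf "$tmp"
```

Either way the comparison is on the text of the line, so a record that differs in any field is still reported as missing from file2.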

Hi itKamaraj,

Thanks so much for the quick reply. I tried your command and got the results below.

file1.txt

|hyd|che|1|2
|jun|out|2|1
|lok|ter|3|6

file2.txt

|hyd|che|1|2
|jun|out|3|1

Actual Output :

|jun|out|2|1
|lok|ter|3|6

Expected Output:

|lok|ter|3|6

Here, if I use fgrep -v -f, the output also includes the line that differs between the two files. The expected output should not contain the differing lines; it should contain only the extra lines.

Thanks in advance. Please help.

Hi

As per your requirement, the expected output should also contain "|uer|oeri|5|uery".  comm -23 should get this for you.
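
One thing to keep in mind is that `comm` expects both inputs to be sorted. A minimal sketch using the data from the first post (the temporary directory and the `.sorted` file names are only for illustration; `LC_ALL=C` is set so that `sort` and `comm` agree on the ordering):

```shell
tmp=$(mktemp -d)

cat > "$tmp/file1" <<'EOF'
|nag|HYd|1|Che
|esw|Gun|2|hyd
|pra|bhe|3|hyd
|omu|hei|4|bnsj
|uer|oeri|5|uery
EOF

cat > "$tmp/file2" <<'EOF'
|nag|HYd|1|Che
|esw|Gun|2|hyd
|uer|oi|3|uery
EOF

# comm needs sorted input; use the same collation for sort and comm.
LC_ALL=C sort "$tmp/file1" > "$tmp/file1.sorted"
LC_ALL=C sort "$tmp/file2" > "$tmp/file2.sorted"

# -2 hides lines found only in file2, -3 hides lines common to both,
# which leaves the lines found only in file1.
result=$(LC_ALL=C comm -23 "$tmp/file1.sorted" "$tmp/file2.sorted")
printf '%s\n' "$result"

rm -rf "$tmp"
```

This prints the |omu, |pra and |uer|oeri lines in sorted order, because comm compares whole lines.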

Guru.

@OP: It seems to me that you are looking for a solution where not all columns are significant, but then you would need to tell us which columns determine whether records are "equal".

@Scrutinizer, sorry, I think I have confused you. Let me explain in detail.

Expected Output:

I want to compare 2 files, say file1.txt (record count 4) and file2.txt (record count 3).

file1.txt

|hyd|che|1|2
|jun|out|2|1
|lok|ter|3|6
|ing|tut|4|8

file2.txt

|hyd|che|1|2
|jun|out|3|1
|ing|tut|4|8

I want to compare the 2 files line by line. The expected output should contain the lines that are present in file1.txt but not in file2.txt. If I use comm -23, I get:

|jun|out|2|1
|lok|ter|3|6

I got the above output because line 2 differs between the files and one more line is missing from file2.txt.

But I need to get only |lok|ter|3|6, because this line is present in file1.txt but not in file2.txt.

Hope this helps. Please look into it. I am very new to Unix :(

The jun|out records are not the same, so that record gets printed. Yet you do not want it printed, so you need to tell us on what basis they should be considered equal. Is it because of column 1, 2, 3? Or a combination?

@Scrutinizer: Sorry once again. The requirement is that file1.txt and file2.txt may contain differences in data, but I am not looking for those differences. I will take the first 2 fields as the key, like below:

file1.txt

1003|A|hyd|1|che
1004|B|che|2|gun
1004|A|kin|3|king
1005|C|opt|4|opr

file2.txt

1003|A|hyd|1|che
1004|B|che|2|gun
1005|C|opt|5|or

Expected Output :

1004|A|kin|3|king

The first two fields are static. I want to check whether file1.line1.field1 and file1.line1.field2 are present in file2. If they are not present, then print file1.line1 in the output.

In the example above, the combination of file1.line3.field1 and file1.line3.field2 (1004|A) is not present in file2, so the output should be file1.line3.

Please look into this. If you need any additional information, please let me know.

Try something like this:

awk -F\| 'NR==FNR{A[$1 FS $2]; next}!($1 FS $2 in A)' file2 file1
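
To spell out how that works, here is the same idea written out with comments, as a sketch rather than a definitive version. It uses the sample data from the previous post; the temporary directory and the array name `seen` are just for illustration. Note that file2 has to be given first, since the keys are collected while NR==FNR (that is, while reading the first file).

```shell
tmp=$(mktemp -d)

cat > "$tmp/file1.txt" <<'EOF'
1003|A|hyd|1|che
1004|B|che|2|gun
1004|A|kin|3|king
1005|C|opt|4|opr
EOF

cat > "$tmp/file2.txt" <<'EOF'
1003|A|hyd|1|che
1004|B|che|2|gun
1005|C|opt|5|or
EOF

# First file (file2.txt): NR==FNR is true, so store each field1|field2 key.
# Second file (file1.txt): print the line only if its key was never stored.
result=$(awk -F'|' '
    NR == FNR { seen[$1 FS $2]; next }
    !($1 FS $2 in seen)
' "$tmp/file2.txt" "$tmp/file1.txt")

printf '%s\n' "$result"

rm -rf "$tmp"
```

With this data the only line printed is 1004|A|kin|3|king. The 1005|C record differs in its later fields but is not printed, because only the first two fields are used as the key. One caveat: if file2 is empty, NR==FNR stays true while file1 is read, so nothing is printed.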

Thank you so much @Scrutinizer, it's working as per the requirement :)