Help about comparison

Hello folks,

I have two files containing usernames. I want to see the entries in file1.txt that are missing from file2.txt and, as a separate comparison, the entries in file2.txt that are missing from file1.txt. Please suggest.


file1.txt

user
u2
u8
a9
p9
p3
u4
z8
aaa
ahe
oktlo


file2.txt

myme
filep
u3
u4
z8

Hello,

Here is some code that may help you.

awk 'NR==FNR {a[$1];next} !($1 in a) {print $0 " is NOT present in file2"}' file2 file1

Output will be as follows.

user is NOT present in file2
u2 is NOT present in file2
u8 is NOT present in file2
a9 is NOT present in file2
p9 is NOT present in file2
p3 is NOT present in file2
aaa is NOT present in file2
ahe is NOT present in file2
oktlo is NOT present in file2
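
The question also asks for the other direction (names in file2 that are missing from file1). A minimal sketch, assuming the sample data is saved as file1 and file2 in a scratch directory: swapping the two file arguments should be enough, because the first file named is the one loaded into the lookup array.

```shell
# Recreate the sample data from the question in a scratch directory.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' user u2 u8 a9 p9 p3 u4 z8 aaa ahe oktlo > file1
printf '%s\n' myme filep u3 u4 z8 > file2

# Same one-liner with the arguments swapped: file1 is read first and
# stored in the array, then every line of file2 is checked against it.
only_in_file2=$(awk 'NR==FNR {a[$1]; next} !($1 in a) {print $0 " is NOT present in file1"}' file1 file2)
printf '%s\n' "$only_in_file2"
```

On the sample data this reports myme, filep and u3.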

Thanks,
R. Singh

Try:

$ grep -v -f <(sort file2) <(sort file1)

The following thread might be useful:

"Two question: remove from the other variable or file to get another variable or file" by Akshay Hegde (Shell Programming and Scripting, Unix Linux Forums)

Thanks. Instead of hardcoding "file2" in the script, is it possible for the script to pick up the file name automatically?

---------- Post updated at 11:41 AM ---------- Previous update was at 11:39 AM ----------

It is wrong, not working for me.

---------- Post updated at 11:45 AM ---------- Previous update was at 11:41 AM ----------

I have one question about this: file1 has 466 lines and file2 has 457 lines, so when I compare file1 with file2 it should show 9 lines, but it is showing 12 lines. Why?

If you are running this on input other than the sample you provided, you might get different results. On the sample input both commands give the same output:

akshay@Aix:~/Desktop/s$ awk 'NR==FNR {a[$1];next} !($1 in a)' file2 file1 | sort
a9
aaa
ahe
oktlo
p3
p9
u2
u8
user

akshay@Aix:~/Desktop/s$ grep -v -f <(sort file2) <(sort file1)  | sort
a9
aaa
ahe
oktlo
p3
p9
u2
u8
user
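
One possible explanation for seeing 12 lines rather than 9, offered as a guess since the real files are not shown: the difference in line counts equals the number of missing names only if every name in file2 also appears in file1. If file2 holds 3 names that file1 lacks (and neither file has duplicates), then file1 must hold 12 names that file2 lacks, because 466 - 457 = 12 - 3. The sample data shows the same effect on a small scale; the sketch below assumes it is saved as file1 and file2 in a scratch directory.

```shell
# Recreate the sample data from the question in a scratch directory.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' user u2 u8 a9 p9 p3 u4 z8 aaa ahe oktlo > file1
printf '%s\n' myme filep u3 u4 z8 > file2

# Line counts of each file ($(( )) strips any padding wc may add).
n1=$(( $(wc -l < file1) ))
n2=$(( $(wc -l < file2) ))

# Number of names unique to each side.
only1=$(( $(awk 'NR==FNR {a[$1]; next} !($1 in a)' file2 file1 | wc -l) ))
only2=$(( $(awk 'NR==FNR {a[$1]; next} !($1 in a)' file1 file2 | wc -l) ))

echo "line count difference: $((n1 - n2))"
echo "only in file1: $only1"
echo "only in file2: $only2"
```

Here the line counts differ by 6, yet 9 names are unique to file1, because 3 names are unique to file2. Duplicate lines or stray carriage returns in the real files could also shift the numbers.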

Both file1 and file2 contain usernames. I want to find the usernames that are present in file1 but not present in file2.

---------- Post updated at 12:08 PM ---------- Previous update was at 11:55 AM ----------

Is it also possible to print the entries that are present in both files?

# Content present in both files
$ awk 'NR==FNR {a[$1];next} ($1 in a)' file2 file1
u4
z8

# Content of file1 which is not in file2
$ awk 'NR==FNR {a[$1];next} !($1 in a)' file2 file1
user
u2
u8
a9
p9
p3
aaa
ahe
oktlo
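
On the earlier question about hardcoding "file2" in the message: awk keeps the name of the file it is currently reading in the built-in variable FILENAME, so the name can be captured while the first file is loaded. A minimal sketch, assuming the sample data is saved as file1 and file2; the variable name ref is only illustrative.

```shell
# Recreate the sample data from the question in a scratch directory.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' user u2 u8 a9 p9 p3 u4 z8 aaa ahe oktlo > file1
printf '%s\n' myme filep u3 u4 z8 > file2

# While reading the first argument, remember its name in "ref" so the
# message does not need the file name typed into the script.
report=$(awk 'NR==FNR {a[$1]; ref=FILENAME; next}
              !($1 in a) {print $0 " is NOT present in " ref}' file2 file1)
printf '%s\n' "$report"
```

Swapping the arguments then changes the message along with the comparison.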

It is a mistake to use regular expression matching when the "patterns" to be matched are not patterns but literal strings. Consider what would happen if one of the literal strings contains a regular expression metacharacter. Perhaps that's extremely unlikely with usernames, but it should be considered.

Even if regular expression matching is disabled, there is still the problem of substring matches, since there is no anchoring in effect.

To disable both regular expression matching (in favor of fixed string matching) and (assuming one username per line) to require whole-line matching to prevent substring matches:

grep -vxFf ...
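
To make the difference concrete, here is a small sketch with made-up data (the names u44, abc and a.c are hypothetical, chosen only to trigger the two problems described above). It compares the plain -v -f form with -vxFf.

```shell
# Hypothetical data in a scratch directory: "u44" contains the pattern
# "u4" as a substring, and "a.c" holds a regex metacharacter (the dot).
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' u4 u44 z8 abc > file1
printf '%s\n' u4 a.c > file2

# Regex, unanchored: "u4" also removes u44, and "a.c" also removes abc.
loose=$(grep -v -f file2 file1)

# Fixed strings (-F) matched against whole lines (-x): only u4 is removed.
strict=$(grep -vxFf file2 file1)

printf 'loose:\n%s\n' "$loose"
printf 'strict:\n%s\n' "$strict"
```

The loose form leaves only z8, while the strict form keeps u44, z8 and abc.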

There is no point in sorting when using grep. Whatever is gained (when a pattern near the head of the pattern list matches one of the first lines of data) is surrendered (when later matches consistently nearly-exhaust the pattern list) and sorting overhead is never recovered.

If you are going to sort the files, comm would be a more efficient alternative.
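
A minimal sketch of the comm approach, assuming the sample data is saved as file1 and file2. comm expects both inputs sorted in the same collation order, so the locale is pinned to C here; the .sorted file names are only illustrative.

```shell
# Recreate the sample data from the question in a scratch directory.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' user u2 u8 a9 p9 p3 u4 z8 aaa ahe oktlo > file1
printf '%s\n' myme filep u3 u4 z8 > file2

# Use one collation order for both sort and comm.
LC_ALL=C
export LC_ALL
sort file1 > file1.sorted
sort file2 > file2.sorted

# Column 1 = lines only in the first file, column 2 = lines only in the
# second file, column 3 = lines in both; each digit suppresses a column.
only_in_file1=$(comm -23 file1.sorted file2.sorted)
only_in_file2=$(comm -13 file1.sorted file2.sorted)
in_both=$(comm -12 file1.sorted file2.sorted)

printf 'only in file1:\n%s\n' "$only_in_file1"
printf 'only in file2:\n%s\n' "$only_in_file2"
printf 'in both:\n%s\n' "$in_both"
```

This covers all three comparisons asked about in the thread with one pair of sorted files.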

Regards,
Alister