Identify rows in 2 text files that don't match

Hi there,

I am trying to identify rows in 2 text files that don't match. Here is my file structure:

file 1 rows = 81214108:

1 1:10177 0 10177 AC A
1 1:10235 0 10235 TA T
1 1:10352 0 10352 TA T

file 2 rows = 81141639:

1 1:10177 0 10177 AC A
1 1:10235 0 10235 TA T
1 1:10245 0 10245 G A

I tried this, but there's nothing in the output file. That can't be right, because the row counts are different and I know that there are rows that don't match. Any advice on making this work?

awk 'NR==FNR {exclude[$0];next} !($0 in exclude)' file1.txt file2.txt > no_match.txt

@ellie_story_2020, 2 points to remember when posting:

  1. Don't hijack other people's threads - start your own.

  2. Please start using markdown code tags as required by the forum rules - you've been warned a number of times including in this thread. Consider it as a warning!!

Ohh, I'm sorry. Scrutinizer had said this: "Please start using markdown code tags when posting code and data samples - this can potentially improve traction of your threads" and I thought that meant to reply to posts that were similar to get more people to read it. I misunderstood.

I just clicked on the markdown code link you sent me. I don't completely understand. Based on the link you sent me, should the post be in a specific format? I am thinking this is the case, but I believe that my post was generally in the format required. Alternatively, when you say use the markdown code tags, do you mean select a category when posting?

Thanks for your help and patience. Again, it was unintentional. I just didn't/don't understand the markdown codes so clarification around this would be great.

Hello ellie_story_2020,

Can you show us what you have tried so far and what your style is?

It might just be that you need to have a play with the diff command. It depends on what you want to know and what output you can accept. There are various flags you can use to influence the output when comparing two files.
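As a rough sketch of what I mean, assuming two small sample files built from the rows in the first post (the temporary directory and the variable names are only for illustration): plain diff marks rows found only in the first file with "<" and rows found only in the second file with ">". With roughly 81 million rows per file, diff may be slow or need a lot of memory, so treat this as a starting point rather than a finished solution.

```shell
# Build two tiny sample files from the rows shown in the question.
workdir=$(mktemp -d)
printf '%s\n' '1 1:10177 0 10177 AC A' '1 1:10235 0 10235 TA T' '1 1:10352 0 10352 TA T' > "$workdir/file1.txt"
printf '%s\n' '1 1:10177 0 10177 AC A' '1 1:10235 0 10235 TA T' '1 1:10245 0 10245 G A' > "$workdir/file2.txt"

# diff exits with status 1 when the files differ; "|| true" keeps a script
# running under "set -e" from stopping here.
# grep keeps only the changed rows: "<" = only in file1, ">" = only in file2.
diff_out=$(diff "$workdir/file1.txt" "$workdir/file2.txt" | grep '^[<>]' || true)
printf '%s\n' "$diff_out"
```

On the sample rows this prints the 10352 row prefixed with "<" and the 10245 row prefixed with ">".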

To help us help you, can you tell us the following:

  • What would your desired output be?
  • What shell are you using?
  • What OS and version are you using (the output of uname -a would help)?

Kind regards,
Robin

As per the header, I assume you are trying to compare both files and find the rows that don't match.

Try this (based on that assumption):

cat file1.txt file2.txt | sort | uniq -c | awk '$1 == 1 {$1=""; print "Unmatched Lines:" $0}'
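One caveat with the count-based pipeline: a row that appears twice in one file and never in the other gets a count of 2, so it is not reported. A possible variation, sketched here on sample files built from the rows in the question (the temporary directory, the ".sorted" file names, and the variable names are assumptions for illustration), is to sort each file separately and use comm, which also tells you which file each unmatched row came from.

```shell
# Build two tiny sample files from the rows shown in the question.
workdir=$(mktemp -d)
printf '%s\n' '1 1:10177 0 10177 AC A' '1 1:10235 0 10235 TA T' '1 1:10352 0 10352 TA T' > "$workdir/file1.txt"
printf '%s\n' '1 1:10177 0 10177 AC A' '1 1:10235 0 10235 TA T' '1 1:10245 0 10245 G A' > "$workdir/file2.txt"

# comm needs both inputs sorted in the same collation order, so force the C locale.
LC_ALL=C sort "$workdir/file1.txt" > "$workdir/file1.sorted"
LC_ALL=C sort "$workdir/file2.txt" > "$workdir/file2.sorted"

# -23 suppresses columns 2 and 3, leaving rows found only in file1.
only_in_1=$(LC_ALL=C comm -23 "$workdir/file1.sorted" "$workdir/file2.sorted")
# -13 suppresses columns 1 and 3, leaving rows found only in file2.
only_in_2=$(LC_ALL=C comm -13 "$workdir/file1.sorted" "$workdir/file2.sorted")

printf 'Only in file1: %s\n' "$only_in_1"
printf 'Only in file2: %s\n' "$only_in_2"
```

On the sample rows, the 10352 row is reported as only in file1 and the 10245 row as only in file2. Sorting files of this size needs temporary disk space, so check that there is enough room first.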

Hello @Mannu25251, first of all, thank you for your contributions to the forums. I notice that you keep contributing both questions and answers; please keep it up.

Now, on to a mutual understanding among forum members: when any user, forum advisor, moderator, or admin asks the OP a question, we usually respect that and wait for the OP to get back to us with the requested information. This is just for your information; you are new to the forums, so I thought I would make you aware of it.

Please keep up the good work of learning and sharing on the forums. Cheers.

Thanks,
R. Singh
