awk to find differences between two files

I am trying to find the differences between the two sorted, tab separated, attached files. Thank you :).

In update2 there are 52,058 lines and in current2 there are 52,197, so 139 differences should result.

However,

awk 'FNR==NR{a[$0];next}!($0 in a)' update2 current2 > out2
comm -1 -3 update2 current2 > out2

each just outputs the entire file, not only the 139 differing lines. Thank you :).

edit: current2 was not tab separated as I thought, sorry. Everything works.

139 is the difference in lines between both files:

echo $(( $(wc -l < current2) - $(wc -l < update2) ))
139
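Note that this number only says how many more lines one file has, not how many lines actually differ. A small sketch with made-up files (`f1_demo` and `f2_demo` are hypothetical names, not the attached files) shows the two counts disagreeing:

```shell
# Hypothetical sorted sample files; they share only the line "a".
printf 'a\nb\nc\n' > f1_demo
printf 'a\nx\ny\nz\n' > f2_demo

# Difference in line counts, computed the same way as above.
count_diff=$(( $(wc -l < f2_demo) - $(wc -l < f1_demo) ))

# Number of lines that appear only in the second file.
only_in_f2=$(comm -13 f1_demo f2_demo | wc -l)

echo "line-count difference: $count_diff"
echo "lines only in f2_demo: $(( only_in_f2 ))"
```

So the two numbers match only when the longer file consists of the shorter file plus some extra lines.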

What you appear to be requesting is the lines unique to either file when whitespace is not significant. What is not clear is what to do if a line is repeated within the same file.
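If the separator is the only thing that differs, one option is to normalize both files to single tabs first and then reuse the awk from the question. This is a minimal sketch, assuming no field contains embedded blanks; `update_demo` and `current_demo` are made-up sample files, not the attached ones:

```shell
# Hypothetical sample data: update_demo is tab separated, current_demo is space separated.
printf 'chr1\t100\tA\nchr1\t200\tC\n' > update_demo
printf 'chr1 100 A\nchr1 200 C\nchr2 300 G\n' > current_demo

# Assigning $1=$1 makes awk rebuild the record with OFS,
# so every run of blanks between fields becomes a single tab.
awk -v OFS='\t' '{$1=$1} 1' update_demo  > update_demo.tab
awk -v OFS='\t' '{$1=$1} 1' current_demo > current_demo.tab

# Same idiom as in the question: remember every line of the first file,
# then print the lines of the second file that were not seen.
awk 'FNR==NR{a[$0];next}!($0 in a)' update_demo.tab current_demo.tab > out_demo
cat out_demo
```

With this sample data only the `chr2` line is printed, because the other two lines match once the separators agree.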

Here's a Perl version that outputs only the lines found exactly once after processing every line from both files, whitespace agnostic.

perl -anle '$d{(join "\t", @F)}++;END{for(keys %d){print if $d{$_}==1}}' current2  update2 > difference.output
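For anyone who prefers to stay in awk, the same counting idea could be sketched roughly as follows. The file names `upd_demo` and `cur_demo` are made up for illustration, and the behaviour is meant to mirror the Perl one-liner: a line repeated within one file, or present in both files, is dropped.

```shell
# Hypothetical sample data with mixed separators.
printf 'a\t1\nb\t2\n' > upd_demo
printf 'a 1\nb 2\nc 3\n' > cur_demo

# Rebuild each line with tab-separated fields, count how often it occurs
# across both files, and print the lines seen exactly once.
# The order of "for (k in n)" is unspecified, hence the sort.
awk -v OFS='\t' '{$1=$1; n[$0]++} END{for (k in n) if (n[k]==1) print k}' \
    cur_demo upd_demo | sort > diff_demo
cat diff_demo
```

With this sample data only the `c` line remains, since the other lines occur once in each file.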

Thank you very much :)