How to make diff show differences one line at a time and not group them?

mmr11408 · January 11, 2012, 11:04am

Is there a way to tell diff to show differences one line at a time and not to group them? For example, I have two files:

file1:

line 1
line 2
line 3 diff
line 4 diff
line 5 diff
line 6
line 7

file2:

line 1
line 2
line 3 diff.
line 4 diff.
line 5 diff.
line 6
line 7

$ diff  -b -B -U 0 file1 file2
--- file1       2012-01-11 15:58:43.000000000 +0000
+++ file2       2012-01-11 15:59:14.000000000 +0000
@@ -3,3 +3,3 @@
-line 3 diff
-line 4 diff
-line 5 diff
+line 3 diff.
+line 4 diff.
+line 5 diff.

What I need is this:

-line 3 diff
+line 3 diff.
-line 4 diff
+line 4 diff.
-line 5 diff
+line 5 diff.

Corona688 · January 11, 2012, 11:09am

diff is designed to detect insertions and deletions of lines, not just simple changes of lines, and that format would leave a lot to be desired for that purpose; what you want isn't diff, exactly.

I don't understand where your output comes from, either. You seem to be showing the string 'diff' exists in both files when it does not...

mmr11408 · January 11, 2012, 11:16am

The string "diff" does exist in both files (I created the files to show an example). I realize that diff can produce differences for the purpose of merging files, but I am not interested in that. I am creating a report from the output of diff and am filtering out the additional information that diff provides (e.g. ---, +++, @@). The report would be much more useful if the differences were shown one line at a time instead of being grouped. If there is another command/tool that I can use instead of diff for that, I am open to it. Thanks.
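
One way to get that layout while still using diff is to post-process its output. The following is a minimal sketch, assuming `diff -U 0` output for a single pair of files like the sample above and a POSIX awk; `interleave_diff` is a made-up name, not an existing tool. It drops the ---, +++ and @@ lines and alternates the removed and added lines within each hunk.

```shell
# Hypothetical helper (not a standard tool): interleave the "-" and "+"
# lines of each hunk in `diff -U 0` output for one pair of files.
interleave_diff() {
    # diff exits with 1 when the files differ; that is not an error here.
    { diff -U 0 "$1" "$2" || [ "$?" -eq 1 ]; } | awk '
        # Print the buffered hunk, alternating removed and added lines.
        function emit(   i, n) {
            n = (nd > na) ? nd : na
            for (i = 1; i <= n; i++) {
                if (i <= nd) print del[i]
                if (i <= na) print add[i]
            }
            nd = 0; na = 0
        }
        BEGIN { nd = 0; na = 0 }
        # The ---/+++ file headers only appear before the first hunk.
        !inhunk && /^(---|\+\+\+) / { next }
        /^@@/ { emit(); inhunk = 1; next }
        /^-/  { del[++nd] = $0; next }
        /^\+/ { add[++na] = $0; next }
        END   { emit() }
    '
}

# Usage: interleave_diff file1 file2
```

The pairing is purely positional: the first removed line of a hunk is printed next to the first added line, and so on. When a hunk has more removed than added lines (or the reverse), the leftover lines follow the pairs, so this only matches the requested report when changes are line-for-line edits.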

in2nix4life · January 11, 2012, 11:25am

Not sure if this is the path you're trying to travel down, but is this along the lines of what you're trying to achieve?

diff -y file1 file2 | grep '|' | sed 's/\s*|//g'

line 3 diff	line 3 diff.
line 4 diff	line 4 diff.
line 5 diff	line 5 diff.

mmr11408 · January 11, 2012, 11:30am

Thanks. I must have the lines on separate lines (the lines are long and are written into an HTML table for readability).

Corona688 · January 11, 2012, 11:34am

If the files are identical, then why is any difference reported?

mmr11408 · January 11, 2012, 12:01pm

The sample files that I created are not identical. One has extra periods (.) at the end.

methyl · January 11, 2012, 12:16pm

If the "+" or "-" at the start of your output line is important, ignore this post!

If the files are already in sorted order or you are prepared to sort the files for this purpose, then try the unix "comm" command. "man comm".

For the data sample posted there is a quick and dirty approach:

sort file1 file2 | uniq -u

line 3 diff
line 3 diff.
line 4 diff
line 4 diff.
line 5 diff
line 5 diff.

mmr11408 · January 11, 2012, 12:29pm

Thank you. The leading - and + identify which file each entry comes from, so they are important. I am going to try this and see if it works in all cases (I have multiple files that are being compared and reported).

comm -3 file1 file2 | sed 's/^/-/;s/^-\t/+/'
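
The `\t` in that sed expression is a GNU extension and may not be understood by every sed. A slightly more portable spelling of the same idea is sketched below, assuming both files are already sorted (comm requires that) and that no line in the first file starts with a tab; `mark_comm` is a made-up name.

```shell
# Hypothetical helper: prefix lines unique to the first file with "-"
# and lines unique to the second file with "+".
# Both inputs must already be sorted (in the C locale, as forced here).
mark_comm() {
    LC_ALL=C comm -3 "$1" "$2" | awk '
        # comm -3 prints lines unique to file 2 with one leading tab.
        {
            if (sub(/^\t/, "")) print "+" $0
            else                print "-" $0
        }
    '
}

# Usage: mark_comm file1 file2
```

Because comm merges the two sorted inputs, related lines only end up next to each other when they also sort next to each other, as they happen to do in the sample files.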

ahamed101 · January 11, 2012, 12:38pm

Try this... you got to test it out... and yeah this is dirty!

diff -y --suppress-common-lines file1 file2 | sed 's/|[ \t]*/\n+/;s/^[ \t]*>[\t ]*/+/;s/\(.*\)</-\1/;s/^[^-+]/-&/'

--ahamed

in2nix4life · January 11, 2012, 1:18pm

Ok, how about this one:

diff -y file1 file2 | grep '|' | sed 's/\s*|\s/\n/g' | awk '!(NR%2) {$0=$0"\n"} 1'
line 3 diff
line 3 diff.

line 4 diff
line 4 diff.

line 5 diff
line 5 diff.

mmr11408 · January 12, 2012, 4:02pm

Thanks for all your help. There was a little gotcha with each approach so I ended up using colors to distinguish between the entries from the files.

methyl · January 12, 2012, 5:38pm

There has been considerable interest in this thread.

Please post your final solution. The colours (or colors for USA folks) component of the solution is intriguing.

mmr11408 · January 30, 2012, 7:35am

The script compares different types of files (properties files that have many stanzas, lists of stanzas, files that contain crontab listings, ...)

It creates an HTML email and I simply have it create a table and mark lines that start with '-' in one color and the ones starting with '+' a different color. As I am expanding its use, it would be ideal to have diff report differences one line at a time.
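
The actual script was not posted, so the following is only a rough sketch of the coloring step described above. It assumes the input is a stream of lines that start with '-' or '+'; the function name `diff_to_html`, the markup, and the two colors are all invented for illustration.

```shell
# Hypothetical helper: read "-"/"+" prefixed lines on stdin and print an
# HTML table with one colored row per line. Colors are arbitrary choices.
diff_to_html() {
    awk '
        # Escape the characters that are special in HTML text.
        function esc(s) {
            gsub(/&/, "\\&amp;", s)
            gsub(/</, "\\&lt;", s)
            gsub(/>/, "\\&gt;", s)
            return s
        }
        BEGIN    { print "<table>" }
        !/^[-+]/ { next }                 # ignore anything else
        /^-/     { color = "#ffdddd" }    # entry from the first file
        /^\+/    { color = "#ddffdd" }    # entry from the second file
        { printf "<tr><td style=\"background:%s\">%s</td></tr>\n", color, esc($0) }
        END      { print "</table>" }
    '
}

# Usage: some_command_printing_marked_lines | diff_to_html
```

Each line gets its own table row, so long lines stay on separate lines as required earlier in the thread.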