Ah.... I wondered how it cut so many lines. It worked for the cases I looked for, but I should have looked closer.
No, that won't do at all. I need to inspect each line, and compare it to the next.
Can you tell me why what I have isn't working for me? I'm trying to take the first line, see if the second line is the first line plus some more characters, and if so, replace the pair with the second line.
Whoops. Obviously I missed the crux of the problem. Apologies for the noise.
---------- Post updated at 09:33 PM ---------- Previous update was at 09:26 PM ----------
How about:
sed 'N; /^\([^\n]\{1,\}\)\n\1/s/.*\n//'
That inspects a pair of lines to see whether the first is a leading substring of the second, then moves on to the next pair. The pairs do not overlap: lines 1-2, 3-4, 5-6, etc. are inspected, not 1-2, 2-3, 3-4. Your problem statement wasn't explicit in this regard, so I chose the simpler of the two to implement.
Example:
$ cat data
ab
abcd
12
345
ef
efgh
$ sed 'N; /^\([^\n]\{1,\}\)\n\1/s/.*\n//' data
abcd
12
345
efgh
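For anyone reading along, here is the same script spread over several lines with comments. It's only a sketch: the function name dedup_pairs is made up for illustration, and it assumes a sed (GNU sed, for instance) that reads \n inside a bracket expression as a newline.

```shell
# dedup_pairs is a hypothetical wrapper name, not something from the thread.
# Assumes GNU sed: \n inside [...] is taken to mean a newline.
dedup_pairs() {
  sed '
    # Append the next input line, so the pattern space holds "line1\nline2".
    N
    # If line2 starts with the whole of a non-empty line1,
    # delete everything up to and including the newline, leaving line2.
    /^\([^\n]\{1,\}\)\n\1/ s/.*\n//
  '
}

printf '%s\n' ab abcd 12 345 ef efgh | dedup_pairs
```

Since N pulls in two lines per cycle, which lines get compared depends entirely on where the pair boundaries fall.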
cat infile
XXXX
ab
abcd
sed 'N; /^\([^\n]\{1,\}\)\n\1/s/.*\n//' infile
XXXX
ab
abcd
cat infile
ab
abcd
sed 'N; /^\([^\n]\{1,\}\)\n\1/s/.*\n//' infile
abcd
---------- Post updated at 11:52 AM ---------- Previous update was at 11:49 AM ----------
And if you see this situation, what's your expected output?
With this implementation, an empty line is never a match. Perhaps it should be. In any case, so long as the handling of empty lines is left unspecified, I'm fine with its level of correctness. I'll propose the simplest solution unless there is an explicit requirement demanding something more complicated.
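A quick way to check the empty-line behaviour, for what it's worth. The input below is made up for the demonstration, and GNU sed is assumed (it reads \n inside a bracket expression as a newline).

```shell
# Made-up input: an empty line followed by "abc". Assumes GNU sed.
script='N; /^\([^\n]\{1,\}\)\n\1/s/.*\n//'
printf '\nabc\n' | sed "$script"
```

The \{1,\} requires at least one character before the newline, so an empty first line cannot match and both lines come through unchanged.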
Your sed's results differ from mine (though the sed code itself is POSIX-compliant). Note: there should be two blank lines at the end of the output, but for some reason the forum's markup is eating them.
$ cat data
XXXX
ab
abcd
$ sed 'N; /^\([^\n]\{1,\}\)\n\1/s/.*\n//' data
XXXX
abcd
As I stated in my previous post, the lines are compared in non-overlapping pairs.
$ cat data
ab
abc
abcd
abcde
$ sed 'N; /^\([^\n]\{1,\}\)\n\1/s/.*\n//' data
abc
abcde
---------- Post updated at 11:29 PM ---------- Previous update was at 10:27 PM ----------
Here's a version that compares overlapping pairs of lines (1-2, 2-3, 3-4, etc.) and considers a blank line to always be a leading substring of the following line (i.e., blank lines are discarded).
sed -n '1{h;d;}; H; x; /^\([^\n]*\)\n\1/!s/\n.*//p; ${g;/./p;}'
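Here is that one-liner written out with comments and run against two small inputs, as a sketch of how the hold space carries the previous line forward. The function name dedup_overlap and the second input (with an empty line in it) are made up for the demo, and GNU sed is assumed for the \n inside the bracket expression.

```shell
# dedup_overlap is a hypothetical wrapper name used only for this demo.
# Assumes GNU sed: \n inside [...] is taken to mean a newline.
dedup_overlap() {
  sed -n '
    # First line: remember it in the hold space and start the next cycle.
    1{h;d;}
    # Hold space becomes "previous\ncurrent"; x swaps it into the pattern
    # space and leaves the current line in the hold space for the next cycle.
    H
    x
    # Previous line is not a leading substring of the current one: print it.
    /^\([^\n]*\)\n\1/!s/\n.*//p
    # Last line: nothing follows it, so print it unless it is empty.
    ${g;/./p;}
  '
}

# Each line extends the one before it, so only the last survives.
printf '%s\n' ab abc abcd abcde | dedup_overlap

# The empty line is dropped, and "ab" is absorbed by "abcd".
printf '%s\n' XXXX '' ab abcd | dedup_overlap
```

The first command should leave just abcde, and the second XXXX followed by abcd.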