Hi Scott,
Thank you for looking into this issue. Your solution works for most of my input file, but it gives the wrong result when the data looks like this.
Input:
123.5|ABC.|.|.|.
234.4|DEF|.|.|.|.|.|
Desired output :
123.5|ABC.|||.
234.4|DEF||||||
But your solution gives the output below.
123.5|ABC|||.
234.4|DEF||||||
So basically, if a string such as ABC contains a '.', it does not work as expected. I think we need to look for the entire string "|.|" instead of ".|".
With the g modifier, text consumed by one match is out of scope for the next match, so the closing | of one "|.|" cannot also serve as the opening | of the following one.
But a loop can do it:
sed '
:L
s/|\.|/||/
tL
'
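To see the difference on the sample data, here is a small sketch. The function names single_pass and loop_pass are made up for illustration; the input line is the second sample from the question. The loop is written with separate -e options, which should behave the same as the multi-line script above.

```shell
# Hypothetical helper names; the sample line is taken from the thread.
line='234.4|DEF|.|.|.|.|.|'

# One global pass: each match consumes its closing |, so every other "." is skipped.
single_pass() { printf '%s\n' "$1" | sed 's/|\.|/||/g'; }

# Loop: t branches back to label L for as long as the s command replaced something.
loop_pass() { printf '%s\n' "$1" | sed -e ':L' -e 's/|\.|/||/' -e 'tL'; }

single_pass "$line"   # prints 234.4|DEF||.||.||
loop_pass "$line"     # prints 234.4|DEF||||||
```

The loop version rescans the line from the start after every replacement, so it does not matter how many "|.|" fields follow each other.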
Another solution is Perl: with a look-ahead, a single RE substitution with the g modifier will do it.
For completeness, here it is:
perl -pe 's/\|\.(?=\|)/|/g'
The (?= ) is the look-ahead; it is hard to remember, so I always consult a tutorial.
Because the look-ahead is not part of the match, the | it checks for is not consumed and must not be restored in the substitution.
Perl (version >= 5) uses extended regular expressions: | means "or", so a literal | must be escaped as \|.
Running sed twice would also fix the issue, but the string may repeat more than twice, and it is hard to know how many times it will repeat in the file.
Hence I used the Perl solution, which works in all conditions.
Thank you all so much for the solutions.
I didn't say to run sed twice; I said to run sed once using the global substitution twice. Running that global substitute twice will take care of ALL occurrences of the pattern you said you wanted to change. You might remember that the 2nd sample input you provided:
234.4|DEF|.|.|.|.|.|
contained 5 occurrences of the pattern you wanted to modify and that sed command produced exactly the output you said you wanted:
234.4|DEF||||||
removing all 5 occurrences of periods between vertical bars; not just two of them.
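The exact command from the earlier post is not quoted here, so the following is only a sketch of the idea, assuming it was a single sed call with the same global substitution given twice. The function name two_pass and the longer test line are made up for illustration.

```shell
# Hypothetical reconstruction: one sed invocation, the same global substitution twice.
two_pass() { printf '%s\n' "$1" | sed -e 's/|\.|/||/g' -e 's/|\.|/||/g'; }

two_pass '123.5|ABC.|.|.|.'        # prints 123.5|ABC.|||.
two_pass '234.4|DEF|.|.|.|.|.|'    # prints 234.4|DEF||||||
two_pass 'A|.|.|.|.|.|.|.|.|B'     # 8 dot fields in a row; prints A|||||||||B
```

The first pass skips at most every other "." in a run, because each match consumes the | that the next field would need. After that pass, every "." left over is preceded by "||" and followed by "|", so the remaining matches no longer overlap and the second pass catches all of them, however long the run is.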