The regular sed command Scrutinizer suggested:
sed -e 's/^&\([^&]\)/\&\&\1/; s/\([^&]\)&$/\1\&\&/' -e :L -e 's/\([^&]\)&\([^&]\)/\1\&\&\2/g; t L' file
could also be written as:
sed -e 's/^&\([^&]\)/\&\&\1/
s/\([^&]\)&$/\1\&\&/
:L
s/\([^&]\)&\([^&]\)/\1\&\&\2/g
t L' file
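As a quick sanity check that the two spellings behave the same, here is a small sketch that feeds one sample line through each form. The function names one_line_form and multi_line_form and the sample input are made up for illustration, and the input comes from a pipe rather than from file.

```shell
# Hypothetical wrappers around the two spellings of the script.
one_line_form() {
    printf '%s\n' "$1" |
    sed -e 's/^&\([^&]\)/\&\&\1/; s/\([^&]\)&$/\1\&\&/' -e :L -e 's/\([^&]\)&\([^&]\)/\1\&\&\2/g; t L'
}

multi_line_form() {
    printf '%s\n' "$1" |
    sed -e 's/^&\([^&]\)/\&\&\1/
s/\([^&]\)&$/\1\&\&/
:L
s/\([^&]\)&\([^&]\)/\1\&\&\2/g
t L'
}

one_line_form   '&a&b&c&'   # prints &&a&&b&&c&&
multi_line_form '&a&b&c&'   # prints &&a&&b&&c&&
```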
The 1st substitute command ( s/^&\([^&]\)/\&\&\1/ ) looks at the start of an input line ( ^ ) for a literal ampersand character ( & ) followed by a single character that is not an ampersand ( [^&] ), remembering the character that matched because it is between a pair of escaped parentheses ( \( ... \) ), and, if there was a match, replaces it with two literal ampersands ( \&\& ) followed by the string that was matched by the 1st expression found between escaped parentheses ( \1 ).
The 2nd substitute command ( s/\([^&]\)&$/\1\&\&/ ) performs equivalent logic, looking for a match at the end of the line ( $ ) instead of at the beginning of the line.
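To see the two anchored substitutions on their own, here is a minimal sketch that applies only the 1st and 2nd commands. The function name double_edges and the sample inputs are my own, chosen just for illustration.

```shell
# Hypothetical helper: apply only the 1st and 2nd substitute commands.
double_edges() {
    printf '%s\n' "$1" |
    sed -e 's/^&\([^&]\)/\&\&\1/' -e 's/\([^&]\)&$/\1\&\&/'
}

double_edges '&abc&'   # prints &&abc&&  (both ends doubled)
double_edges '&&abc'   # prints &&abc    (the leading & is followed by another &, so no match)
double_edges 'a&b'     # prints a&b      (an inner & is left for the 3rd command)
```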
The :L creates a label ( L ) in the script that can be branched to later.
The 3rd substitute command ( s/\([^&]\)&\([^&]\)/\1\&\&\2/g ) looks for a character that is not an ampersand followed by a literal ampersand followed by a character that is not an ampersand and replaces them with the 1st non-ampersand character, two literal ampersand characters, and the 2nd non-ampersand character. The g flag at the end of the command tells sed to make this substitution as many times as it can on the line (without the g flag, it would only make the 1st possible substitution). Note that with input like a&b&c&d&e, a single pass of this substitution will double the ampersand characters after the a and c, but not after the b and d. This is because the b and d were matched as the 2nd non-ampersand character after the a and after the c, and since matches cannot overlap, they can't also be matched as the 1st non-ampersand character in the b&c and d&e substrings in the input.
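The effect of the g flag and of non-overlapping matches can be seen by running the 3rd command once, without the loop. This is only a sketch; the function names one_pass_g and one_pass_first are hypothetical.

```shell
# Hypothetical helpers: a single pass of the 3rd substitute command.
one_pass_g() {       # with the g flag
    printf '%s\n' "$1" | sed 's/\([^&]\)&\([^&]\)/\1\&\&\2/g'
}

one_pass_first() {   # without the g flag
    printf '%s\n' "$1" | sed 's/\([^&]\)&\([^&]\)/\1\&\&\2/'
}

one_pass_g     'a&b&c&d&e'   # prints a&&b&c&&d&e  (b& and d& are skipped)
one_pass_first 'a&b&c&d&e'   # prints a&&b&c&d&e   (only the 1st match is replaced)
```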
The test command ( t L ) tells sed to branch to the line in the script with the label L if and only if a substitute command successfully matched and substituted text since the most recent input line was read or the last t command was executed. This lets it run the 3rd substitute command again if it made one or more substitutions the previous time it was processed.
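Putting the label, the 3rd substitute command, and the branch together shows how the loop picks up the ampersands that one pass misses. A minimal sketch, with a made-up function name double_inner:

```shell
# Hypothetical helper: the :L ... t L loop around the 3rd substitute command.
double_inner() {
    printf '%s\n' "$1" |
    sed -e :L -e 's/\([^&]\)&\([^&]\)/\1\&\&\2/g' -e 't L'
}

# Pass 1 gives a&&b&c&&d&e, pass 2 gives a&&b&&c&&d&&e,
# pass 3 substitutes nothing, so t does not branch and the line is printed.
double_inner 'a&b&c&d&e'   # prints a&&b&&c&&d&&e
double_inner 'a&&b'        # prints a&&b  (already doubled, nothing to do)
```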
It wasn't clear to me from your discussion whether or not a line that contains just a single ampersand character and nothing else is supposed to double that ampersand. The script above does not double an ampersand in this case. If you do want to double an ampersand that is the only character on an input line, you could add another substitute command to take care of that case:
s/^&$/&&/
Note that the ampersands in the replacement string do not have to be escaped in this case. An unescaped & in the replacement string in a substitute command is replaced by the entire string matched by the basic regular expression pattern in the substitute command. Since the string matched in this case is just an &, the replacement strings && and \&\& produce identical results.
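For illustration, here is a sketch of the script with the extra command added, plus a comparison of escaped and unescaped ampersands in the replacement. The function names double_all, repl_unescaped, and repl_escaped are hypothetical, and putting the extra command first is just one possible placement.

```shell
# Hypothetical helper: the full script with the single-ampersand case added first.
double_all() {
    printf '%s\n' "$1" |
    sed -e 's/^&$/&&/' \
        -e 's/^&\([^&]\)/\&\&\1/' \
        -e 's/\([^&]\)&$/\1\&\&/' \
        -e :L -e 's/\([^&]\)&\([^&]\)/\1\&\&\2/g' -e 't L'
}

double_all '&'       # prints &&
double_all '&a&b&'   # prints &&a&&b&&

# With a pattern that matches something other than &, the two forms differ:
repl_unescaped() { printf '%s\n' "$1" | sed 's/ab/&&/'; }     # & means "the matched text"
repl_escaped()   { printf '%s\n' "$1" | sed 's/ab/\&\&/'; }   # \& means a literal &

repl_unescaped 'ab'   # prints abab
repl_escaped   'ab'   # prints &&
```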