[/tmp]$ echo a.bc | sed -e "s/\|/\\|/g"
|a|.|b|c|
[/tmp]$
Is this behavior of sed expected, or is it a bug in sed?
OS details
Linux 2.6.9-55.0.0.0.2.ELsmp #1 SMP Wed May 2 14:59:56 PDT 2007 i686 i686 i386 GNU/Linux
The problem is perhaps in the quoting you use: the shell is probably "eating" your escape characters, so sed doesn't see what you expect it to see.
I have made it a habit to always use single quotes for sed statements to avoid this. It is even possible to keep the single quotes when using a variable inside a sed statement:
sed 's/'"$src"'/'"$tgt"'/g'
will change occurrences of $src to the value of $tgt (provided neither value contains a "/" or other characters that are special to sed).
I hope this helps.
bakunin
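To check what the shell actually hands to sed under each quoting style, one can print the argument instead of running it. The sketch below is only an illustration: show_arg is a hypothetical helper, and the foo/bar values are made-up sample data. Inside double quotes the shell leaves \| alone but collapses \\ to a single backslash, so only the replacement part of the script differs between the two forms.

```shell
# show_arg is a hypothetical helper (not part of sed): it prints its first
# argument exactly as a called program would receive it.
show_arg() { printf '%s\n' "$1"; }

show_arg "s/\|/\\|/g"   # double quotes: prints s/\|/\|/g
show_arg 's/\|/\\|/g'   # single quotes: prints s/\|/\\|/g

# Splicing variables into an otherwise single-quoted sed script.
# src and tgt hold made-up sample values.
src='foo'
tgt='bar'
echo 'foo and foo' | sed 's/'"$src"'/'"$tgt"'/g'   # prints bar and bar
```

So the quoting only changes whether the replacement is | or \|; the pattern \| reaches sed unchanged in both cases.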
Fine. Single quotes vs. double quotes does have an impact:
[/tmp]$ echo a.bc | sed -e "s/\|/\\|/g"
|a|.|b|c|
[/tmp]$ echo a.bc | sed -e 's/\|/\\|/g'
\|a\|.\|b\|c\|
[/tmp]$
But that does not explain the extra characters in the output. In either case I would expect the output to be a.bc, and not something with | or \| wrapped around every character.
Am I missing something here?
To be honest, now I'm astonished myself:
Lacking a UNIX machine (I have a day off), I fired up Cygwin and tried:
# echo a.bc | sed --posix 's/\|/x/g'
xax.xbxcx
# echo 'a.bc' | sed --posix 's/\|/x/g'
xax.xbxcx
# echo 'a.bc' | sed --posix 's/|/x/g'
a.bc
# sed --version
GNU sed version 4.1.5
Copyright (C) 2003 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE, to the extent permitted by law.
It seems that "\|" matches the empty regexp, whereas "|" matches a single pipe character, as expected. I have no idea why this is the case, but will investigate further.
bakunin
My guess:
\| is used in GNU sed as the alternation operator. Therefore
# echo "a.bc" | sed -e 's/\|/\\|/g'
\|a\|.\|b\|c\|
seems to say "empty" or "empty" (or null?), substitute with \|. An empty pattern matches at every position, hence the result.
If you really want to search for a literal "|", put it inside square brackets:
# echo "a.bc" | sed -e 's/[|]/\\|/g'
a.bc
# echo "a|bc" | sed -e 's/[|]/\\|/g'
a\|bc
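The "matches at every position" effect is not specific to \|. As a rough sketch, any pattern that can match the empty string gives the same picture in GNU sed. Here x* is just a stand-in chosen for illustration, and the two function names are hypothetical helpers, not anything from sed itself.

```shell
# empty_match_demo and escape_pipes are hypothetical helper names.

# x* can match the empty string, so with the g flag the replacement is
# inserted at every position where nothing else matched.
empty_match_demo() { printf '%s\n' "$1" | sed 's/x*/|/g'; }

# A bracket expression matches a literal pipe without relying on how a
# particular sed treats \| .
escape_pipes() { printf '%s\n' "$1" | sed 's/[|]/\\|/g'; }

empty_match_demo 'a.bc'   # GNU sed prints |a|.|b|c|
escape_pipes 'a.bc'       # prints a.bc
escape_pipes 'a|bc'       # prints a\|bc
```

The bracket form is the safer choice when the same script has to run on several sed implementations.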
Wouldn't the alternation operator require at least two operands?
Hmm, seems to be a GNU sed-ism.
Under Interix 6.0 with ksh-93 and the out-of-the-box (non-GNU) sed, it works as expected:
$ echo a.bc | sed -e "s/\|/\\|/g"
a.bc
$
> echo a.bc | sed -e "s/\|/\\|/g"
a.bc
> uname -a
SunOS grape 5.10 Generic_125100-04 sun4u sparc SUNW,Sun-Fire-V440 Solaris
> env | grep SHELL
SHELL=/bin/bash
I tried the same under AIX 5.3:
# what /usr/bin/sed
/usr/bin/sed:
61 1.14 src/bos/usr/ccs/lib/libc/__threads_init.c, libcthrd, bos530 7/11/00 12:04:14
24 1.38 src/bos/usr/bin/sed/sed0.c, cmdedit, bos530 8/27/03 04:21:19
35 1.14.1.21 src/bos/usr/bin/sed/sed1.c, cmdedit, bos53D, d2005_18F0 5/5/05 03:34:10
# instfix -i | grep AIX_ML
All filesets for 5.3.0.0_AIX_ML were found.
Not all filesets for 5300-01_AIX_ML were found.
All filesets for 5300-02_AIX_ML were found.
All filesets for 5300-03_AIX_ML were found.
All filesets for 5300-04_AIX_ML were found.
# echo a.bc | sed -e 's/\|/\\|/g'
a.bc
# echo a.bc | sed -e "s/\|/\\|/g"
a.bc
Seems like you found a GNUism of GNU sed. I find it rather interesting that GNU sed does it even with the "--posix" flag. Isn't that flag supposed to turn all extensions off?
bakunin