SED - replace only on part of the string

Sephiburp · January 13, 2012, 3:17pm

Hello there,
I need some help.

I have a file containing this :
$ cat file
PARM1=(VAL11),PARM2=(VAL21,VAL22,VAL23),PARM3=(VAL31),PARM4=(VAL41,VAL42)

and I need to replace every ',' with '|', but only those between brackets.
Output would be :
PARM1=(VAL11),PARM2=(VAL21|VAL22|VAL23),PARM3=(VAL31),PARM4=(VAL41|VAL42)

I almost found a solution with sed :
sed 's/\(([^)]*\),/\1|/g' file

But not all the commas are replaced :
PARM1=(VAL11),PARM2=(VAL21,VAL22|VAL23),PARM3=(VAL31),PARM4=(VAL41|VAL42)

Any ideas?

Sephi.

Corona688 · January 13, 2012, 3:23pm

I see a pattern here... All the commas you don't want replaced come right after a closing bracket. So we match a character that is not a closing bracket, followed by a comma, capture that character in a backreference, and put a | where the comma was. So:

$ echo "PARM1=(VAL11),PARM2=(VAL21,VAL22,VAL23),PARM3=(VAL31),PARM4=(VAL41,VAL42)" | sed 's/\([^)]\),/\1|/g'

PARM1=(VAL11),PARM2=(VAL21|VAL22|VAL23),PARM3=(VAL31),PARM4=(VAL41|VAL42)

$

in2nix4life · January 13, 2012, 3:24pm

How about this:

export PARM1="(VAL11),PARM2=(VAL21,VAL22,VAL23),PARM3=(VAL31),PARM4=(VAL41,VAL42)"

echo $PARM1 | sed 's/\b\(\,\)\b/|/g'

(VAL11),PARM2=(VAL21|VAL22|VAL23),PARM3=(VAL31),PARM4=(VAL41|VAL42)

Corona688 · January 13, 2012, 3:27pm

That's a new one on me, what's \b?

Shell_Life · January 13, 2012, 3:53pm

Extracted from the 'sed' book:

Sephiburp · January 13, 2012, 5:37pm

Thanks for your reply.

@Corona688 : actually, my first example was not representative. The data I have to handle is messier. Some values are surrounded by brackets, quotes, or both, or by nothing at all. Also, some commas may be surrounded by spaces...

So we should work with this kind of data :

echo "PARM1=('VAL11'),PARM2=(VAL21,VAL22,VAL23), PARM3='VAL31' ,PARM4=(VAL41,VAL42),PARM5 = 'VAL51',PARM6=(VAL61)"

Your sed no longer works on it :

echo "PARM1=('VAL11'),PARM2=(VAL21,VAL22,VAL23), PARM3='VAL31' ,PARM4=(VAL41,VAL42),PARM5 = 'VAL51',PARM6=(VAL61)" | sed 's/\([^)]\),/\1|/g'
PARM1=('VAL11'),PARM2=(VAL21|VAL22|VAL23), PARM3='VAL31' |PARM4=(VAL41|VAL42),PARM5 = 'VAL51'|PARM6=(VAL61)

@in2nix4life : your solution does not work for me; no comma is changed.
I tried other approaches with \b, but they failed too...

Sephi.

Shell_Life · January 13, 2012, 5:45pm

See if this works for you:

sed 's/\([A-Z0-9]\),/\1|/g' File
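
The [A-Z0-9] test looks only at the character before the comma, so it would also change the comma after an unquoted, unbracketed value such as PARM3=VAL31,PARM4. Below is a minimal sketch of a sed-only alternative that looks at the brackets instead, assuming brackets are never nested. The function name pipe_commas_in_brackets is made up for illustration and is not from the thread:

```shell
# Hypothetical helper, not from the thread.
# Each pass replaces the first comma that follows "(" with no ")" or ","
# in between; "ta" jumps back to label "a" until no substitution happens.
pipe_commas_in_brackets() {
    sed -e ':a' -e 's/\(([^),]*\),/\1|/' -e 'ta'
}

printf '%s\n' "PARM1=('VAL11'),PARM2=(VAL21,VAL22,VAL23), PARM3='VAL31' ,PARM4=(VAL41,VAL42),PARM5 = 'VAL51',PARM6=(VAL61)" |
    pipe_commas_in_brackets
# prints:
# PARM1=('VAL11'),PARM2=(VAL21|VAL22|VAL23), PARM3='VAL31' ,PARM4=(VAL41|VAL42),PARM5 = 'VAL51',PARM6=(VAL61)
```

The loop is needed because a single s///g pass cannot revisit text it has already matched, which is why the greedy first attempt only changed the last comma in each bracket group.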

Scrutinizer · January 14, 2012, 4:00am

See if awk works for you:

awk -F\) '{gsub(/,/,"|",$1)}1' RS=\( ORS=\( OFS=\)

\b is not standard sed; it is a GNU sed extension.

Sephiburp · January 14, 2012, 4:39am

It works !

But I don't understand all the parts.

First I rearranged it a bit so that it's easier to understand :

awk '{gsub(",","|",$1)}1' FS=\) RS=\( OFS=\) ORS=\(

Well, I understand that we separate records at '(' and fields at ')'.
So in the first field of each record we are sure that every comma must be replaced with a pipe.
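
To make that splitting visible, here is a small sketch that prints each record number with its first field. The sample line and the function name show_records are made up for illustration:

```shell
# Hypothetical helper: records end at "(", fields are split at ")".
# Prints the record number and the first field of each record.
show_records() {
    awk '{ print NR ": [" $1 "]" }' FS=')' RS='('
}

printf '%s\n' 'A=(1,2),B=(3)' | show_records
# prints:
# 1: [A=]
# 2: [1,2]
# 3: [3]
```

Note that record 1 is the text before the first '(' , so its first field is not bracket content.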

But I don't understand the '1' in {gsub(",","|",$1)}1

_________________________________________________________________________________
EDIT

In fact there is one thing to fix : if the last character of a line is not a ')', the next line gets merged into it :

$ cat file
PARMA1=('VAL11'),PARMA2=(VAL21 ), PARMA3= VAL31 ,PARMA4= VAL41 ,PARMA5 = 'VAL51',PARMA6=(VAL61)
PARMB1=('VAL11'),PARMB2=(VAL21 , VAL22,VAL23), PARMB3= VAL31 ,PARMB4=(VAL41,VAL42),PARMB5 = 'VAL51',PARMB6=(VAL61)
PARMC1=('VAL11'),PARMC2=(VAL21 ), PARMC3= VAL31 ,PARMC4= VAL41 ,PARMC5 = 'VAL51',PARMC6=(VAL61)
PARMD1=('VAL11'),PARMD2=(VAL21 , VAL22,VAL23), PARMD3= VAL31 ,PARMD4=(VAL41,VAL42),PARMD5 = 'VAL51',PARMD6='VAL61'
PARME1=('VAL11'),PARME2=(VAL21 ), PARME3= VAL31 ,PARME4= VAL41 ,PARME5 = 'VAL51',PARME6=(VAL61)

$ cat file | awk '{gsub(",","|",$1)}1' FS=\) RS=\( OFS=\) ORS=\(
PARMA1=('VAL11'),PARMA2=(VAL21 ), PARMA3= VAL31 ,PARMA4= VAL41 ,PARMA5 = 'VAL51',PARMA6=(VAL61)
PARMB1=('VAL11'),PARMB2=(VAL21 | VAL22|VAL23), PARMB3= VAL31 ,PARMB4=(VAL41|VAL42),PARMB5 = 'VAL51',PARMB6=(VAL61)
PARMC1=('VAL11'),PARMC2=(VAL21 ), PARMC3= VAL31 ,PARMC4= VAL41 ,PARMC5 = 'VAL51',PARMC6=(VAL61)
PARMD1=('VAL11'),PARMD2=(VAL21 | VAL22|VAL23), PARMD3= VAL31 ,PARMD4=(VAL41|VAL42),PARMD5 = 'VAL51',PARMD6='VAL61')PARME1=('VAL11'),PARME2=(VAL21 ),.........

Sephi.

Scrutinizer · January 14, 2012, 5:31am

Hi, the 1 means "print the record". A 1 outside the braces is a pattern that is always true, and since it has no action of its own, awk runs the default action, which is {print $0}.
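
A quick way to see this for yourself (a sketch with made-up input; the function names are only for illustration):

```shell
# A bare pattern "1" is always true; with no { action } attached,
# awk falls back to the default action, print $0.
print_with_pattern() { awk '1'; }
print_with_action()  { awk '{ print $0 }'; }

printf 'a\nb\n' | print_with_pattern   # prints a, then b
printf 'a\nb\n' | print_with_action    # prints the same two lines
```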

Strange, I cannot reproduce this effect. I get:

PARMA1=('VAL11'),PARMA2=(VAL21 ), PARMA3= VAL31 ,PARMA4= VAL41 ,PARMA5 = 'VAL51',PARMA6=(VAL61)
PARMB1=('VAL11'),PARMB2=(VAL21 | VAL22|VAL23), PARMB3= VAL31 ,PARMB4=(VAL41|VAL42),PARMB5 = 'VAL51',PARMB6=(VAL61)
PARMC1=('VAL11'),PARMC2=(VAL21 ), PARMC3= VAL31 ,PARMC4= VAL41 ,PARMC5 = 'VAL51',PARMC6=(VAL61)
PARMD1=('VAL11'),PARMD2=(VAL21 | VAL22|VAL23), PARMD3= VAL31 ,PARMD4=(VAL41|VAL42),PARMD5 = 'VAL51',PARMD6='VAL61'
PARME1=('VAL11'),PARME2=(VAL21 ), PARME3= VAL31 ,PARME4= VAL41 ,PARME5 = 'VAL51',PARME6=(VAL61)

Are you on Solaris? If so, use nawk or /usr/xpg4/bin/awk
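
If switching awk is not an option, one possible workaround is to leave RS at its default and split each line on '(' inside the program, so that every input line is printed as exactly one output line. This is only a sketch and not something posted in the thread: the function and variable names are made up, brackets are assumed not to be nested, and an opening bracket with no closing bracket on the same line is left untouched by assumption.

```shell
# Hypothetical helper: line-by-line variant that keeps the default RS.
pipe_commas_per_line() {
    awk '{
        n = split($0, part, /[(]/)      # part[1] is the text before the first "("
        out = part[1]
        for (i = 2; i <= n; i++) {
            close_at = index(part[i], ")")
            if (close_at > 0) {
                inside = substr(part[i], 1, close_at - 1)   # text between ( and )
                rest   = substr(part[i], close_at)          # ")" and what follows
                gsub(/,/, "|", inside)
                out = out "(" inside rest
            } else {
                out = out "(" part[i]   # unclosed bracket: leave as is
            }
        }
        print out
    }'
}

printf '%s\n' "PARMD4=(VAL41,VAL42),PARMD5 = 'VAL51',PARMD6='VAL61'" "PARME1=('VAL11'),PARME2=(VAL21 )" |
    pipe_commas_per_line
# prints:
# PARMD4=(VAL41|VAL42),PARMD5 = 'VAL51',PARMD6='VAL61'
# PARME1=('VAL11'),PARME2=(VAL21 )
```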

Sephiburp · January 14, 2012, 5:51am

I'm on SUSE Linux.
I tried with nawk but got the same result.
Then I tried with gawk and it works !
Thank you.

Sephi.