SED - replace only on part of the string

Hello there,
I need some help.

I have a file containing this :
$ cat file
PARM1=(VAL11),PARM2=(VAL21,VAL22,VAL23),PARM3=(VAL31),PARM4=(VAL41,VAL42)

and I need to replace every ',' with '|', but only those that are between brackets.
The output should be:
PARM1=(VAL11),PARM2=(VAL21|VAL22|VAL23),PARM3=(VAL31),PARM4=(VAL41|VAL42)

I almost found a solution with sed:
sed 's/\(([^)]*\),/\1|/g' file

But not all the commas are replaced:
PARM1=(VAL11),PARM2=(VAL21,VAL22|VAL23),PARM3=(VAL31),PARM4=(VAL41|VAL42)

Any ideas?

Sephi.

I see a pattern here... All the commas you don't want replaced are right after a bracket. So we match a not-bracket with a comma after it, putting the not-bracket into the backreference, and putting a | where the comma was. So:

$ echo "PARM1=(VAL11),PARM2=(VAL21,VAL22,VAL23),PARM3=(VAL31),PARM4=(VAL41,VAL42)" | sed 's/\([^)]\),/\1|/g'

PARM1=(VAL11),PARM2=(VAL21|VAL22|VAL23),PARM3=(VAL31),PARM4=(VAL41|VAL42)

$
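A side note on why the original attempt misses some commas: [^)]* is greedy, so each match runs up to the last comma inside a bracket pair, and the g flag then carries on after that match instead of rescanning it. A minimal sketch of one way around that is to repeat the substitution until nothing changes, using the :a label and the t command (both are in POSIX sed). The variable names line and result are only for illustration:

```shell
# Sample line from the question.
line='PARM1=(VAL11),PARM2=(VAL21,VAL22,VAL23),PARM3=(VAL31),PARM4=(VAL41,VAL42)'

# :a  defines a label.
# s   replaces one comma that follows "(" with no ")" in between.
# ta  jumps back to the label whenever the substitution changed something.
result=$(printf '%s\n' "$line" | sed -e ':a' -e 's/\(([^)]*\),/\1|/' -e 'ta')

printf '%s\n' "$result"
```

Because the pattern insists on an opening bracket with no closing bracket before the comma, commas outside brackets should be left alone even when they are surrounded by quotes or spaces.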

How about this:

export PARM1="(VAL11),PARM2=(VAL21,VAL22,VAL23),PARM3=(VAL31),PARM4=(VAL41,VAL42)"

echo "$PARM1" | sed 's/\b\(\,\)\b/|/g'

(VAL11),PARM2=(VAL21|VAL22|VAL23),PARM3=(VAL31),PARM4=(VAL41|VAL42)

That's a new one on me, what's \b?

Extracted from the 'sed' book:

Thanks for your reply.

@Corona688: actually my first example was not representative. The data I have to handle is dirtier: a value may be surrounded by brackets, quotes, both, or nothing at all. Also, some commas may be surrounded by spaces...

So we should work with this kind of data :

echo "PARM1=('VAL11'),PARM2=(VAL21,VAL22,VAL23), PARM3='VAL31' ,PARM4=(VAL41,VAL42),PARM5 = 'VAL51',PARM6=(VAL61)"

Your sed does not work anymore :cry: :

echo "PARM1=('VAL11'),PARM2=(VAL21,VAL22,VAL23), PARM3='VAL31' ,PARM4=(VAL41,VAL42),PARM5 = 'VAL51',PARM6=(VAL61)" | sed 's/\([^)]\),/\1|/g'
PARM1=('VAL11'),PARM2=(VAL21|VAL22|VAL23), PARM3='VAL31' |PARM4=(VAL41|VAL42),PARM5 = 'VAL51'|PARM6=(VAL61)

@in2nix4life: your solution does not work for me; no comma is changed.
I tried other approaches with \b but failed...

Sephi.

See if this works for you:

sed 's/\([A-Z0-9]\),/\1|/g' File

See if awk works for you:

awk -F\) '{gsub(/,/,"|",$1)}1' RS=\( ORS=\( OFS=\)
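To see what the one-liner is working with, here is a small sketch that prints the first field of each record using the same separators. The input string is made up for illustration, and the separators are passed with -v instead of after the program, which behaves the same for this purpose:

```shell
# Records are split on "(" and fields on ")".
# The sample input 'A=(1,2),B=(3)' is hypothetical.
result=$(printf '%s\n' 'A=(1,2),B=(3)' |
  awk -v RS='(' -v FS=')' '{ printf "record %d: [%s]\n", NR, $1 }')

printf '%s\n' "$result"
```

Every record after the first starts right after a '(', so its first field is exactly the text inside one bracket pair. The first record is whatever precedes the first '(' in the input, so a comma in that leading text would be replaced as well.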

\b is not standard sed, but GNU sed only

It works !

But I don't understand all the parts.

First I made some changes so that it's easier to understand:

awk '{gsub(",","|",$1)}1' FS=\) RS=\( OFS=\) ORS=\( 

I understand that we split records on '(' and fields on ')'.
So in the first field of each record, we can be sure that every comma must be replaced by a pipe.

But I don't understand the '1' in {gsub(",","|",$1)}1

_________________________________________________________________________________
EDIT

In fact there is one thing to fix: if the last character of a line is not a ')', the next line is merged into it:

$ cat file
PARMA1=('VAL11'),PARMA2=(VAL21 ), PARMA3= VAL31 ,PARMA4= VAL41 ,PARMA5 = 'VAL51',PARMA6=(VAL61)
PARMB1=('VAL11'),PARMB2=(VAL21 , VAL22,VAL23), PARMB3= VAL31 ,PARMB4=(VAL41,VAL42),PARMB5 = 'VAL51',PARMB6=(VAL61)
PARMC1=('VAL11'),PARMC2=(VAL21 ), PARMC3= VAL31 ,PARMC4= VAL41 ,PARMC5 = 'VAL51',PARMC6=(VAL61)
PARMD1=('VAL11'),PARMD2=(VAL21 , VAL22,VAL23), PARMD3= VAL31 ,PARMD4=(VAL41,VAL42),PARMD5 = 'VAL51',PARMD6='VAL61'
PARME1=('VAL11'),PARME2=(VAL21 ), PARME3= VAL31 ,PARME4= VAL41 ,PARME5 = 'VAL51',PARME6=(VAL61)

$ cat file | awk '{gsub(",","|",$1)}1' FS=\) RS=\( OFS=\) ORS=\(
PARMA1=('VAL11'),PARMA2=(VAL21 ), PARMA3= VAL31 ,PARMA4= VAL41 ,PARMA5 = 'VAL51',PARMA6=(VAL61)
PARMB1=('VAL11'),PARMB2=(VAL21 | VAL22|VAL23), PARMB3= VAL31 ,PARMB4=(VAL41|VAL42),PARMB5 = 'VAL51',PARMB6=(VAL61)
PARMC1=('VAL11'),PARMC2=(VAL21 ), PARMC3= VAL31 ,PARMC4= VAL41 ,PARMC5 = 'VAL51',PARMC6=(VAL61)
PARMD1=('VAL11'),PARMD2=(VAL21 | VAL22|VAL23), PARMD3= VAL31 ,PARMD4=(VAL41|VAL42),PARMD5 = 'VAL51',PARMD6='VAL61')PARME1=('VAL11'),PARME2=(VAL21 ),.........

Sephi.

Hi, the 1 means "print the record". It is a pattern that is always true, and a pattern outside the braces with no action of its own makes awk run the default action, which is {print $0}.
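A quick way to convince yourself, as a small sketch with made-up input: the bare pattern and the explicit action should print exactly the same thing. The variable names are only for illustration.

```shell
# A true pattern with no action: awk prints every record.
with_pattern=$(printf 'one\ntwo\n' | awk '1')

# The same thing spelled out as an explicit action.
with_action=$(printf 'one\ntwo\n' | awk '{ print $0 }')

printf '%s\n' "$with_pattern"
printf '%s\n' "$with_action"
```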

Strange, I cannot reproduce this effect. I get:

PARMA1=('VAL11'),PARMA2=(VAL21 ), PARMA3= VAL31 ,PARMA4= VAL41 ,PARMA5 = 'VAL51',PARMA6=(VAL61)
PARMB1=('VAL11'),PARMB2=(VAL21 | VAL22|VAL23), PARMB3= VAL31 ,PARMB4=(VAL41|VAL42),PARMB5 = 'VAL51',PARMB6=(VAL61)
PARMC1=('VAL11'),PARMC2=(VAL21 ), PARMC3= VAL31 ,PARMC4= VAL41 ,PARMC5 = 'VAL51',PARMC6=(VAL61)
PARMD1=('VAL11'),PARMD2=(VAL21 | VAL22|VAL23), PARMD3= VAL31 ,PARMD4=(VAL41|VAL42),PARMD5 = 'VAL51',PARMD6='VAL61'
PARME1=('VAL11'),PARME2=(VAL21 ), PARME3= VAL31 ,PARME4= VAL41 ,PARME5 = 'VAL51',PARME6=(VAL61)

Are you on Solaris? If so use nawk or /usr/xpg4/bin/awk
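If changing RS is what trips up your awk, a sketch that leaves the record separator alone may be more portable. It walks each line character by character and only replaces commas while inside a bracket pair. The function name pipe_inside_brackets is hypothetical, and the two input lines are the last two from your file:

```shell
# Hypothetical helper: reads lines on stdin, writes the converted lines.
pipe_inside_brackets() {
  awk '{
    out = ""
    depth = 0                      # how many "(" are currently open
    n = length($0)
    for (i = 1; i <= n; i++) {
      c = substr($0, i, 1)
      if (c == "(") depth++
      else if (c == ")" && depth > 0) depth--
      else if (c == "," && depth > 0) c = "|"
      out = out c
    }
    print out
  }'
}

result=$(printf '%s\n' \
  "PARMD1=('VAL11'),PARMD2=(VAL21 , VAL22,VAL23), PARMD3= VAL31 ,PARMD4=(VAL41,VAL42),PARMD5 = 'VAL51',PARMD6='VAL61'" \
  "PARME1=('VAL11'),PARME2=(VAL21 ), PARME3= VAL31 ,PARME4= VAL41 ,PARME5 = 'VAL51',PARME6=(VAL61)" |
  pipe_inside_brackets)

printf '%s\n' "$result"
```

Since awk keeps its default line-based records here, a line that does not end in ')' cannot be merged with the next one.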

I'm on SUSE Linux.
I tried with nawk but got the same result.
Then I tried with gawk and it works! :slight_smile:
Thank you.

Sephi.