Conditional replacements

Hi,
I have a requirement as below
Input

  Jacuzzi,"Jet Rings, Pillows",Accessory,Optional,,9230917,69094,,P556805,69094,FALSE,1,0,,
  Jacuzzi,"Jet Rings, Pillows, Skirt/Apron",Accessory,Optional,,9230917,69094,,P556805,69094,FALSE,1,0,,

Output

  Jacuzzi,"Jet Rings!@% Pillows",Accessory,Optional,,9230917,69094,,P556805,69094,FALSE,1,0,,
  Jacuzzi,"Jet Rings!@% Pillows!@% Skirt/Apron",Accessory,Optional,,9230917,69094,,P556805,69094,FALSE,1,0,,

i.e. every comma inside a double-quoted column has to be replaced by !@%.

To achieve this I have used sed.

sed -e 's/\("[^"][^,]*\),\([^"]*\),\([^"]*[^,]"\)/\1!@%\2!@%\3/g' -e 's/\("[^"][^,]*\),\([^"]*[^,]"\)/\1!@%\2/g' "$file"

But now the requirement has changed. Previously I expected only 1 or 2 commas, i.e. only patterns like "abc,gbf" or "abc,kil,jik" could occur, but now I have to generalize it to any number of commas (1, 2, 3, ...).

Any ideas about how to achieve the above?

Thanks for your help in advance.

Try this sed script

 
:up
s/\("[^,]*\),\(.*"\)/\1!@%\2/
t up

Many thanks for posting the solution. It works great.

---------- Post updated at 12:14 PM ---------- Previous update was at 11:50 AM ----------

Can you please explain how it works? I understand that it searches the entire part between the quotes, but I am not sure about the ":up" and "t up" parts.

How about this,

 perl -nle 'if(/(.*)(".+?")(.*)/){$vr=$3;printf "%s",$1;($repl=$2)=~ s/,/!@%/g; print $repl,$vr;}' inputfile

Found the answer thanks to Google.
I am pasting it below for reference, in case someone else is not familiar with sed branching.

$ sed ':label command(s) t label'

:label - defines a label.
command(s) - any sed command(s).
label - any name for the label.
t label - jumps to the label only if a substitute command has modified the pattern space since the last input line was read or the last t branch was taken. If no label is given, it jumps to the end of the script.
b label - jumps to the label unconditionally. If no label is given, it jumps to the end of the script.
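
To watch the loop run, here is a small sketch that prints the pattern space after each successful substitution. The function name trace_commas and the sample record are made up for illustration; the substitution itself is the one from the ":up" script earlier in the thread.

```shell
# Hypothetical helper: show every intermediate state of the ":up ... t up" loop.
# -n suppresses the automatic print; the p flag prints only when s/// succeeds.
trace_commas() {
  sed -n -e ':up' \
         -e 's/\("[^,]*\),\(.*"\)/\1!@%\2/p' \
         -e 't up'
}

# Made-up record with a single quoted column containing two commas.
printf '%s\n' 'x,"a,b,c",y' | trace_commas
```

For this record it should print two lines, each with one more comma replaced than the one before. On the third pass the substitution fails, so t does not jump and sed moves on to the next input line.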

---------- Post updated at 10:19 PM ---------- Previous update was at 01:11 PM ----------

The sed code

:up
s/\("[^,]*\),\(.*"\)/\1!@%\2/
t up

will not work properly if the input record is of this type:

Jacuzzi,"Jet Rings!@% Pillows",Accessory,"abc,def",Optional,,9230917,69094,,P556805,69094,FALSE,1,0,,

i.e. if there are multiple columns with double quotes, the sed command fails.
Any ideas how to overcome this limitation?

Thanks for your help in advance.

Use this in that case:

 
:up1
s/\("[^,]*\),\(.*"\),/\1!@%\2,/
t up1
:up2
s/\("[^,]*\),\(.*"\)$/\1!@%\2/
t up2

If you look carefully, a closing double quote is always followed either by a comma or by the end of the line. Based on that, we can pair up the double quotes. The code above relies on that logic.
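
The same pairing idea can also be written by anchoring at the start of the line and skipping complete "..." pairs, so that only a comma sitting after an unclosed quote is replaced. This is a different technique from the script above, offered as a minimal sketch: it assumes the quotes on each line are balanced and that no quoted field spans several lines. The function name protect_commas is made up for illustration.

```shell
# Hypothetical helper: replace commas inside double-quoted fields with !@%.
# \([^"]*"[^"]*"\)*  skips any number of complete quoted fields,
# [^"]*"[^",]*       then enters one more quoted field and stops at its first comma.
protect_commas() {
  sed -e ':up' \
      -e 's/^\(\([^"]*"[^"]*"\)*[^"]*"[^",]*\),/\1!@%/' \
      -e 't up'
}

# Record from the thread with two quoted columns.
printf '%s\n' 'Jacuzzi,"Jet Rings!@% Pillows",Accessory,"abc,def",Optional,,9230917' |
  protect_commas
```

Because each pass removes exactly one comma from inside a quoted field, the loop stops once none are left. The position of the quoted column, including the end of the line, does not matter.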

Input

yuzzi,"Jet Rings, Pillows, Skirt/Apron",Accessory,"abc,reg",,9230917,69094,,P556805,69094,FALSE,1,0,,

with

:up1
s/\("[^,]*\),\(.*"\),/\1!@%\2,/
t up1

the output became

yuzzi,"Jet Rings!@% Pillows!@% Skirt/Apron"!@%Accessory!@%"abc!@%reg",,9230917,69094,,P556805,69094,FALSE,1,0,,

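I modified the code slightly so that the character just before the comma must be neither a comma nor a quote, which stops the loop from matching the comma that follows a closing quote. Now it works fine:

:up1
  s/\("[^,"]*[^,"]\),\(.*",\)/\1!@%\2/
  t up1

Output

yuzzi,"Jet Rings!@% Pillows!@% Skirt/Apron",Accessory,"abc!@%reg",,9230917,69094,,P556805,69094,FALSE,1,0,,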

Many thanks for helping me out.