I need to replace consecutive double quotes in a CSV file. The data in the file is enclosed in double quotes, but in some places the quotes are repeated.
An example is below.
Incoming data:
"Pacific Region"|"PNG"|"Jimmy""|""|
Desired output:
"Pacific Region"|"PNG"|"Jimmy"|""|
Please note that empty double quotes are perfectly valid here and I need to retain them.
The code I tried removes the consecutive quotes, but it also removes the empty strings, which I don't want because an empty string is a perfectly valid value.
The sample data you provided in post #1 shows the vertical bar character as a field terminator, not a field separator (there is no data after the last vertical bar on a line). To produce the output you requested with sed, one could try:
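A minimal sketch along those lines, assuming every field ends with a vertical bar and the stray quotes only appear immediately before that terminator. The function name fix_trailing_quotes is made up for illustration, and this is not necessarily the command originally posted:

```shell
# Hypothetical helper: collapse a run of quotes before a field terminator
# into a single quote. [^|"] requires a data character in front of the run,
# so an empty field (""|) has nothing to match against and is left alone.
fix_trailing_quotes() {
  sed 's/\([^|"]\)""*|/\1"|/g'
}

# Try it on the sample line from the question.
printf '%s\n' '"Pacific Region"|"PNG"|"Jimmy""|""|' | fix_trailing_quotes
```

This only repairs quotes at the end of a field. Doubled quotes at the start of a field would need a second substitution.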
For an easy substitution, you can add a delimiter at the beginning and another one at the end of each line, so that every field has a delimiter on both sides, and delete them afterwards.
sed '
# add extra delimiters
s/^/|/
s/$/|/
# replace "" with " at the end of a field; [^|] skips empty fields ("")
s/\([^|]\)""|/\1"|/g
# replace "" with " at the start of a field
s/|""\([^|]\)/|"\1/g
# delete the extra delimiters
s/^|//
s/|$//
' filename
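To check the behaviour without touching a file, the same substitutions can be wrapped in a shell function and fed test lines on standard input. This is only a sketch: the function name fix_border_quotes and the second test line are assumptions made for illustration, not part of the original data.

```shell
# Hypothetical wrapper around the same six substitutions, reading stdin.
fix_border_quotes() {
  sed -e 's/^/|/' -e 's/$/|/' \
      -e 's/\([^|]\)""|/\1"|/g' \
      -e 's/|""\([^|]\)/|"\1/g' \
      -e 's/^|//' -e 's/|$//'
}

# First line: the sample from the question.
# Second line (made up): doubled quotes at the very start and very end of
# the line, with no trailing terminator.
printf '%s\n' '"Pacific Region"|"PNG"|"Jimmy""|""|' '""Asia"|""|"Fiji""' |
  fix_border_quotes
```

The second line shows why the extra delimiters help: the first and last fields get a vertical bar on both sides, so the same two patterns cover them. Only exactly doubled quotes are handled, so a run of three or more quotes would need a repeated substitution.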