How to replace string between delimiters?

Hello,
I need to replace a delimiter in a flat file. I would like to replace the semicolon (";"), but only if it is contained in a string between quotes. For example:

Original flat file example:

abc;abc;"abc;abc";cd;"ef;ef";abc
aa;bb;"aa";cc;"ddd;eee";ff

Desired output:

abc;abc;"abc|abc";cd;"ef|ef";abc
aa;bb;"aa";cc;"ddd|eee";ff

How could I do it using sed or awk?
Thanks in advance.

Such things have been asked several times before.
One of the frequently given answers was

awk 'NR%2==0 { gsub(";","|") } 1' RS='"' ORS='"' file

The trick is to set the Record Separator (RS) to " .
To see how the "lines" (records) are then split, run this:

awk '1' RS='"' file

The rest is easy: in every second record, replace all ; with |

When I try MadeInGermany's code, I get an extraneous " at the end of the file and the resulting file is not a text file (since it is missing a trailing <newline> character after the extraneous " ).
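If you want to keep the RS='"' idea, here is a minimal sketch of one way to avoid the extra quote: print the " in front of every record except the first, instead of using ORS to append it after every record. The function name swap_in_quotes is made up for this example. The sketch assumes that quotes are balanced, that there are no escaped quotes inside quoted strings, and that the file ends with a <newline> rather than with a bare ".

```shell
# Hypothetical helper (name invented for illustration): same RS='"' trick,
# but the quote is written *before* records 2..n, so nothing is appended
# after the last record and the trailing newline of the input is kept.
swap_in_quotes() {
	awk -v RS='"' '
		NR % 2 == 0 { gsub(/;/, "|") }   # even-numbered records lie inside quotes
		{ printf "%s%s", (NR > 1 ? "\"" : ""), $0 }
	' "$@"
}

printf '%s\n' 'abc;abc;"abc;abc";cd;"ef;ef";abc' 'aa;bb;"aa";cc;"ddd;eee";ff' |
	swap_in_quotes
# prints:
# abc;abc;"abc|abc";cd;"ef|ef";abc
# aa;bb;"aa";cc;"ddd|eee";ff
```

Because the last record still contains the final <newline> of the input, the output should again be a text file.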

One possible alternative is:

awk '
BEGIN {	FS = OFS = ";"
}
{	for(i = 1; i <= NF; i++) {
		nq += gsub(/\"/, "&", $i)
		printf("%s%s", $i, (i == NF) ? "\n" : (nq % 2) ? "|" : OFS)
	}
}' file

which uses semicolons as the field separators and replaces field separators found inside double quotes with pipe symbols. If file contains:

abc;abc;"abc;abc";cd;"ef;ef";abc
aa;bb;"aa";cc;"ddd;eee";ff
a;"b";"c;d";"e;f;g"
a;"b";"c;d";"e;f;g";h

it produces the output:

abc;abc;"abc|abc";cd;"ef|ef";abc
aa;bb;"aa";cc;"ddd|eee";ff
a;"b";"c|d";"e|f|g"
a;"b";"c|d";"e|f|g";h
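To see why this works, it may help to trace the running quote count field by field. The sketch below is only a debugging aid, not part of the solution; the function name trace_fields is made up for this example. It splits on semicolons the same way and reports, after each field, whether an odd number of quotes has been seen so far ("inside" a quoted string) or an even number ("outside").

```shell
# Hypothetical trace helper (name invented for illustration).
# Prints: field <TAB> quotes seen so far <TAB> inside|outside
trace_fields() {
	awk -F';' '{
		for (i = 1; i <= NF; i++) {
			tmp = $i
			nq += gsub(/"/, "", tmp)   # count the quotes in a copy of the field
			printf "%s\t%d\t%s\n", $i, nq, (nq % 2) ? "inside" : "outside"
		}
	}' "$@"
}

printf '%s\n' 'a;"b";"c;d"' | trace_fields
# prints (tab-separated):
# a	0	outside
# "b"	2	outside
# "c	3	inside
# d"	4	outside
```

The separator that follows a field marked "inside" is the one that gets written as | instead of ; by the script above.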

Wouldn't just this do?

awk '{for (i=2; i < NF; i+=2) gsub (/;/, "|", $i)} 1' FS=\" OFS=\" file
abc;abc;"abc|abc";cd;"ef|ef";abc
aa;bb;"aa";cc;"ddd|eee";ff
a;"b";"c|d";"e|f|g"
a;"b";"c|d";"e|f|g";h

In sed:

sed 's/\("[a-z]*\);\([a-z]*"\)/\1|\2/g' file

In perl:

perl -pe 's/(;"\w+);/$1|/g' bartleby.file

Output:

abc;abc;"abc|abc";cd;"ef|ef";abc
aa;bb;"aa";cc;"ddd|eee";ff

It works as long as there is no more than one semicolon between double quotes, but it doesn't work with input like:

a;"b";"c;d";"e;f;g"
a;"b";"c;d";"e;f;g";h

producing the output:

a;"b";"c|d";"e;f;g"
a;"b";"c|d";"e;f;g";h

instead of:

a;"b";"c|d";"e|f|g"
a;"b";"c|d";"e|f|g";h

and Aia's perl script produces:

a;"b";"c|d";"e|f;g"
a;"b";"c|d";"e|f;g";h

with the same input.
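For anyone who prefers sed and does need every semicolon inside the quotes changed, here is a sketch of one way to do it with a loop. It is not taken from any of the suggestions above, and the function name sed_swap is made up for this example. It assumes balanced quotes, no escaped quotes, and no quoted string that spans more than one line. Each pass replaces one semicolon that follows an odd-numbered (opening) quote, and the t command repeats the substitution until nothing is left to replace.

```shell
# Hypothetical helper (name invented for illustration).
# \(\([^"]*"[^"]*"\)*  skips any number of complete "..." pairs
# [^"]*"[^";]*\)       then an opening quote and the text up to the next ;
# ;                    the semicolon inside the quotes, replaced by |
sed_swap() {
	sed -e ':again' \
	    -e 's/^\(\([^"]*"[^"]*"\)*[^"]*"[^";]*\);/\1|/' \
	    -e 't again' "$@"
}

printf '%s\n' 'abc;abc;"abc;abc";cd;"ef;ef";abc' 'a;"b";"c;d";"e;f;g";h' | sed_swap
# prints:
# abc;abc;"abc|abc";cd;"ef|ef";abc
# a;"b";"c|d";"e|f|g";h
```

The loop runs once per semicolon replaced, so it will be slower than the awk versions on lines with many quoted semicolons.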

By design. It is based on the OP's example and words, not on your hypothetical input.
You should start a new thread if you would like me to provide a solution that addresses your input.

It is my belief that the original request was to change semicolon delimiters inside double-quoted strings to vertical bar delimiters. But the given examples did not include a case where more than one delimiter appeared inside double quotes. Therefore, the original request is not as clear as I would like.

I did not ask you to make any changes.

I see no reason to start a new thread when pointing out to the original poster that the code suggested by you, the code suggested by greet_sed, and the code suggested by me (and refined to a better suggestion by RudiC) do three different things if there is more than one semicolon inside double quotes: Your suggestion changes the 1st semicolon, greet_sed's suggestion doesn't change anything, and my and RudiC's suggestion changes every semicolon.

I believe it is entirely up to bartleby to determine if any of this matters in the real input he/she will be processing (instead of the contrived 2-line sample input) and, if it does matter, to choose the suggestion that behaves the way bartleby wants it to behave. But, I believe that bartleby should be made aware in this thread that the three suggestions do not do the same thing and should be warned to determine whether the differences in our solutions matter.

I do not doubt your belief. However, by your own words, your belief led you to make assumptions about something you were not clear about to begin with.
I chose to interpret the request as stated:

One semicolon. A single semicolon is referenced in both instances (highlighted in red), followed by a specific example:

It was clear to me as stated. My solution was designed based on the given example, not based on requirements conceived by me. Perhaps it would have been more understandable if you had asked the OP directly for clarification, leaving out the suggestions not given by you, instead of giving an audit of other suggestions based on your assumptions and leaving it to anyone to interpret.

Not quite accurate either. You are still declaring it based on your belief.
My suggestion will replace any semicolon that is preceded by another semicolon, a literal double quote, and one or more alphanumeric characters, in that order. It applies to the example shown, and it will not provide the expected results if the examples shown are not representative. If that's the case, the OP will communicate back if he or she is interested in the suggestion.
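To make that concrete: the pattern (;"\w+); matches a semicolon, a double quote, one or more word characters, and then the semicolon that gets replaced. The sketch below uses sed -E as a stand-in for the perl one-liner, with [[:alnum:]_] playing the role of \w; that swap and the function name first_only are assumptions made for this example, not part of the original suggestion.

```shell
# Hypothetical helper (name invented for illustration); sed -E stand-in for
#   perl -pe 's/(;"\w+);/$1|/g'
first_only() {
	sed -E 's/(;"[[:alnum:]_]+);/\1|/g' "$@"
}

printf '%s\n' 'x;"abc;abc";y' 'x;"e;f;g";y' '"p;q";y' | first_only
# prints:
# x;"abc|abc";y     one semicolon inside the quotes: replaced
# x;"e|f;g";y       only the first semicolon after ;"e is replaced
# "p;q";y           quoted string at the start of the line: no leading ; so no match
```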