I was wondering if somebody could help me with something on UNIX. I have a file that looks like this -
"nelson,bill","bill","123 Main St","Mpls","MN",55444,8877,william
I want to replace all comma with pipes (|), except if the comma is within double quotes. (The first field is an example of this.) I can't do do a sed looking for "," and replacing that with a pipe because not all fields have double quotes around them. Side note - I do not have access to Perl, so that's not an option.
That worked for the sample that I gave you, but I thought of another scenario that I need to account for. What would happen if I added another comma to the end and then some more text? Sample -
"nelson,bill,jr","bill","123 Main St","Mpls","MN",55444,8877,william,bill
I tried modifying your sed command and couldn't figure out how to make it work.
Nelson,
Per my previous reply, your sample led me to believe the following:
1) Every non-numeric field would be surrounded by double quotes.
except for "william".
2) Every numeric field would not be surrounded by double quotes.
If the specs give reason to question, no final solution will be found.
Shell,
this is a typical CSV conversion task - you cannot make any assumptions of what type of FIELD you're dealing with.
The only assumptions you can make are:
This is a CSV-type file with ',' as a field separator
There can be embedded ',' characters (plural) in any field
Sorry Shell, I tried to come up with every situation in my head, but missed that one scenario. I'm trying to come up with a solution without having an actual test file from my customer yet, trying to be proactive.
Also, that last reply using nawk worked great. Thanks for all your help.