Help Replacing Characters in Flat File

I was wondering if somebody could help me with something on UNIX. I have a file that looks like this -

"nelson,bill","bill","123 Main St","Mpls","MN",55444,8877,william

I want to replace all comma with pipes (|), except if the comma is within double quotes. (The first field is an example of this.) I can't do do a sed looking for "," and replacing that with a pipe because not all fields have double quotes around them. Side note - I do not have access to Perl, so that's not an option.

Nelson,
See if this works for you:

sed -e 's/",/"|/g' -e 's/\([0-9]\),/\1|/g' input_file

Shell,
won't work for this pattern:

"nelson,bill","bill","123 Main St","Mpls","MN",55444,8877,william,foo

Vgersh,
I agree with you, but the sample is not precise as it has two strings
treated in two different ways:

I tried to solve the problem based on the sample data.
Thank you for analizing it.

That worked for the sample that I gave you, but I thought of another scenario that I need to account for. What would happen if I added another comma to the end and then some more text? Sample -

"nelson,bill,jr","bill","123 Main St","Mpls","MN",55444,8877,william,bill

I tried modifying your sed command and couldn't figure out how to make it work.

echo '"nelson,bill,jr","bill","123 Main St","Mpls","MN",55444,8877,william,bill' | nawk -f doCSV.awk

doCSV.awk:

BEGIN { FS=SUBSEP; OFS="|" }

{
  result = setcsv($0, ",")
  print
}

function setcsv(str, sep, i) {
  gsub(/""/, "\035", str)
  gsub(sep, FS, str)

  while (match(str, /"[^"]*"/)) {
    middle = substr(str, RSTART+1, RLENGTH-2)
    gsub(FS, sep, middle)
    str = sprintf("%.*s%s%s", RSTART-1, str, middle,
      substr(str, RSTART+RLENGTH))
  }

  if (index(str, "\"")) {
    return ((getline) > 0) ? setcsv(str (RT != "" ? RT : RS) $0,
sep) : !setcsv(str "\"", sep)
  } else {
    gsub(/\035/, "\"", str)
    $0 = str

    for (i = 1; i <= NF; i++)
      if (match($i, /^"+$/))
        $i = substr($i, 2)

    $1 = $1 ""
    return 1
  }
}

Nelson,
Per my previous reply, your sample led me to believe the following:
1) Every non-numeric field would be surrounded by double quotes.
except for "william".
2) Every numeric field would not be surrounded by double quotes.
If the specs give reason to question, no final solution will be found.

Shell,
this is a typical CSV conversion task - you cannot make any assumptions of what type of FIELD you're dealing with.
The only assumptions you can make are:

  1. This is a CSV-type file with ',' as a field separator
  2. There can be embedded ',' characters (plural) in any field
  3. some fields might be quoted with double-quotes

Sorry Shell, I tried to come up with every situation in my head, but missed that one scenario. I'm trying to come up with a solution without having an actual test file from my customer yet, trying to be proactive.

Also, that last reply using nawk worked great. Thanks for all your help.