Replace double quotes using awk

I have the below sample file each field seperated with comma - need to extract the 5th field which is "txt4 "(tst)"" and replacing the double quotes by doubling the quotes within that field i.e. "txt4 ""(tst)"""

Need to achieve this o/p using awk

sample csv file

"this is txt1","this is txt2",3,"this txt3","txt4 "(tst)""

desired o/p

"this is txt1","this is txt2",3,"this txt3","txt4 ""(tst)"""

Welcome back @reach2khan.

This is quite tricky. This might work if there are never commas inside double quotes, otherwise you would need something else..

awk '
  {
    for(i=1; i<=NF; i++) {       # for every comma-separated field:
      gsub(/^"|"$/,RS,$i)        # Replace the leading and closing double quote by a newline
      gsub(/"/,"&&",$i)          # Replace remaining quotes by 2 double quotes
    }
    gsub(RS,"\"")                # Turn the newlines back into the original double quotes on the whole line
    print                        # Print the line
  }
' FS=, OFS=, file
2 Likes

a somewhat similar technique (with the same disclaimer caveat) if you have one of the most recent gawk-s with the FPAT support . Adjusted based on the gawk manual example
Here's the test harness

BEGIN {
  qq="\""
  FPAT = "([^,]+)|(\"[^\"]+\")"
}
{
  print "NF = ", NF
  for (i = 1; i <= NF; i++) {
    gsub(qq "[^"qq"]+"qq qq, qq "&" qq, $i)
    printf("$%d = <%s>\n", i, $i)
  }
}
2 Likes

Hello

In following example, I use octal version of double quotes, aka '\042'
(very useful to deal with quotes.. and some other special caracters :wink: )

assuming your string(s) is/are in file 'file':

$ awk -F',' '{sub("\042[(]","\042\042(",$5);sub("[)]\042",")\042\042",$5);print}' file

"this is txt1" "this is txt2" 3 "this txt3" "txt4 ""(tst)"""

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.