Working with CSV files whose values are enclosed in double quotes

I have a CSV file as shown below:

"1","SANTHA","KUMAR","SAM,MILLER","DEVELOPER","81,INDIA"
"2","KAPIL","DHAMI","ECO SPORT","DEVELOPER","82,INDIA"

The file is comma-delimited, and all the field values are enclosed in double quotes. But awk and cut treat a comma inside a quoted text field as a field separator, so that field gets split into separate fields.

For example:

awk -F',' '{print NF}' File

The above command reports 8 fields for the first record and 7 fields for the second record, but each record actually has 6.

How can I make it ignore a comma that is enclosed in double quotes?

It's easy with Perl:

perl -MText::ParseWords -nle'
  print parse_line(",",0, $_)+0;
  ' infile  

And with GNU awk >= 4:

awk '{ print NF }' FPAT='([^,]+)|("[^"]+")' infile

Thanks for your response. Could you please explain the awk command?

Read the thread "Remove the values from a certain column without deleting the Column name in a .CSV file" by RudiC (Shell Programming and Scripting, Unix Linux Forums) and adapt it to your needs.

If your awk is older than version 4, you may try this:

$ cat file
"1","SANTHA","KUMAR","SAM,MILLER","DEVELOPER","81,INDIA"
"2","KAPIL","DHAMI","ECO SPORT","DEVELOPER","82,INDIA"
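$ cat tester.sh
awk '{
    column = 0
    line = $0 ","                 # append a comma so every field ends with one
    while (line != "") {
        # match one quoted or unquoted field plus its trailing comma
        match(line, / *"[^"]*" *,|[^,]*,/)
        ++column
        line = substr(line, RSTART + RLENGTH)   # drop the field just counted
    }
    print column
}' file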
$ sh tester.sh 
6
6
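
The same loop can also pull out one column instead of counting them. Here is a minimal sketch, assuming the input looks like the sample above; the function name get_column and the variable names are made up for illustration, and it has only been reasoned through against the two sample records.

```shell
# get_column N: hypothetical helper, prints field N of each CSV record on stdin.
# Uses only POSIX awk features, so it should not need GNU awk.
get_column() {
    awk -v want="$1" '{
        line = $0 ","                 # every field now ends with a comma
        n = 0
        while (line != "") {
            match(line, / *"[^"]*" *,|[^,]*,/)
            field = substr(line, RSTART, RLENGTH - 1)   # field minus trailing comma
            line = substr(line, RSTART + RLENGTH)       # rest of the record
            if (++n == want) {
                gsub(/^ *"|" *$/, "", field)            # strip the enclosing quotes
                print field
                break
            }
        }
    }'
}

printf '%s\n' '"1","SANTHA","KUMAR","SAM,MILLER","DEVELOPER","81,INDIA"' | get_column 4
```

For the first sample record this should print SAM,MILLER, with the embedded comma kept and the quotes removed.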

There are also some ideas on page 2 of the thread "Remove the values from a certain column without deleting the Column name in a .CSV file" (Unix Linux Forums, Shell Programming and Scripting).

It's explained in the GNU awk manual; check the section Defining Fields By Content.
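
In short, FPAT describes what a field looks like rather than what separates the fields. The pattern has two alternatives: [^,]+ matches a run of characters containing no comma, and "[^"]+" matches a double-quoted string, which may contain commas. Below is a minimal sketch for seeing where a record gets split, assuming GNU awk 4 or later is installed as gawk; the function name show_fields is made up for illustration.

```shell
# show_fields: hypothetical helper, prints each field of each CSV record on stdin
# as "record.field: value".
# Assumes GNU awk >= 4; FPAT is a gawk extension and other awks ignore it.
show_fields() {
    gawk -v FPAT='([^,]+)|("[^"]+")' '{
        for (i = 1; i <= NF; i++)
            printf "%d.%d: %s\n", NR, i, $i
    }'
}

printf '%s\n' '"2","KAPIL","DHAMI","ECO SPORT","DEVELOPER","82,INDIA"' | show_fields
```

For this record it should print six lines, the last one being 1.6: "82,INDIA". Note that the quotes stay part of each field, and that a completely empty field (two adjacent commas) does not match the pattern as written, so it would not be counted.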