Hi,
Can you please help me?
How can I parse a comma-delimited file whose fields are optionally quote-delimited?
sample.dat
----------
"I",+2007,"SANDA, 20, MARTIN PLACE","SANDA 20MARTIN"
"D",+2008,"RANDA, 22, MARTIN PLACE","RANDA 22MARTIN"
Thank you.
Ram
If you have Python and can use it, here's an alternative:
#!/usr/bin/env python3
import csv
# newline="" lets the csv module handle line endings itself
with open("file.csv", newline="") as f:
    for row in csv.reader(f):
        print("|".join(row))
        # do whatever you want with row here
output:
# python test.py
I|+2007|SANDA, 20, MARTIN PLACE|SANDA 20MARTIN
D|+2008|RANDA, 22, MARTIN PLACE|RANDA 22MARTIN
Hi,
Thank you for the solution.
But, I would like to know how to do it in a shell script.
Regards,
Ram
sed 's/\",/|/g' <filename> | sed 's/\"//g'
I'm not sure about this; I have not tried it. Tell me if it works.
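Perhaps change the field separator to something else, e.g. tab:
awk '{
    for (i = 1; i <= length($0); i++) {
        x = substr($0, i, 1)
        # c counts the quotes seen so far; an even count means we are outside quotes
        if (x == "\"") {
            c++
        }
        if (x == "," && c % 2 == 0) {
            x = "\t"
        }
        # print via "%s" so a % in the data is not read as a format specifier
        printf "%s", x
    }
    printf ORS
}' sample.dat > sample.tsv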
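For comparison, here is a minimal Python sketch of the same quote-counting idea, in case it helps to see the logic outside awk. The function name commas_to_tabs is made up for illustration. Like the awk version, it keeps the quotes and only swaps the commas that fall outside them.

```python
def commas_to_tabs(line):
    """Replace commas that fall outside double quotes with tabs."""
    out = []
    quotes_seen = 0  # an even count means we are outside a quoted field
    for ch in line:
        if ch == '"':
            quotes_seen += 1
        if ch == "," and quotes_seen % 2 == 0:
            ch = "\t"
        out.append(ch)
    return "".join(out)


sample = '"I",+2007,"SANDA, 20, MARTIN PLACE","SANDA 20MARTIN"'
print(commas_to_tabs(sample))
```

The commas inside "SANDA, 20, MARTIN PLACE" are left alone because the quote count is odd at that point.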
There's a pretty detailed analysis of this problem in Friedl's book. It's somewhat more involved than Vijay's solution if you want to cover all the possible quirks of real-life CSV format. Here's one of the simpler approaches (from the first edition, sorry).
@fields = ();
while (m/"([^"\\]*(\\.[^"\\]*)*)",?|([^,]+),?|,/g) {
    push @fields, defined $1 ? $1 : $3;
}
push @fields, undef if m/,$/;
# @fields now contains parsed line of CSV
You could do the same in sed or awk, although Perl makes some parts of it easier.
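If Perl is not an option, the same pattern can be tried elsewhere. Below is a rough Python port of the regex above, offered as a sketch rather than a complete CSV parser. The names FIELD and split_csv_line are made up for illustration, and the inner group is made non-capturing so that the unquoted field lands in group 2 instead of Perl's $3.

```python
import re

# Quoted field (backslash escapes allowed), or unquoted field, or a lone comma
# (an empty field). Each alternative also swallows the comma that follows.
FIELD = re.compile(r'"([^"\\]*(?:\\.[^"\\]*)*)",?|([^,]+),?|,')


def split_csv_line(line):
    """Split one CSV line into fields; empty fields come back as None."""
    fields = []
    for m in FIELD.finditer(line):
        if m.group(1) is not None:
            fields.append(m.group(1))   # quoted field, quotes stripped
        else:
            fields.append(m.group(2))   # unquoted field, or None if empty
    if line.endswith(","):
        fields.append(None)             # trailing comma means one more empty field
    return fields


sample = '"I",+2007,"SANDA, 20, MARTIN PLACE","SANDA 20MARTIN"'
print("|".join(split_csv_line(sample)))
```

Strip the trailing newline from each line before passing it in, otherwise it ends up inside the last unquoted field.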
Vijay,
The following is the output.
$ sed 's/\",/|/g' sample.dat | sed 's/\"//g'
I|+2007,SANDA, 20, MARTIN PLACE|SANDA 20MARTIN
D|+2008,RANDA, 22, MARTIN PLACE|RANDA 22MARTIN
Not exactly what I expected: the comma after the unquoted field (+2007) is left unchanged.
Thank you.
Ram
Hi era,
Sorry, I can't use Perl.
Thank you.
Ram
Hi Ygor (Moderator),
Your solution works.
Thank you.