csv to xls : missing rows

A unix script generates a file "1.csv". I use the following to email this as an excel sheet.

/usr/bin/uuencode /tmp/1.csv 1.csv > $PATH/attachment.txt
mailx -r abc@domain.com -s "Subject" myself@domain.com < $PATH/attachment.txt

I get the file as CSV in the email and everything is fine except the following missing rows problem:

PROBLEM: Some fields in the CSV have a quotemarks ( " ) because of which some records are missing in the CSV in the email, though these rows are present on the csv file on the unix box.
Also, if the csv email attachment is opened in a notepad, those rows can be seen.

Is there a way of getting those missing rows in the excel, please?

Regards,
Girish.

csv is csv, you didn't actually convert it to xls before mailing it. excel simply understands how to read csv.

Show an example of the data that didn't work.

Hi Corona,

Here is an example:

1,Jack,a,cdef,23,w1e2r3t4,"Martin             
.
.
30,Howard,a,bjoa,28,1m0df5s9,Joel             
.
.

43,Sunil,a,xhst,45,x4z3v8c9,"Girish

Notice that the last column of the 1st and the last row have a single quote mark. I think because of which all the rows between these two rows are missing in the email attachment when it is opened as an excel sheet. however, they are seen if opened in a notepad.

Would stripping out quotation marks be sufficient, or do they have to be preserved?

tr -d "\"'" < input.csv > /tmp/$$.csv
# mail the $$.csv

Or if a backslash before the quote will prevent excel from swallowing them:

sed 's/"/\\"/g' < input.csv > /tmp/$$.csv

Thanks for the suggestions but I tried this already and it does work. But unfortunately I need to keep the quote mark as it is ! :frowning:

Maybe it only understands backslashes inside quotes...

awk -v FS="," -v OFS="," '{ gsub(/"/, "\\\""); for(N=1; N<=NF; N++) $N="\"" $N "\"" } 1' file.csv > /tmp/$$.csv

which will turn the field a into "a", and the field "a into "\"a".

sed 's/\"/\\\"/g' a.csv >> a_tmp.csv
mv a_tmp.csv a.csv

Using the above

aaa,aaa,aaaaaa,aaaaaaaaa,aaaaaaaaaaa,"aaaaaaaaaaa (in CSV)

is seen as the following in notepad

aaa,aaa,aaaaaa,aaaaaaaaa,aaaaaaaaaaa,\"aaaaaaaaaaa (in CSV)

And in the excel sheet also it looks like the following:

aaa,aaa,aaaaaa,aaaaaaaaa,aaaaaaaaaaa,\"aaaaaaaaaaa (in excel)

But I want the following in excel sheet:

aaa,aaa,aaaaaa,aaaaaaaaa,aaaaaaaaaaa,"aaaaaaaaaaa (in excel)

The problem is that it seems a property of excel to put everything between 2 double quotes in one field or if a field value bigins with a single double quote, it automatically ends that field with a double quote as well or puts all rows after it into one field.

I think I might have found the solution though. If so, I will post it shortly. thanks again for your help.

Did you try my idea using awk? excel responds differently when everything's quoted, I think...

Sorry, I thought the solution is to replace " by """" in CSV. But that seems to not be the solution as the excel also shows """". I think I might have to go to some excel forums as this seems to be an excel problem ?

---------- Post updated at 04:00 PM ---------- Previous update was at 03:44 PM ----------

I get the following error:

awk -v FS="," -v OFS="," '{ gsub(/"/, "\\\""); for(N=1; N<=NF; N++) $N="\"" $N "\"" } 1' a.csv
awk: syntax error near line 1
awk: bailing out near line 1

try nawk or gawk.