Generate and copy files in UNICODE format from Linux to Windows

Hi,
We have an interface that extracts data from database tables, spools the records into text files, and copies those files over SFTP to a Windows server location. The target system loads these files into its database using DTS packages in SQL Server.

The export is implemented as a shell script.

The problem we are facing is with the format of the files. The target system requires them to be in UNICODE format. Also, the line endings are not correct when a file is opened on a Windows machine.

The records did not get loaded because they are not separated by the newline sequence Windows expects. Without it, all records are treated as one, and only the first record is loaded into the database. On top of that, the files still need to be in UNICODE format.

Please let me know which specific commands or settings are needed in the shell script so that the files are generated in UNICODE and transferred in a form that can be used on the Windows server.

Thanks a lot for your help.

Use unix2dos or ux2dos on the files before you SFTP them. Look at the man pages for details.

Please note that the availability of these programs varies with the OS: unix2dos works on Solaris/Linux, while ux2dos works on HP-UX and AIX.

Hi,
Thanks for the information.

Can you please provide a sample script showing how to use the unix2dos program?

Again, you need to check the man page.
The usage is not the same on every OS.

I am on an HP-UX machine, where ux2dos writes to STDOUT:

ux2dos file > newfile

Use newfile for further processing (SFTP).

On Linux, as far as I remember, unix2dos file converts the file in place.
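A minimal sketch of the line-ending step on Linux. It assumes the unix2dos from the dos2unix package, where `-n infile outfile` writes the converted copy to a new file instead of converting in place; check `man unix2dos` on your system before relying on that option. The helper name `to_crlf` and the awk fallback are only illustrative.

```shell
#!/bin/sh
# to_crlf SRC DEST: write a copy of SRC with CRLF line endings to DEST.
# to_crlf is a hypothetical helper name used only for this example.
to_crlf() {
    src=$1
    dest=$2
    if command -v unix2dos >/dev/null 2>&1; then
        # -n leaves SRC untouched and writes the converted copy to DEST
        # (assumes the dos2unix-package unix2dos; see its man page).
        unix2dos -n "$src" "$dest"
    else
        # Fallback: drop any existing trailing CR, then end each line with CR LF.
        awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }' "$src" > "$dest"
    fi
}
```

It could be called as `to_crlf extract.txt extract_win.txt` (both file names are placeholders), with the second file being the one handed to SFTP.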

Hi,
I have used unix2dos to convert the files so that they have the proper line endings. However, the file format still shows as ASCII text when I check it with the file <filename> command.
Can you please help me in converting the files to UNICODE format?
Thanks

Hi,
Can anyone help with fixing the above issue?

Are you sure you are using text mode in sftp and not binary mode?

I have not specified the mode explicitly. What is the default mode?

After converting with unix2dos, did you try FTP-ing the file to Windows and opening it?

--ahamed

Yes, ahamed. I transferred the file with SFTP after converting it with unix2dos, but it still shows as ASCII when opened on the Windows server.

Hi,
Any pointers on how to spool files in Unicode format from a shell script?

Assuming you want to spool the files from a SQL database in Unicode, try this (disclaimer: I am no SQL expert and found this on the web, so YMMV). NLS_LANG is an Oracle client setting, so this only applies if the extract runs through an Oracle tool such as SQL*Plus. Set it in your shell before spooling the files, and make sure the shell's locale is set to en_US.UTF-8.

export NLS_LANG=AMERICAN_AMERICA.AL32UTF8

Assuming you mean UTF-8 (not UTF-16), plain 7-bit ASCII text is already valid UTF-8, which is why the file command still reports ASCII text. However, any special characters in your data (such as accented characters) would be represented by two or more bytes in UTF-8.

A text-mode (ASCII) FTP transfer may strip the high-order bit from bytes outside 7-bit ASCII. If your file contains special characters, it will need to be converted on the Unix server and then transferred in binary mode.
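On Windows, "Unicode" as a file format usually means UTF-16 little-endian with a byte order mark rather than UTF-8, so if the DTS package expects that, the conversion on the Unix side could look roughly like the sketch below. It assumes the spooled file is UTF-8 (or plain ASCII) and that iconv is installed; `to_win_unicode` is a made-up helper name, and whether the target really wants UTF-16LE should be confirmed with whoever owns the DTS package.

```shell
#!/bin/sh
# to_win_unicode SRC DEST: convert a UTF-8 text file to UTF-16LE with a
# byte order mark and CRLF line endings. The function name is hypothetical.
to_win_unicode() {
    src=$1
    dest=$2
    {
        # Byte order mark FF FE, written as octal escapes for portability.
        printf '\377\376'
        # Normalise line endings to CR LF first, then re-encode the whole
        # stream (including the CR and LF characters) as UTF-16LE.
        awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }' "$src" |
            iconv -f UTF-8 -t UTF-16LE
    } > "$dest"
}
```

The line endings are fixed before the re-encoding on purpose, so that no byte-oriented tool has to run over two-byte characters afterwards.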

My advice would be to get the database extract program to write the file in the correct format rather than mess about trying to fix it with Shell.

Many Linux systems use a UTF-8 locale by default, so text files are often already written as UTF-8. You don't mention which specific operating system you have.

Hi,
Thanks for the reply.
We are using sftp to transfer the files, so it will be a binary mode transfer.
Below are the details of the OS on which this script runs.
Linux ip-10-49-29-179 2.6.18-194.0.0.0.3.el5xen #1 SMP Mon Mar 29 18:27:00 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
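Because OpenSSH's sftp transfers files as-is, with no ASCII mode, the file arrives on Windows byte for byte as it was written on Linux, so it can help to check the bytes before uploading. A rough sketch, assuming OpenSSH's sftp and its `-b` batch option; `has_utf16le_bom`, `write_sftp_batch`, the remote directory, and the host are all hypothetical names.

```shell
#!/bin/sh
# has_utf16le_bom FILE: succeed if FILE starts with the bytes FF FE.
has_utf16le_bom() {
    [ "$(od -An -tx1 -N2 "$1" | tr -d ' \n')" = "fffe" ]
}

# write_sftp_batch REMOTE_DIR FILE...: print sftp batch commands to stdout.
# File names containing spaces would need quoting and are not handled here.
write_sftp_batch() {
    printf 'cd %s\n' "$1"
    shift
    for f in "$@"; do
        printf 'put %s\n' "$f"
    done
}

# Example use (host and paths are placeholders, not from this thread):
#   has_utf16le_bom extract_win.txt || echo "not UTF-16LE with BOM" >&2
#   write_sftp_batch /incoming extract_win.txt > upload.batch
#   sftp -b upload.batch user@windows-host
```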