Converting Tab delimited file to Comma delimited file in Unix

Hi,

Can anyone let me know on how to convert a Tab delimited file to Comma delimited file in Unix

Thanks!!

You can use sed to convert the Tab delimited file to a comma delimited file

/export/home/test/mona>cat tab_del.dat
1       2       3       4       5
6       7       8       9       10
/export/home/test/mona>sed 's/        /,/g' tab_del.dat
1,2,3,4,5
6,7,8,9,10

Redirect the output of this to another file and rename it.

When i use

/export/home/test/mona>sed 's/ /,/g' tab_del.dat

I m getting like this..
1,, 2 3 4 5,
6,, 7 8 9 10,

Where i need like this
1,2,3,4,5
6,7,8,9,10

Plz help me out on this..

Thanks!!

Hi,

In the below code, the integers are seperated by tab and not multiple spaces.

The same way in the sed command you should give a tab and not multiple spaces. I think problem occurs when you copy the below contents and run. Don't copy the contents, form the data file of your own and run the sed command.

Let me know if you still face any problem

/export/home/test/mona>cat tab_del.dat
1press the tab key here2       3       4       5
6       7       8       9       10
/export/home/test/mona>sed 's/        /,/g' tab_del.dat
1,2,3,4,5
6,7,8,9,10

Use tr

tr '\t' ',' < tab-delimited-file > comma-delimited-file

i think it is simpler to convert using tr. You can use sed but sed will not recognize escape sequences like \n,\t.. but all these will be accpeted in tr.

Great going Vino

Vino..No Luck..

Let me tell u my requirement..

I have a file dmlog.txt with data like this

LOAD_STAGE STAGE_LOAD COMPLETE U0 10-OCT-05
LOAD_STAGE_A D_CA_PRODUCT COMPLETE U0 10-OCT-05
LOAD_STAGE_B D_CB_PRODUCT COMPLETE U0 10-OCT-05

Now i need to convert it to

LOAD_STAGE,STAGE_LOAD,COMPLETE,U0,10-OCT-05
LOAD_STAGE_A,D_CA_PRODUCT,COMPLETE,U0,10-OCT-05
LOAD_STAGE_B,D_CB_PRODUCT,COMPLETE,U0,10-OCT-05

Can u explain me on how to proceed with this Example.

Its a Tab Space between data in dmlog.txt

LOAD_STAGE<tab><tab>STAGE_LOAD<tab>COMPLETE<tab>U0<tab>10-OCT-05

to be converted to

LOAD_STAGE,STAGE_LOAD,COMPLETE,U0,10-OCT-05

Help me out in this...

Thanks!!

Is there an extra tab between LOAD_STAGE and STAGE_LOAD.

What output you are getting with the tr command?

Can you please try the below command,

cat -vet will display all the control characters. If you notice, there are two ^I between the first and second column. The other columns will have only one ^I.

^I represents a tab here.

/export/home/test/mons>cat -vet tab_del
LOAD_STAGE^I^ISTAGE_LOAD^ICOMPLETE^IU0^I10-OCT-05$
LOAD_STAGE_A^I^ID_CA_PRODUCT^ICOMPLETE^IU0^I10-OCT-05$
LOAD_STAGE_B^I^ID_CB_PRODUCT^ICOMPLETE^IU0^I10-OCT-05$

#sed -e 's/<press two tab in the below command>/<press one tab here>/' -e 's/<press one tab here>/,/g' 
/export/home/test/mons>sed -e 's/               /       /' -e 's/       /,/g' tab_del
LOAD_STAGE,STAGE_LOAD,COMPLETE,U0,10-OCT-05
LOAD_STAGE_A,D_CA_PRODUCT,COMPLETE,U0,10-OCT-05
LOAD_STAGE_B,D_CB_PRODUCT,COMPLETE,U0,10-OCT-05

Assuming that your delimiters are only tabs

sed -e 's_\(\t\)\{1,\}_,_g'

Guys..No Luck..

In my txt file dmlog.txt , Consider 100 records where with in a record
fields are separated with tab spaces. In some records fields may be 2 tabs
separated and in some 1 tab separated.I believe we need to convert all the
2 tab spaces into 1 tab in all the records then convert that single tab space into comma (,).

Kindly help me in this..

The above works well for me.

[/tmp]$ cat tab
1               2       3       4       5       6
7       8       9       10
11              12                      13
[/tmp]$ cat -t tab
1^I^I2^I3^I4^I5^I6
7^I8^I9^I10
11^I^I12^I^I^I13
[/tmp]$ sed -e 's_\(\t\)\{1,\}_,_g' tab
1,2,3,4,5,6
7,8,9,10
11,12,13
[/tmp]$ 

All those empty spaces you see are tabs.

Your way

tr -s '\t' < dmlog.txt | tr '\t' ','

The below sed command does that. Did you try this?

sed -e 's/               /       /' -e 's/       /,/g' tab_del

you can try this if the tabs are going to 1 or more than 1 between fields...


sed 's/        [         ]*/,/g' tab_del

after s/.... there is <tab> ( not spaces ) and within [] there is one more tab ( not spaces )....

this will replace any number of tabs to single comma...

Guys...

Could u people let me know on how to find whether
Spaces between the fields in a record is a "Tab" or "Multiple Spaces".

/export/home/test/mons>cat -vet hh
adasd^Isdfsdf^I^Isdfsdfdsf$
dsfdsf sdf sdfsd sfsdfsdf$

cat -vet will display all the control characters

^I represents tab

and space is shown as " " only.

when i issue cat dmlog.txt i get

ANALYZE DLS_DM COMPLETE GE 24-OCT-05
LOAD_STAGE STAGE_LOAD COMPLETE C0 16-DEC-05

when i issue cat -vet dmlog.txt i get

ANALYZE DLS_DM COMPLETE GE 24-OCT-05 $
LOAD_STAGE STAGE_LOAD COMPLETE C0 16-DEC-05 $

Is that mean only spaces are there b/w fields ?? I dont see any ^I characters..

Here is what i get...

when i issue cat dmlog.txt i get

ANALYZE DLS_DM COMPLETE GE 24-OCT-05
LOAD_STAGE STAGE_LOAD COMPLETE C0 16-DEC-05

when i issue cat -vet dmlog.txt i get

ANALYZE DLS_DM COMPLETE GE 24-OCT-05 $
LOAD_STAGE STAGE_LOAD COMPLETE C0 16-DEC-05 $

Is that mean only spaces are there b/w fields ?? I dont see any ^I characters..

If so,

How to convert those spaces into comma ??