Converting a CSV file to a flat file

rohit_shinez · September 19, 2018, 11:47pm

Hi All,

I have a CSV file which is comma separated. I need to convert it to a flat file with a preferred length for each column.

country,id
Australia,1234
Africa,12399999

Expected output

country id
Australia 1234
Africa 12399999

The flat file should have a predefined length for each column. Here country occupies columns 1-20 and id occupies columns 21-30.

Don_Cragun · September 20, 2018, 12:27am

Given that the sample output you provided is not formatted at all like you said you wanted and, instead, just changes the <comma>s in your CSV file to <space>s, I don't know whether you want the input data to be left-justified in the output or right-justified.

Assuming that you want left-justified text in your twenty character and ten character output fields, two (of many) simple ways to do what you want include an awk script:

awk -F, '{printf("%-20.20s%-10.10s\n",$1,$2)}' file.csv

and a shell while read loop:

while IFS=, read -r c i
do	printf '%-20.20s%-10.10s\n' "$c" "$i"
done < file.csv

If you want right-justified text instead of left-justified text, remove both of the minus signs from the printf format strings in either of the above scripts.
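To see what the minus sign and the precision actually do, here is a minimal sketch. The helper name fmt_field, the ten-character width, and the | markers around each field are only there for illustration and are not part of the scripts above.

```shell
#!/bin/sh
# Hypothetical helper: format one value into a 10-character field.
# $1 = "left" or "right", $2 = the value to format.
fmt_field() {
    case $1 in
        left)  printf '|%-10.10s|\n' "$2" ;;  # minus flag: pad on the right
        right) printf '|%10.10s|\n'  "$2" ;;  # no flag: pad on the left
    esac
}

fmt_field left  1234           # |1234      |
fmt_field right 1234           # |      1234|
fmt_field left  123456789012   # |1234567890|  (the .10 precision truncates)
```

The width (the number before the dot) sets the minimum field size and the precision (the number after the dot) sets the maximum number of characters copied, so using the same number for both gives a field of exactly that size.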

rohit_shinez · September 20, 2018, 12:38am

Thanks Don. Is there a way I can predefine the column lengths in the shell? Here there are only two columns. What if more columns are added and the lengths need to be defined accordingly instead of being hardcoded every time?

country,id,role
Australia,1234,engineer
Africa,12399999,doctor

The role column will occupy columns 31-35.

Don_Cragun · September 20, 2018, 12:58am

Of course you can do that. But you'll need to be much clearer about what you mean by "the length needs to be defined accordingly". I'm not at all clear about how you want to fit eight characters from engineer or six characters from doctor in a five character output field. (And you still haven't told us whether you want left-justified or right-justified text in fields that do not completely fill the output field.)

Note that in awk you can use length($1) to get the number of characters contained in field 1 on the current input line, and in shell you can use ${#c} to get the number of characters contained in the shell variable c. That should be everything you need to determine how many characters are in your widest input fields and to produce a format string that will give you appropriately sized output fields.
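As a rough sketch of that idea, assuming you want each column sized to its widest value plus one space of padding and left-justified (neither of which has been specified in this thread), something like the following could work. The function name auto_width is made up for this example, and it only handles simple CSV with no quoted commas.

```shell
#!/bin/sh
# Hypothetical helper: read simple CSV on stdin, size each output column to
# its widest value plus one space, and print left-justified fields.
auto_width() {
    awk -F, '
    {
        line[NR] = $0                       # keep every record for the END pass
        for (i = 1; i <= NF; i++)           # remember the widest value per column
            if (length($i) > w[i]) w[i] = length($i)
        if (NF > nf) nf = NF                # remember the largest field count
    }
    END {
        for (r = 1; r <= NR; r++) {
            n = split(line[r], f, ",")
            out = ""
            for (i = 1; i <= nf; i++)       # build a format such as "%-10s" per column
                out = out sprintf("%-" (w[i] + 1) "s", (i <= n ? f[i] : ""))
            print out
        }
    }'
}

auto_width <<'EOF'
country,id
Australia,1234
Africa,12399999
EOF
```

Because the widths are only known after the whole file has been read, the records are held in memory until the END block. For very large files, reading the file twice would be the alternative.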

Why don't you try clearly specifying your output requirements, and try to write a corresponding awk or shell (or both) script to produce the output you want. If you can't make it work, show us where you get stuck.

rohit_shinez · September 20, 2018, 2:57am

Basically, the text file has fixed-length fields: country in columns 1-20, id in columns 21-25, and role in columns 26-30.
Country - will accept only 20 characters
id - 5 characters
role - 5 characters

Don_Cragun · September 20, 2018, 3:54am

Great. I can't wait to see what you come up with to meet these requirements. How you will fit 12399999 into a 5 character id field and how you will fit engineer and doctor into a 5 character role field is beyond me. Please explain how you determined that two fields need to be 5 characters wide when they need to hold values that contain more than 5 characters.
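One way to see the problem before formatting anything is to list the values that will not fit. This is only a sketch: the function name check_widths is made up, and the widths 20, 5 and 5 are the ones from the requirements just posted.

```shell
#!/bin/sh
# Hypothetical helper: read "country,id,role" CSV on stdin and report every
# value that is longer than its field (20, 5 and 5 characters respectively).
check_widths() {
    while IFS=, read -r country id role
    do
        # ${#var} expands to the number of characters in var.
        if [ "${#country}" -gt 20 ]; then echo "country too long: $country"; fi
        if [ "${#id}" -gt 5 ]; then echo "id too long: $id"; fi
        if [ "${#role}" -gt 5 ]; then echo "role too long: $role"; fi
    done
}

check_widths <<'EOF'
country,id,role
Australia,1234,engineer
Africa,12399999,doctor
EOF
```

With the sample data this reports engineer, 12399999 and doctor, which are exactly the values that a %-5.5s format would silently truncate.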

And, of course, you still haven't specified justification for values that don't completely fill a field???

Please show us the code you've come up with to meet these requirements!

rohit_shinez · September 20, 2018, 4:08am

Sorry. For the example below, the lengths are fixed as follows:
Country - will accept only 20 characters
id - 20 characters
role - 20 characters

country,id,role
Australia,1234,engineer
Africa,12399999,doctor

Don_Cragun · September 20, 2018, 2:13pm

OK. With what I showed you in post #2 in this thread, what other information do you need to complete this task on your own? I thought I gave you everything you needed there for cases where the data fields do not overflow the specified field widths (or you want to truncate input data if it is too long to fit the output field) whether you want left-justified or right-justified output.

shamrock · September 20, 2018, 4:34pm

rohit_shinez:

Thanks Don. Is there a way I can predefine the column lengths in the shell? Here there are only two columns. What if more columns are added and the lengths need to be defined accordingly instead of being hardcoded every time?
country,id,role
Australia,1234,engineer
Africa,12399999,doctor
The role column will occupy columns 31-35.

You can use Don's example to cobble something together. And yes, you can change the printf specifier so that it takes variable-length fields instead of fixed-length ones that may exceed the maximum specified field width.

Hint: look up the "%.*s" printf specification.
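For anyone following along, here is one possible sketch of that hint, not a definitive solution. It assumes left-justified, truncated fields, simple CSV with no quoted commas, and a shell whose printf accepts * for width and precision (bash's builtin does; this is not guaranteed for every printf). The function name csv_to_flat and passing the widths as a space-separated list are choices made for this example.

```shell
#!/bin/sh
# Hypothetical helper: $1 is a space-separated list of field widths; CSV is
# read from stdin. Assumes printf accepts "*" for width and precision.
csv_to_flat() {
    widths=$1
    while IFS= read -r line
    do
        rest=$line
        for w in $widths
        do
            field=${rest%%,*}              # text before the first comma
            case $rest in
                *,*) rest=${rest#*,} ;;    # drop that field and its comma
                *)   rest= ;;              # no more fields on this line
            esac
            # First * takes the minimum width, second * the maximum characters.
            printf '%-*.*s' "$w" "$w" "$field"
        done
        printf '\n'
    done
}

csv_to_flat '20 20 20' <<'EOF'
country,id,role
Australia,1234,engineer
Africa,12399999,doctor
EOF
```

Adding a column then only means adding one more number to the width list, for example csv_to_flat '20 20 20 10' < file.csv, rather than editing the format string.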