fixed record length

george · March 27, 2006, 4:34am

hello!

I have a file with fixed record length...

format:

123445asdfg 4343777 sfgg

I wanna convert it to

123445,asdfg ,4343,777 ,sfgg

is there any way to do it?

sed/grep/awk??

at the moment I use sed -e 's_ \([^ ]\)_,\1_g'
but it works only if there are spaces between records...

any idea how to deal with this??

thanks for any help.

gauravgoel · March 27, 2006, 4:46am

It seems you know awk,
so you can use awk's substr function;
see man awk for more details.

Gaurav

Klashxx · March 27, 2006, 4:59am

echo "123445asdfg 4343777 sfgg"|awk '{print substr($1,1,6)","substr($1,7,11)" ,"substr($2,1,4)","substr($2,5,7)" ,"$3}'

123445,asdfg ,4343,777 ,sfgg
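A note on the command above: the third argument of awk's substr() is a length, not an end position. substr($1,7,11) only returns "asdfg" because the field runs out after five characters. A minimal sketch of the same split with explicit lengths follows; the function name split_sample is made up for illustration, and the lengths are read off the one sample line in this thread.

```shell
# Hypothetical helper (name is illustrative): split the sample record
# using explicit lengths. substr(s, start, length) is 1-based and its
# third argument is a length, not an end column.
split_sample() {
    awk '{ print substr($1,1,6) "," substr($1,7,5) " ," substr($2,1,4) "," substr($2,5,3) " ," $3 }'
}

echo "123445asdfg 4343777 sfgg" | split_sample
```

This still depends on the blanks between $1, $2 and $3, so it only fits data laid out like the sample.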

george · March 27, 2006, 5:06am

thanks a lot!

is there a more generic way to do it?
instead of $1/$2... to parse the whole line?
because this way I will have to change the script each time I use it for another file.

thanks for any advice.

gauravgoel · March 27, 2006, 6:23am

If you are sure that the number of characters in the string is going to be constant, then you can first use sed to remove the blank spaces from it and then use the awk substr() function to print the required output.

post back if still some confusion
Gaurav
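A rough sketch of what this suggestion could look like, assuming blanks never belong to the data. The function name strip_and_split and the column positions are assumptions taken from the sample line in the first post, not from a real file layout.

```shell
# Sketch: remove every blank first, then cut the single remaining field
# by position. Only valid when blanks are never part of a field value.
strip_and_split() {
    sed 's/ //g' |
        awk '{ print substr($1,1,6) "," substr($1,7,5) "," substr($1,12,4) "," substr($1,16,3) "," substr($1,19) }'
}

echo "123445asdfg 4343777 sfgg" | strip_and_split
```

Note that the padding blanks are gone from the result, and a field that legitimately contains a blank would shift every column after it.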

george · March 27, 2006, 6:41am

it doesn't solve the problem...

in each file I have to change
substr parameters
instead of substr($1,7,11)" ,"substr($2,1,4)","substr($2,5,7)"

I want substr($the_whole_line,7,11),substr($the_whole_line,12,17)

Is that possible?

thanks.

george · March 27, 2006, 6:43am

fields in a record (one is, let's say, 30 chars, another is 40) may contain spaces...

Klashxx · March 27, 2006, 7:31am

# IFS= and -r keep leading blanks and backslashes intact, so column positions do not shift
while IFS= read -r LINE
do
   echo |awk -v LIN="${LINE}" '{print substr(LIN,1,6)","substr(LIN,7,5)" ,"substr(LIN,13,4)","substr(LIN,17,3)" ,"substr(LIN,21,4)}'
done < INPUT

george · March 27, 2006, 7:45am

thanks!!!
it rocks

how can I use it like this:
instead of
echo |awk -v LIN="${LINE}"
use plain $0
'{print substr($0,1,6)","substr($0,7,5)" ,"...

I guess the problem will be with the field separator -F" ? "

any idea??

thanks!!

Klashxx · March 27, 2006, 7:50am

try

awk -v LIN="${0}"  ........etc

Bye.

george · March 27, 2006, 8:22am

I get some strange results!

awk -v LIN="${0}" '{print substr(LIN,1,6)","substr(LIN,7,5)" ,"substr(LIN,13,4)","substr(LIN,17,3)" ,"substr(LIN,21,4)}' < file.txt > result.txt

why?
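A likely explanation: "${0}" sits in double quotes on the shell command line, so the shell expands it before awk starts, and in the shell $0 is the name of the shell or script. LIN then holds that name for every input line. Inside a single-quoted awk program, $0 already means the whole input line, and substr() on $0 does not depend on the field separator, so no -F is needed. A minimal sketch (the function name split_line is made up for illustration):

```shell
# Sketch: let awk read the input itself and slice $0, the whole line.
# The awk program is single-quoted, so the shell leaves $0 untouched.
split_line() {
    awk '{ print substr($0,1,6) "," substr($0,7,5) " ," substr($0,13,4) "," substr($0,17,3) " ," substr($0,21,4) }' "$@"
}

echo "123445asdfg 4343777 sfgg" | split_line
```

With a file it would be called as split_line file.txt > result.txt, which also avoids starting one awk process per line.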

gauravgoel · March 28, 2006, 12:39am

What I meant there is that if the number of characters excluding spaces is constant, then you can first use sed to remove the spaces.
After that use
awk '{print substr($1,7,11) substr($1, m,n)........}' filename

Once you remove the blank spaces, there will be only one field in each record, so you can easily use substr on it.

Gaurav

george · March 28, 2006, 2:32am

<what I meant there is that if the no. of characters excluding spaces is constant

including spaces is constant

gauravgoel · March 28, 2006, 3:04am

Can you post example data, with at least two different formats and the expected output in each case?

george · March 28, 2006, 4:11am

000000382019-0001-01john stewart 056628826 //..more fields
000000589219-0000-21joe warren //...more fields

should be :

0000003820,19-0001-01,john stewart ,056628826 //...more fields
000000,5892,19-0000-21,joe warren ,//...more fields

000000032301-0001-21karla 112 01

000000032601-0004-21admin 15.0 01

should be:

0000000,323,01-0001-21,karla ,112 ,01 ///...more fields

0000000,326,01-0004-21,admin ,15.0 ,01 ///...more fields

gauravgoel · March 28, 2006, 4:31am

If the data format can vary that much, you will need to modify your script for every file; I can't see how this could be made more generic.
The only thing is that you can use $0 to parse the whole line, but you will always have to change the other parameters of the substr function.

Gaurav

george · March 28, 2006, 4:41am

yes, I know... $0 and change substr(...,...,...)....

but it's ok
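One way to keep the per-file change down to a single argument is to pass the column widths as a list and let awk walk along $0. This is only a sketch: the function name fixed_to_csv and the widths "6 6 4 4 4" are assumptions based on the first sample line in this thread, and fields that themselves contain commas are not quoted.

```shell
# Hypothetical helper: fixed_to_csv "WIDTHS" [file...]
# WIDTHS is a space-separated list of column widths, e.g. "6 6 4 4 4".
fixed_to_csv() {
    widths=$1
    shift
    awk -v widths="$widths" '
    BEGIN { n = split(widths, w, " ") }   # w[1..n] holds the widths
    {
        out = ""
        pos = 1                           # 1-based start of the current column
        for (i = 1; i <= n; i++) {
            if (i > 1) out = out ","
            out = out substr($0, pos, w[i])
            pos += w[i]
        }
        print out
    }' "$@"
}

echo "123445asdfg 4343777 sfgg" | fixed_to_csv "6 6 4 4 4"
```

For another file only the width list changes, for example fixed_to_csv "10 10 20 9" other.txt (widths invented here). GNU awk also has a built-in FIELDWIDTHS variable for fixed-width input, if gawk is available.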