Reading a Flat File - No Delimiters

Hi Folks,

I have a flat file without any delimiters. For example,

my raw data looks like: x25abcy26defz27ghi.....

Could anyone please help me write a program to split this into variables and create a text file? I want output like the following:

Name Age Number
x 25 abc
y 26 def
z 27 ghi

On Linux,

[ ~]$ echo 'x25abcy26defz27ghif33jjj' | sed 's/\([a-z]\)\([0-9][0-9]*\)\([a-z][a-z][a-z]\)/\1 \2 \3\n/g'
x 25 abc
y 26 def
z 27 ghi
f 33 jjj
 
[ ~]$ sed --version
GNU sed version 4.1.5

On HP-UX,

/home/->echo 'x25abcy26defz27ghif33jjj' | sed 's/\([a-z]\)\([0-9][0-9]*\)\([a-z][a-z][a-z]\)/\1 \2 \3\
/g'
x 25 abc
y 26 def
z 27 ghi
f 33 jjj
 
/home/->

Thanks, Ankel. However, I gave only sample data. In the real file the first column is a 16-digit card number, and so on. Your sed command only handles single-character fields.

Please help me with how to proceed.

Please tell us how to differentiate Name and Number.
Otherwise it will not be possible to parse these records.

Although the code below does not do the job, it should point you in the right direction once you can differentiate between Name and Number. Some regular expression will be required to sort this out.

echo x25abcy26defz27ghi | sed 's/\([a-zA-Z]*\)\([1-9][0-9]*\)\([a-zA-Z]*\)/\1,\2,\3\n/g'

As chihung said, you can easily modify the regexp to suit your requirement. Otherwise, show us the exact rule that determines how and when to split.

Sounds like you have records with fixed-width fields.
Do you know the widths of all the fields?

OK, here is the original record layout:

1-16 is a numeric cardno
17-33 is a numeric alternatecardno
34 is a string
35-47 is alphanumeric..

And so on: there are 15 columns with a total record length of 105. I know only the starting and ending positions of these 15 variables.

The second row follows without any delimiter, starting at position 106 of the file.

Let me know if you need further clarification.


Yes, VGersh, you are right. Please see my last reply and help me.

Again, what about the widths of the fields after the 4th field?
Or do you only need the first four fields from each 'row'?

Do you want something like this?

echo '12341234123412345678567856785678S881234rrrrr23E'| awk '{print substr($0,1,16),substr($0,17,16),substr($0,34,1),substr($0,35,13)}'

I don't think that any solution involving "sed" reading the file directly will work because there are no linefeed characters in the file.

# Assuming fixed-length records of 6 characters
# fold breaks the unbroken stream into one record per line
fold -w 6 filename | while IFS= read -r line
do
        part1=$(printf '%s\n' "${line}" | cut -c1)
        part2=$(printf '%s\n' "${line}" | cut -c2-3)
        part3=$(printf '%s\n' "${line}" | cut -c4-6)
        echo "${part1} ${part2} ${part3}"
done
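
The fixed record length is an assumption, so it may be worth checking it against the file size before folding. A minimal sketch, assuming the file holds nothing but whole records (no trailing newline); the helper name check_reclen is made up for illustration:

```shell
# check_reclen FILE LENGTH (hypothetical helper, not from this thread)
# Succeeds only if the size of FILE in bytes is a whole multiple of LENGTH.
check_reclen() {
        [ -r "$1" ] || return 2
        # wc may pad its count with blanks on some systems; strip them
        size=$(wc -c < "$1" | tr -d ' ')
        if [ $((size % $2)) -eq 0 ]; then
                echo "$((size / $2)) records of $2 bytes"
        else
                echo "size ${size} is not a multiple of $2" >&2
                return 1
        fi
}
```

For the real file this would be called as: check_reclen filename 105 && fold -w 105 filename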

Hi Anchal,

I have 15 columns and gave only a sample of 5 fields. I think your code will work for the first record, but what about the next record, and so on?

The file is a sequential file.

Use the forum's 'Search' capability and search for 'FIELDWIDTHS'.
One such thread is here.
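
For anyone who has not met it: FIELDWIDTHS is a GNU awk (gawk) variable that splits each record by column widths instead of by a separator. A minimal sketch, assuming gawk is installed (a stock HP-UX awk will not have it) and using the widths 1, 2 and 3 from the original sample; the function name split_fixed is made up for illustration:

```shell
# split_fixed (hypothetical name): read 6-character records, one per line,
# on stdin and print the three fields separated by spaces.
# Requires GNU awk; FIELDWIDTHS is a gawk extension.
split_fixed() {
        gawk 'BEGIN { FIELDWIDTHS = "1 2 3" }
              { print $1, $2, $3 }'
}

# The data has no linefeeds, so fold it into records first:
# printf 'x25abcy26def' | fold -w 6 | split_fixed
```

For the real layout the string would list all 15 widths, which only the original poster knows.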

Easier to use cut

echo 1234567890abcdef | cut -c1-6

However, if you have a lot of records to process, you should avoid forking a process per field. Use awk instead, which handles each line within a single process. See this example:

echo 1234567890abcdef | awk '
{
s4_6=substr($0,4,3)
print s4_6
}'
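
Putting the pieces of this thread together, here is a minimal sketch that folds the stream into records and lets a single awk process cut every field, so nothing is forked per line. It uses the 6-character sample layout (name 1, age 2-3, number 4-6). For the real file the fold width would be 105 and there would be one substr per column, at positions this sketch does not know. The function name flat_to_table is hypothetical.

```shell
# flat_to_table (hypothetical name): turn an unbroken stream of
# 6-character records on stdin into a space-separated table.
flat_to_table() {
        fold -w 6 | awk '
        BEGIN { print "Name Age Number" }      # header row
        length($0) == 6 {                      # skip a short trailing fragment
                # substr(string, start, length): length = end - start + 1
                print substr($0, 1, 1), substr($0, 2, 2), substr($0, 4, 3)
        }'
}

# Usage: flat_to_table < filename > output.txt
```

Since substr takes a start and a length, a field given as start-end positions needs length = end - start + 1.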

Why don't you provide at least one complete, unabbreviated record followed by exactly how it should be output? Also, take a moment to consider whether there are any special cases that would require special handling. If there are any, include one complete, unabbreviated record for each of them.

Please keep in mind, for future help requests, that it would have saved everyone (including yourself) a lot of time if you had done this from the start.

Regards,
Alister

Well said, alister. Posting the complete problem will usually lead to a quick solution, as experienced systems administrators know.

Btw & imho. Post #7 is gibberish. Writing software to process this data based on the (latest) information supplied is impossible.

I am extremely sorry for not providing the exact input. Since I work for an outsourcing company, I could not copy the data from the client location to my local system. However, I will try to provide a sample in the same format in my next reply.

Thanks to all for your hard work.