Help: shell script to parse a data file

I am faced with a tricky problem parsing a data file (it may not be tricky for the scripting gurus).

Here is what I am faced with. I have a file with multiple rows of data, and the rows are not of fixed length. "|" is used as the delimiter between columns, and each row of data has 5 columns.

I have tried `cat data.out` and then looping through the output to read the lines, but since the data contains spaces, I cannot read an entire line at a time, so I fail to read the individual records.

How can I parse this data file and extract the individual records? I have attached the data file to this post and also copied a few records from it below (at the end of this post).

Any help or suggestions are highly appreciated.

Thank you,
Ajay.

|test.txt

                         |2008-7-2.19.19. 0. 

162000000|/default/main/administration/STAGING/Configuration/TeamSite/local/config

                                  |/opt/iw-home/TeamSite/local/config                                         
                                                                                                              
                                                               |34                    |

|test.txt

   |2008-7-2.19.13. 29. 529000000|/default/main/administration/STAGING/Configuration/TeamSite/local/config    
                                                                                                              
                                                              |/opt/iw-home/TeamSite/local/config             
                                                                                                              
                                                                                           |23                
|

|test.txt

                               |2008-7-2.19.23. 5. 

692000000|/default/main/administration/STAGING/Configuration/TeamSite/local/config

                                  |/opt/iw-home/TeamSite/local/config                                         
                                                                                                              
                                                               |11                    |

|test.txt

   |2008-7-3.10.30. 42. 912000000|/default/main/administration/STAGING/Configuration/TeamSite/local/config    
                                                                                                              
                                                              |/opt/iw-home/TeamSite/local/config             
                                                                                                              
                                                                                           |16                
|

|New File.txt

                                   |2008-7-2.19.19. 0. 

162000000|/default/main/administration/STAGING/Configuration/TeamSite/local/config

                                  |/opt/iw-home/TeamSite/local/config                                         
                                                                                                              
                                                               |22                    |

|New File.txt

       |2008-7-2.19.13. 29. 

529000000|/default/main/administration/STAGING/Configuration/TeamSite/local/config

                                  |/opt/iw-home/TeamSite/local/config                                         
                                                                                                              
                                                               |24                    |

I don't know whether the data in your file really consists of multiple rows (I mean rows of the 5 columns you are looking for), because when I opened the file, it contained a lot of spaces.
Below is what I did to parse the file and convert it to comma-delimited output. You can do whatever processing you want from there.

awk -F'|' 'gsub(/ /,""){print $2 ","$3 "," $4 ","$5 "," $6}' testout1.txt

One of many ways to remove spaces:

 tr -d " " < testout > testout1

The above command removes all spaces from testout and writes the result to a new file, testout1.
Go ahead with any kind of processing from there.
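
If the processing is to be done in the shell itself, the records can be read field by field instead of looping over the output of `cat`. This is only a sketch: it assumes each record sits on a single line, and the sample file and the variable names (`lead`, `name`, `stamp`, `src`, `dest`, `count`, `rest`) are made up for illustration, modelled on the posted data after the spaces were removed.

```shell
#!/bin/sh
# Hypothetical sample: two records, one per line, spaces already removed with tr.
sample=$(mktemp)
cat > "$sample" <<'EOF'
|test.txt|2008-7-2.19.19.0.162000000|/default/main/administration/STAGING/Configuration/TeamSite/local/config|/opt/iw-home/TeamSite/local/config|34|
|test.txt|2008-7-2.19.13.29.529000000|/default/main/administration/STAGING/Configuration/TeamSite/local/config|/opt/iw-home/TeamSite/local/config|23|
EOF

# IFS='|' makes read split on the delimiter only; -r keeps backslashes literal.
# Each line starts with '|', so the first field is empty and lands in "lead".
out=$(while IFS='|' read -r lead name stamp src dest count rest; do
    printf 'name=%s stamp=%s count=%s\n' "$name" "$stamp" "$count"
done < "$sample")

printf '%s\n' "$out"
rm -f "$sample"
```

Reading with `while read` takes one whole line at a time, whereas a loop of the form `for x in $(cat file)` splits on every blank, which may be what went wrong in the original attempt.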

Removing every blank might not be desirable, because some values might contain blanks. As a first step I would remove only the blanks immediately before and after the delimiter characters:

sed 's/ *|/|/g;s/| */|/g' file > file.new
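
To illustrate the effect of the sed command above, here is a small sketch run against a single made-up record. It assumes the record sits on one line; the sample line is hypothetical, modelled on the posted data. The second step, printing the five columns comma-separated with awk, is just one possible follow-up.

```shell
#!/bin/sh
# Hypothetical one-line record modelled on the posted data.
record='|New File.txt     |   2008-7-2.19.19. 0. 162000000|/default/main/administration/STAGING/Configuration/TeamSite/local/config   |/opt/iw-home/TeamSite/local/config     |22      |'

# Strip blanks before and after each delimiter; blanks inside a value survive.
cleaned=$(printf '%s\n' "$record" | sed 's/ *|/|/g;s/| */|/g')
printf '%s\n' "$cleaned"

# Fields 2 to 6 are the five data columns; field 1 is empty because the
# line starts with the delimiter.
csv=$(printf '%s\n' "$cleaned" | awk -F'|' '{print $2 "," $3 "," $4 "," $5 "," $6}')
printf '%s\n' "$csv"
```

The blank in "New File.txt" and the blanks inside the timestamp are kept, while the padding around the delimiters is gone.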

I hope this helps.

bakunin