Regular expression with shell script to extract data from a text file

Hi,
I am trying to extract some specific data from a text file using regular expressions in a shell script.

I am doing this with a multiline grep. The tool I am using is pcregrep, so that I get compatibility with Perl's regular expressions.

For sample data like this, I am trying to grab the details of each company, namely:

  • Company name
  • PO Box
  • Tel
  • Fax
  • Mobile
  • Company profile

into a .csv file.
I am new to regular expressions and to Linux too.
All I could manage to get was something like this:

\[\d*\][^\.]*[\(\d*\)\s\d*)]

Can anyone help me out with this, please?

How do you identify the company name? If it is the line that begins with a number in square brackets, then the line below doesn't seem to be a company name:

[75]Upgrade this free listing here

I had the same problem, buddy.
I have been trying regular expressions since yesterday, but in vain!
That's the reason I posted it here.

OK, does this seem to work for you?

$ awk '/^\[/ && !/Upgrade this free listing/ {print} /:$/ && !/Classification/ {printf "%s", $0; getline x; print x}' file
[58]Walid Chamoun Architects WLL
PO Box:55803, Doha, Qatar
Location:D-Ring Road, New Salata Shamail 40, Villa 340, Doha, Qatar
Tel:(00974) 44568833
Fax:(00974) 44568811
Mob:(00974) 44568822
[65]Al Ali Consulting & Engineering
PO Box:467, Doha, Qatar
Tel:(00974) 44360011
[69]Al Gazeerah Consulting Engineering
PO Box:22414, Doha, Qatar
Tel:(00974) 44352126
[73]Al Murgab Consulting Engineering
PO Box:2856, Doha, Qatar
Tel:(00974) 44448623
$ 
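
For anyone trying to reproduce this: the command above only prints a label and its value together if the label sits on its own line ending in a colon, with the value on the next line. That layout is my reading of the command, not something confirmed against the real file. Here is a minimal sketch with a made-up input in that layout (the "Classification:" and "Architects" lines and the function names sample and extract are hypothetical, for illustration only):

```shell
#!/bin/sh
# Hypothetical input: each label is on its own line ending in ":",
# and its value is on the following line. This layout is an assumption.
sample() {
  cat <<'EOF'
[58]Walid Chamoun Architects WLL
Classification:
Architects
PO Box:
55803, Doha, Qatar
Tel:
(00974) 44568833
[75]Upgrade this free listing here
EOF
}

extract() {
  awk '
    # Company name: starts with "[" but is not the upgrade advert.
    /^\[/ && !/Upgrade this free listing/ { print }
    # Label line: join it with the value read from the next line.
    /:$/ && !/Classification/ {
      label = $0
      if ((getline value) > 0) print label value
    }
  ' "$@"
}

sample | extract
```

If your file keeps the label and value on the same line already, the second rule never matches and the output will look different, which may explain a mismatch between two people running the same command.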

The data looks good, but when I tried the awk command you posted, it gave me only PO Box numbers, fax numbers and phone numbers.

Something like this!
And I need a separator so that I can save it as a CSV file.

And this is the file I am working on.
Check the attachment.

I used the sample you posted above. Are you sure you are using a file with valid patterns?
For the company name, I used the trick of matching lines that begin with "[" and skipping the "Upgrade this free listing" line.

The PO Box fields already contain commas, so you must use some other separator. But that is for later; first we need to get the correct output.
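
On the separator question, one alternative to switching separators is to wrap every field in double quotes, so the commas inside the PO Box value stay within a single CSV column. This is a rough sketch of that idea, not a finished solution. It assumes the input is the "Label:value" output shown earlier in the thread, with each company starting at a line that begins with "[". The function name to_csv and the column order are my own choices, and the company profile is left out because it does not appear in that output.

```shell
#!/bin/sh
# Sketch: collapse "Label:value" lines into one quoted CSV row per company.
# Assumption: a line starting with "[" opens a new company record.
to_csv() {
  awk '
    # Quote one field, doubling any embedded double quotes.
    function q(s) { gsub(/"/, "\"\"", s); return "\"" s "\"" }

    # Print the current company as a row, then clear its fields.
    function emit() {
      if (name != "")
        print q(name) "," q(f["PO Box"]) "," q(f["Tel"]) "," q(f["Fax"]) "," q(f["Mob"])
      split("", f)
    }

    # New company: flush the previous one, strip the leading "[NN]".
    /^\[/ { emit(); name = $0; sub(/^\[[0-9]+\]/, "", name); next }

    # Any other line: split at the first ":" into label and value.
    {
      i = index($0, ":")
      if (i > 0) f[substr($0, 1, i - 1)] = substr($0, i + 1)
    }

    END { emit() }
  ' "$@"
}
```

Usage would be something like `to_csv extracted.txt > companies.csv`, where extracted.txt is a hypothetical file holding the output of the first step. Missing fields such as Fax or Mob come out as empty quoted strings, so every row keeps the same number of columns.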

EDIT: OK, I saw the next post. I will check the attachment.