Need Suggestions to improve Perl script for checking malformed braces/brackets

Hi all,

I've written a Perl script that checks for and reports malformed braces. I have a UNIX ksh version, and it took a couple of minutes to run on a file of 10000+ lines. The Perl version took only about 20 seconds, which is incentive enough for me to switch to Perl, not to mention that I will need to run this on Windows later as well.

The script and the sample file testfile01x.ora are attached. I've renamed testfile01x.ora to testfile01x.txt because the forum does not accept the .ora extension.

Sample output of a run:

./check_malform.pl testfile01x.ora
Checking -> START = 1 :: END = 13 :: TNS = prod10.checks.com.ph = = MALFORMed !!!
Checking -> START = 14 :: END = 26 :: TNS = prod11.checks.com.ph = = OKAY
Checking -> START = 27 :: END = 39 :: TNS = prod20.checks.com.ph = = OKAY
Checking -> START = 40 :: END = 52 :: TNS = devd11.checks.com.ph = = MALFORMed !!!
Checking -> START = 53 :: END = 65 :: TNS = tstt11.checks.com.ph =  = OKAY
Checking -> START = 66 :: END = 78 :: TNS = devd1.checks.com.ph =  = OKAY
Checking -> START = 79 :: END = 87 :: TNS = quald_app.checks.com.ph = = OKAY
START  = Thu Jul 28 16:24:02 NZST 2011
FINISH = Thu Jul 28 16:24:02 NZST 2011

I'd like to know if anyone has any advice on how to improve the code. It works the way I want it to, but maybe there is a better way to do it.

A short explanation of what the script does:

  • The script is run as check_malform.pl testfile01x.ora, where testfile01x.ora is the file to check for malformed braces.
  • The script removes all lines that begin with # and all blank lines.
  • It selects all lines that begin with an alphabetic character and writes each entry's start and end line numbers to a tmp file.
  • It reads the tmp file, selects the corresponding portions of the input file, and places those lines in a second tmp file.
  • It checks the second tmp file and counts the number of open ( and close ) braces.
  • If the numbers of open and close braces do not match, the entry is reported as MALFORMED.
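
For comparison, the steps above can probably be folded into a single pass with no tmp files. The following is only a minimal sketch under assumptions: the sub name check_entries and the sample entries are made up for illustration (they are not the attached file), an entry is assumed to run from one line starting with a letter to the next such line, and START/END here are line numbers in the original file, comments and blanks included.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Check every entry for an equal number of '(' and ')' in one pass.
# Assumption: an entry starts at a line beginning with a letter and
# continues until the next such line (or end of input).
sub check_entries {
    my @lines = @_;
    my (@entries, $cur);
    my $n = 0;
    for my $line (@lines) {
        $n++;
        next if $line =~ /^\s*(?:#.*)?$/;      # skip comment and blank lines
        if ($line =~ /^([A-Za-z][^\s=]*)/) {   # a new entry begins here
            $cur = { name => $1, start => $n, end => $n, open => 0, close => 0 };
            push @entries, $cur;
        }
        next unless $cur;                      # ignore text before the first entry
        $cur->{end}    = $n;
        $cur->{open}  += ($line =~ tr/(//);    # tr returns the number of matches
        $cur->{close} += ($line =~ tr/)//);
    }
    $_->{ok} = ($_->{open} == $_->{close}) ? 1 : 0 for @entries;
    return @entries;
}

# Hypothetical sample data, for illustration only.
my @sample = (
    "# hypothetical sample\n",
    "good.example =\n",
    "  (DESCRIPTION =\n",
    "    (ADDRESS = (HOST = h1)(PORT = 1521))\n",
    "  )\n",
    "\n",
    "bad.example =\n",
    "  (DESCRIPTION =\n",
    "    (ADDRESS = (HOST = h2)(PORT = 1521)\n",
    "  )\n",
);

for my $e (check_entries(@sample)) {
    printf "Checking -> START = %d :: END = %d :: TNS = %s :: %s\n",
        $e->{start}, $e->{end}, $e->{name}, $e->{ok} ? 'OKAY' : 'MALFORMED';
}
```

For a real file, the lines could be read with open and the usual <$fh> loop and passed in the same way. Like the original script, this only compares counts, so it would not notice a ) that comes before its (.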

The main improvement I am hoping for is to be able to read specific lines of a file, as the UNIX sed command can. Removing the blank lines and comments seems unnecessary, but I did it anyway to rid the file of things I am not interested in.
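
One way to get sed-like line selection (roughly sed -n 'M,Np') is Perl's built-in $. variable, which holds the line number of the filehandle most recently read. A minimal sketch follows; the sub name lines_between is hypothetical, and the in-memory file just stands in for a real one.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return lines $from..$to (1-based, inclusive) from an open filehandle.
# $. is Perl's built-in current input line number.
sub lines_between {
    my ($fh, $from, $to) = @_;
    my @out;
    while (my $line = <$fh>) {
        next if $. < $from;   # not there yet
        last if $. > $to;     # past the range, stop reading early
        push @out, $line;
    }
    return @out;
}

# Demo on an in-memory "file" of ten numbered lines.
my $text = join '', map { "line $_\n" } 1 .. 10;
open my $fh, '<', \$text or die "cannot open in-memory file: $!";
print lines_between($fh, 3, 5);
close $fh;
```

Stopping with last once the range is passed avoids reading the rest of a large file.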

I may want to expand the script at some stage to cover other bracket pairs, for example < and >, { and }, [ and ], but for now I am happy with just ( and ).
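
If more bracket types are added later, a stack is the usual way to check them together, since it also catches wrong nesting such as ( ] that a plain count would miss. This is only a sketch; the sub name brackets_balanced and the pair table are assumptions for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Map each opening bracket to the closer it requires.
my %CLOSER_FOR = ('(' => ')', '[' => ']', '{' => '}', '<' => '>');
my %IS_CLOSER  = map { $_ => 1 } values %CLOSER_FOR;

# Return 1 if every bracket in $text is closed in the right order, else 0.
sub brackets_balanced {
    my ($text) = @_;
    my @stack;                                  # closers we still expect
    for my $ch (split //, $text) {
        if (exists $CLOSER_FOR{$ch}) {
            push @stack, $CLOSER_FOR{$ch};      # remember the matching closer
        }
        elsif ($IS_CLOSER{$ch}) {
            my $expected = pop @stack;
            return 0 unless defined $expected && $expected eq $ch;
        }
    }
    return @stack ? 0 : 1;                      # leftovers mean unclosed brackets
}

for my $s ('(a [b] {c})', '(a [b)]', '((a)') {
    printf "%-12s %s\n", $s, brackets_balanced($s) ? 'OKAY' : 'MALFORMED';
}
```

Whether < and > belong in the table depends on the file: if they can appear on their own in the data, they would cause false reports.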

I tried a bracket checker from "[Perl] bracket checker - Pastebin.com", but it gives me the errors below when I run it:

./123.pl testfile01x.ora
Malformed UTF-8 character (unexpected continuation byte 0xbb) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xbb) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xbb) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xbb) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xbb) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xbb) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xbb) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xbb) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xbb) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xbb) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xab) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xbb) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xbb) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xbb) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xbb) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xbb) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xab) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xbb) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xbb) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xbb) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xbb) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xbb) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xbb) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xbb) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xab) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xbb) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xab) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xbb) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xab) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xab) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xbb) at ./123.pl line 8.
Malformed UTF-8 character (unexpected continuation byte 0xbb) at ./123.pl line 8.
Sequence (?|...) not recognized before HERE mark in regex m/(?| << HERE
    (\()(?&matched)([\}\]..�?>�??]|$) |
    (\{)(?&matched)([\)\]..�?>�??]|$) |
    (\[)(?&matched)([\)\}..�?>�??]|$) |
    (.)(?&matched)([\)\}\].�?>�??]|$) |
    (.)(?&matched)([\)\}\].�?>�??]|$) |
    (�)(?&matched)([\)\}\]..?>�??]|$) |
    (?)(?&matched)([\)\}\]..�>�??]|$) |
    (<)(?&matched)([\)\}\]..�?�??]|$) |
    (�)(?&matched)([\)\}\]..�?>??]|$) |
    (?)(?&matched)([\)\}\]..�?>�?]|$) |
    (?)(?&matched)([\)\}\]..�?>�?]|$) | at ./123.pl line 34.

Any suggestions will be much appreciated. Thanks in advance.

Try this module:

http://search.cpan.org/~jhi/perl-5.8.1/lib/Text/Balanced.pm
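
A minimal sketch of how the module's extract_bracketed function might be used for this. The sub name first_group and the sample strings are made up for illustration; the third argument is a prefix pattern to skip before the first bracket. In list context the function returns the extracted text first, or undef if no balanced group is found.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# Return the first balanced (...) group in $text, or undef if the
# parentheses are not balanced. '[^(]*' skips everything before the first '('.
sub first_group {
    my ($text) = @_;
    my ($extracted) = extract_bracketed($text, '()', '[^(]*');
    return $extracted;
}

# Hypothetical entries, for illustration only.
for my $entry ('name = (A = (B = c))', 'name = (A = (B = c)') {
    my $group = first_group($entry);
    printf "%-22s %s\n", $entry, defined $group ? 'OKAY' : 'MALFORMED';
}
```

To check a whole multi-line entry this way, its lines would need to be joined into one string first.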