Methods For Debugging Perl Problems

Note: I am not a programmer by profession, but I occasionally have to program.

I am looking for general methods and freely available tools for debugging problems during development of Perl scripts: anything that has really helped you track down problems you just couldn't find.

A couple of problems I'm seeing:

  • apparent corruption of $_[0], $_[1], etc.
  • regexp failures that should be matching

I have a lot of subroutines that I unit tested and found all to be working as expected. When I put everything together into a single script, I noticed unexpected results.

For the $_[0] case, I put in debug statements to see how the subroutine was executing and noticed that the value was correct on entry into the routine but then, when processing, it seemed as if it had been truncated (based on how the routine behaved). After I couldn't really figure out what was happening to it, I copied the parameters to temporary variables on entry to the subroutine and then used those variables throughout. And that solved that problem. But again, if I isolate the subroutine it works fine without the temporary variables.
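One possible explanation, offered only as a sketch since the original code is not shown: the elements of @_ are aliases for the caller's variables, not copies. If the caller passes a variable that something else overwrites while the subroutine is running, $_[0] changes with it. The names $current, advance, via_alias, and via_copy below are hypothetical and exist only for illustration.

```perl
use strict;
use warnings;

# Assumed setup: a file-level variable that other code also writes to.
my $current = "12345678901 some text";

# Hypothetical helper that overwrites the shared variable.
sub advance { $current = "123" }

# Reads its argument through $_[0], which is an alias for the caller's variable.
sub via_alias {
    my $on_entry = $_[0];          # snapshot taken on entry
    advance();                     # side effect changes $current
    return ($on_entry, $_[0]);     # $_[0] now shows the overwritten value
}

# Copies its argument on entry, so later side effects cannot touch it.
sub via_copy {
    my ($arg) = @_;                # private copy
    advance();
    return $arg;
}

my ($before, $after) = via_alias($current);
print "alias on entry: [$before]\n";
print "alias later:    [$after]\n";

$current = "12345678901 some text";    # reset for the second run
my $kept = via_copy($current);
print "copy later:     [$kept]\n";
```

This would be consistent with the subroutine working in isolation (nothing else touches the variable) and with the temporary variables fixing it, but it remains an assumption about code that is not shown here.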

I have a long list of regexps I'm searching for, and I know that the expressions themselves are good. Every one of them was tested to verify that it matched what it was supposed to match. But with everything together, the regexps are failing. And I'm talking about even simple regexps, like just matching an 11-digit number. More complicated ones are matching and then, on the same line, simple ones are failing to match. During my testing, I tested with multiple expressions on the same line and there was no problem, so I don't think it has anything to do with how many are on the line. But I don't know. I can't understand why the expressions are failing to match. Would it have anything to do with pos? One thing I did change (some time before I ever noticed any problem) was that, once I find a match, I perform a global substitution. I don't think that should be the source of this problem (but I will take out the global substitution in just a minute, just to see), as I am not doing global matches to try to identify the expressions and each attempt at a match uses a new block.
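On the pos question, a minimal sketch of when pos matters: a match with /g in scalar context remembers where it stopped, and the next /g match on the same variable resumes from there. A match without /g (and without \G) starts from the beginning of the string regardless. The sample line is made up for illustration.

```perl
use strict;
use warnings;

my $line = "id 12345678901 ref 98765432109";   # made-up sample data

# First /g match in scalar context: records its end position in pos($line).
my $first_ok  = scalar($line =~ /(\d{11})/g);
my $first     = $1;
my $resume_at = pos($line);

# Second /g match resumes after the first, so it finds the second number.
my $second_ok = scalar($line =~ /(\d{11})/g);
my $second    = $1;

# Third /g match finds nothing left; the failure resets pos to undef.
my $third_ok       = scalar($line =~ /(\d{11})/g);
my $pos_after_fail = pos($line);

# A match without /g ignores pos and scans from the start of the string.
my ($plain) = $line =~ /(\d{11})/;

print "first:  $first (pos now $resume_at)\n";
print "second: $second\n";
print "third matched: ", ($third_ok ? "yes" : "no"), "\n";
print "pos after failure: ", (defined $pos_after_fail ? $pos_after_fail : "undef"), "\n";
print "plain:  $plain\n";
```

So if none of the identifying matches use /g, pos is unlikely to be the culprit.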

Any debugging suggestions would be greatly appreciated.

------------------------------------------------------------

Edited to add:

I removed the global substitutions and it made no difference in the regexp matching. Some lines aren't even matching any of the regexps that are there so the global substitution never would have even come into play in those cases.

Another thing I just tried... inside my loop where I read the data file, I manually set the line variable to one of the lines in the file (thus ignoring the actual read-in lines) and the matches that are failing when the data is read from the file match this way. The fact that I can take the same file data and do a manual assignment and it will succeed, but it fails when actually reading the file, makes no sense because I just copied/pasted the data so I'm fairly sure there's nothing wrong with the data file.
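When pasted data matches but the same data read from the file does not, one thing worth ruling out is invisible characters (a trailing carriage return, a tab, a non-ASCII space) that do not survive copy and paste. The sketch below is an assumption about a possible cause, not a diagnosis; show_raw is a hypothetical helper and the DOS-style line ending is made up for the demonstration.

```perl
use strict;
use warnings;

# Hypothetical helper: show every character outside printable ASCII as a
# hex escape, so stray \r, \t, or non-ASCII characters become visible.
sub show_raw {
    my ($text) = @_;
    (my $shown = $text) =~ s/([^\x20-\x7e])/sprintf('\\x{%02x}', ord $1)/ge;
    return "[$shown]";
}

my $pattern   = qr/^\d{11}$/;          # assumed: an anchored 11-digit match
my $from_file = "12345678901\r\n";     # assumed: a line from a DOS-format file

print "raw line:     ", show_raw($from_file), "\n";
my $matched_raw = $from_file =~ $pattern ? 1 : 0;

# Strip a trailing LF or CRLF, then try again.
(my $cleaned = $from_file) =~ s/\r?\n\z//;
print "cleaned line: ", show_raw($cleaned), "\n";
my $matched_cleaned = $cleaned =~ $pattern ? 1 : 0;

print "matched raw: $matched_raw, matched cleaned: $matched_cleaned\n";
```

Printing each line through a helper like this just before the match makes it easy to compare what the script actually sees with what the editor displays.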

----------------------------------------------

Edited to add:

It just keeps getting better. Now I deleted the direct assignment and tried the file again. No joy. But then I put the direct assignment back into the loop and it no longer matches there either. Guess it's time to pull it all apart once again and have another go at it.

Write simple programs to test regexes! Well, and anything else. It's a quirky language, but not that unstable!

There may have been more than one thing happening in the program but I managed to get past the problem without really knowing what the issue was.

As I said, I had unit tested all of the components of the script and then I put them all together and added all the control. Turns out that (control) was where the problem was.

There wasn't anything wrong with the regexes (other than the fact that the nature of the data leads to false positive matches). The script processed several types of files, mapping customer-identifiable info to non-specific tokens. So I just decided to more or less start over. I started out processing only one type of file and commented out all the rest of the code that had nothing to do with that specific file type. What I found surprised me. There was a block with a LABEL/next LABEL construct, and that was going off the rails. I converted it into an until block and the program executed as I had expected it to. Interestingly, a LABEL/last LABEL construct block works fine.
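The actual code is not shown, so the following is only a guess at what may have happened. In Perl, a bare labeled block behaves like a loop that runs once: next LABEL leaves the block (just as last LABEL does) instead of repeating it, while redo LABEL restarts it. The labels and counters below are hypothetical.

```perl
use strict;
use warnings;

# next on a bare block: exits the block, so the body runs only once.
my $next_runs = 0;
TRY_NEXT: {
    $next_runs++;
    next TRY_NEXT if $next_runs < 3;
}

# redo on a bare block: restarts the body until the condition fails.
my $redo_runs = 0;
TRY_REDO: {
    $redo_runs++;
    redo TRY_REDO if $redo_runs < 3;
}

# The until loop states the repetition explicitly.
my $until_runs = 0;
until ($until_runs >= 3) {
    $until_runs++;
}

print "next: $next_runs, redo: $redo_runs, until: $until_runs\n";
# prints "next: 1, redo: 3, until: 3"
```

If the original block was meant to repeat, that would fit both the misbehaving next LABEL and the fix of converting it to an until loop.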

You can also use the Perl debugger for line-by-line execution, much like gdb.

perl -d <perl script>

regards,
Ahamed

If you write code that trusts no input, checks and logs errors, and leaves nothing to chance but does something for every logical possibility, then your challenges will be small enough for debug statements or side code tests. You, too, deserve a nice error log, one per run or with headings and exit trailers. That is why UNIX always provides fd 2 (stderr), which you can redirect with 2>log_file.
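A minimal sketch of that style, under stated assumptions: log_msg and extract_id are hypothetical helpers, and the input lines are made up. Every branch either returns a value or logs why it could not.

```perl
use strict;
use warnings;

# Hypothetical helper: write one timestamped line per event to STDERR.
sub log_msg {
    my ($level, $message) = @_;
    my $entry = sprintf "%s [%s] %s\n", scalar(localtime), $level, $message;
    print STDERR $entry;
    return $entry;
}

# Hypothetical helper: pull an 11-digit id out of a line, trusting nothing.
sub extract_id {
    my ($raw) = @_;
    if (!defined $raw) {
        log_msg('ERROR', 'undefined input line');
        return;
    }
    if ($raw =~ /\b(\d{11})\b/) {
        return $1;
    }
    log_msg('WARN', "no 11-digit id in: $raw");
    return;
}

log_msg('INFO', 'run started');                      # heading
for my $input ("acct 12345678901 ok", "acct 123 short", undef) {
    my $id = extract_id($input);
    print "found id $id\n" if defined $id;
}
log_msg('INFO', 'run finished');                     # exit trailer
```

Run with stderr redirected (2>log_file) and the log ends up in its own file, separate from the normal output.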

I mostly use gdb for core dumps in C/C++ development, to get a stack trace.

For apps with lots of system calls, strace/truss/tusc is great; it can even be used when code is already running and when you have no source! Sometimes you can set an option to have it report every C library call, which you can then relate to Perl actions.

Good luck.

Hi.

I try to follow the guidelines in:

Title: Perl Best Practices
Subtitle: Standards and Styles for Developing Maintainable Code
Author: Damian Conway
Date: 2005
Publisher: O'Reilly
ISBN: 0596001738
Pages: 500
Categories: perl, standard, development, scripting, programming
Comments: 4.5 stars (39 reviews, 2011.08) at Amazon.

See O'Reilly page: Perl Best Practices - O'Reilly Media

Amazon comments, review: Amazon.com: Perl Best Practices (9780596001735): Damian Conway: Books

Specifically:

Chapter 18 Testing and Debugging
   1.      Test Cases
   2.      Modular Testing
   3.      Test Suites
   4.      Failure
   5.      What to Test
   6.      Debugging and Testing
   7.      Strictures
   8.      Warnings
   9.      Correctness
  10.      Overriding Strictures
  11.      The Debugger
  12.      Manual Debugging
  13.      Semi-Automatic Debugging

cheers, drl