perl limitations vs. bash?

I've been building a bunch of bash scripts, and am thinking about "converting" to perl, and have a couple of questions first:

  1. Is there anything bash will do that perl won't?
  2. How steep is the learning curve?
  3. If perl's more powerful, why?
  4. I've built a small app in python, which seemed nice, but isn't there wider support depth (i.e. example code, # of users) in perl than in python?
  5. I'm building mostly sysadmin scripts. If I were going to do more web integration (think php/javascript) with my code, would perl or python be easier to build with, and which would be more powerful, or are they roughly the same?

This isn't meant to be a troll, I really would like to continue learning, and you guys have been very helpful learning bash, awk, sed, grep, etc. on multiple occasions :slight_smile:

Not really. perl's a more "complete" language. bash is better oriented to scripts.

Almost vertical.

It's a lot more extensible and supports complex data structures.
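For instance, here's a toy sketch (made up for illustration) of the kind of nested structure bash has no real equivalent for:

# A hash of arrays: service groups mapping to lists of daemons.
my %services = (
    web => [ 'httpd', 'nginx' ],
    db  => [ 'mysqld' ],
);
push @{ $services{web} }, 'lighttpd';   # grow one of the nested lists
print "web: @{ $services{web} }\n";     # web: httpd nginx lighttpd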

Perl support seems to be spotty, either very good or very bad. It may be I just never found the right place.

$ strace bash 2> bash-log
^C
$ strace perl 2> perl-log
^C
$ strace python 2> python-log
^C
$ grep '^open' python-log | wc
    149    1172   13744
$ grep '^open' perl-log | wc
      9      36     399
$ grep '^open' bash-log | wc
      7      28     323
$

Python tries opening 149 files to do nothing at all. For scripts that run once and terminate, python's an absolute pig. For things that stick around, that's tolerable.

I cannot compare Python, as I do not use snakes here. :wink:

Bash is easy and (in version 4) remarkably complete for a shell. Shell scripting starts with automating exactly what you would do on the command-line, so testing is easy. Scripting in a shell usually requires calling external utilities as tools to complete what that shell cannot. My BASH scripts often call ls, rm, grep, find, mv, awk, tr, sed, or other tools.

Perl is far more complete. The syntax is different, and less likely to match what you would do on the command-line. One of its great advantages is that every tool you would call in a script is either built into the Perl interpreter, or is a function that is simple to recreate in Perl. The result is that your Perl script only acts on data, and does not have to call external tools.
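For instance, here's a quick sketch of my own (the file and pattern are just placeholders) of grep/sort/tr-style work done with nothing but Perl built-ins:

#! /usr/bin/perl
# Work a shell script would farm out to grep(1), sort(1), and tr(1),
# done entirely inside the interpreter.
use strict;
use warnings;

open my $fh, '<', '/etc/passwd' or die "open: $!";
my @lines = <$fh>;
close $fh;

my @nologin = grep { /nologin/ } @lines;   # grep(1) -> the grep built-in
my @sorted  = sort @nologin;               # sort(1) -> the sort built-in
for (@sorted) {
    tr/a-z/A-Z/;                           # tr(1)   -> the tr/// operator
    print;
}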

I enjoy scripting many things in Bash: in particular scripts that start, stop, or signal services. Since these scripts deal with external services, there is no advantage in using Perl. When I automate something that has only to act and can act faster if external calls are eliminated, I use Perl.

I recommend both, and have never run into a support problem with either. The learning curve is TERRIBLY steep if you feel you have to MASTER perl, because there is a LOT of it! If you only need enough to solve problems in a better, more efficient manner than you can with BASH, then the learning curve is irrelevant: just learn enough to do the job. After using Perl for a few jobs you will find yourself feeling and performing like an expert.

Once you begin learning Perl, you will find yourself coding awk, sed, and tr filters far less often. You will even call Perl one-liners from your Bash scripts because they will seem faster and easier to implement.
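A few typical substitutions, to give you the flavor (the file name is just a placeholder):

perl -pe 's/foo/bar/g' file       # instead of: sed 's/foo/bar/g' file
perl -lane 'print $F[0]' file     # instead of: awk '{print $1}' file
perl -pe 'tr/a-z/A-Z/' file       # instead of: tr 'a-z' 'A-Z' < file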

No language or tool is a total answer to all problems, but the common description of Perl is the "Swiss Army Chainsaw": it cuts all of your problems down to size. You WANT it in your toolbox, you just do not know yet how much.

Perl can do everything BASH can do, a lot faster. Perl can also do many things which BASH cannot, e.g. connect to a legion of RDBMS implementations seamlessly with DBI, and support full OO application development.
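Here's a minimal DBI sketch so you can see the shape of it (the DSN, table, and credentials are placeholders, not anything real):

#! /usr/bin/perl
use strict;
use warnings;
use DBI;

# Swap 'dbi:mysql:...' for 'dbi:Pg:...' or 'dbi:SQLite:...' and the
# rest of the code is unchanged -- that portability is DBI's whole point.
my $dbh = DBI->connect( 'dbi:mysql:database=somedb', 'user', 'password',
                        { RaiseError => 1 } );

my $sth = $dbh->prepare(
    'SELECT city, country_code FROM sometable WHERE ref_num = ?' );
$sth->execute( 42 );

while ( my ( $city, $cc ) = $sth->fetchrow_array ) {
    print "$city ($cc)\n";
}
$dbh->disconnect;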

Many feel it is quite sharp. This is a good place to start: Learning Perl, 5th Edition (ISBN 9780596520106) by Randal Schwartz, Tom Phoenix, and brian d foy.

  1. Perl has stronger standard data types, excellent support for references and complex data structures, a rich set of built-in functions, and superb support for regular expressions.

  2. Perl is much faster than BASH (but then, so is almost everything else; time how long it takes each to count to 500,000,000, or to compute a factorial).

  3. Perl has unparalleled 3rd-party support, with external libraries available from CPAN (the Comprehensive Perl Archive Network). For instance, want to hash something with the Whirlpool algorithm? Just use Digest::Whirlpool right out of the box. You would not be so lucky in BASH, Python, or PHP, where you'd be implementing the algorithm on your own.
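Something like this is all it takes (a sketch; Digest::Whirlpool follows the standard new/add/hexdigest interface of the Digest:: family):

#! /usr/bin/perl
use strict;
use warnings;
use Digest::Whirlpool;    # installed from CPAN

my $wp = Digest::Whirlpool->new;
$wp->add( 'some secret payload' );
print $wp->hexdigest, "\n";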

There are 115 modules in the Digest:: space and 409 modules in Crypt::; compare that to PHP PEAR, for instance. It's not even close.

Compare CPAN to Python's package index, PyPI, and you are likely to find CPAN superior. Python really does not have a community-accepted one-stop shop for all serious 3rd-party modules. PHP PEAR is much more serious, but it still fails in comparison to CPAN. Knowledge of the pre-existing code on CPAN can make writing applications in Perl geometrically faster, even without a framework.

Frameworks are where Perl has been weak (compare Python's Django, Ruby on Rails, or PHP's Zend), until the advent of Catalyst, a fully featured Perl MVC framework which is catching on very rapidly.

Combine CPAN and Catalyst, and you are really cooking with propane.

Don't build sysadmin scripts in Python or PHP. Use Perl, BASH, awk, sed for those things.

For the web, PHP, Perl, and Python all have strengths and weaknesses. You could write a thesis on this topic, and some people do. I'll avoid that.

Summary: despite taking a lot of mostly undeserved guff, Perl remains an excellent choice for everything from one-line one-shot filesystem transforms to full-fledged MVC OO Web Applications.

If you can handle a steep learning curve and *symbolism*, e.g. $#array instead of something like LAST_INDEX_OF_THE_ARRAY, Perl will serve you well for many years to come.
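A taste of that symbolism, as a toy example:

my @hosts = qw( web1 web2 db1 );
print "$#hosts\n";             # 2   -- index of the last element
print scalar( @hosts ), "\n";  # 3   -- number of elements
print "$hosts[$#hosts]\n";     # db1 -- the last element itself
print "$hosts[-1]\n";          # db1 -- same thing, via a negative index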

Hope That Helps.


A steep learning curve means that more information can be learned over a short time. Most people believe, mistakenly, that a steep learning curve is bad. This probably comes from believing they have to climb it rather than it being a graph of knowledge vs time.

Perl does not have a steep learning curve; i.e., it will take you longer to learn than Bash. Perl has more features, five ways to solve any problem, and CPAN, which has a module for almost anything you can think of.

Perl is well worth learning as it is a very powerful and flexible language. This is in spite of its NON-steep learning curve.

Please See here, m1xram:

Learning curve - Wikipedia, the free encyclopedia

In particular, the part about the two competing interpretations of the phrase.

Spot on about Perl :b:

lol, Wikipedia. It "needs citation" because it is wrong. Would you prefer to learn less over a greater time, with a NON-steep learning curve?

The steeper the curve the more knowledge you get in less time. As the slope approaches infinity you instantaneously know more. Think Matrix upload to your brain.

I wish people would stop using this term; it's embarrassing.

m1xram, you are misunderstanding Wikipedia, I'm afraid.

The article/subsection I linked to says there are two interpretations of the term "steep learning curve". The first is yours, which says basically: "you can learn a lot about X quickly".

The second interpretation (which is more common, and is how people usually use the phrase) says basically: "how long until I get X mastered, or mostly mastered, or a real good grip on it, etc.?".

In common parlance, people would say that the learning curve of Chess is much steeper than that of Checkers or Tic-Tac-Toe because there is more to learn about Chess, and much more to learn to "master" it.

The learning curve on the phrase "learning curve" is steeper than I had previously thought :slight_smile:

I guess a practical example that I built in bash might help. It reads a text file (named in $ll) a line at a time, each line holding a semicolon-separated latitude/longitude pair, and looks up the nearest city and country code from the web.

cat "${@:-$ll}" |
        while read i
        do
                ref_num=$(($ref_num+1))
                lat=`echo $i | awk -F ';' '{print $1}'`
                lon=`echo $i | awk -F ';' '{print $2}'`
                url="http://ws.geonames.org/findNearbyPlaceName?lat=$lat&lng=$lon"
                                curl -s "$url" > xmlfile
                                city=`/usr/bin/xmlstarlet sel -t -m //geoname -v toponymName xmlfile`
                                country_code=`/usr/bin/xmlstarlet sel -t -m //geoname -v countryCode xmlfile`
                                mysql -u $USER_NAME --password=$PASSWORD -D "somedb" -e \
                                        "INSERT INTO somedb.sometable (id, ref_num, timestamp, lat, \
                                        lon, city, country_code) \
                                        VALUES (NULL,'$ref_num','$lat','$lon',\
                                        '$city','$country_code')";
        done

what kinds of things would I have to learn to build this in perl?

Ever read XKCD? Check this out: xkcd: Geohashing

Here's a Perl script that implements the algorithm in the comic:

#! /usr/bin/perl

use strict;
use warnings;

use LWP::Simple;
use URI::Escape;

use Finance::Quote;
use Digest::MD5;
use Math::BigInt;

# Join any command-line words into a starting address ('' when none given).
my $begin_from = join ' ', @ARGV;

# The comic's recipe: hash today's date together with the Dow's open.
my $q = Finance::Quote->new;
my %dow = $q->fetch( 'nyse', '^DJI' );
my $hashthis = $dow{'^DJI', 'isodate'} . '-' . $dow{'^DJI', 'open'};
printf "DOWDATE: %s\n", $hashthis;

my $o = Digest::MD5->new;
$o->add( $hashthis );
my $hash = $o->hexdigest;
printf "MD5: %s\n", $hash;

# Convert each half of the MD5 into six decimal digits, to be used as
# the fractional parts of the target coordinates.
my $x = '.' . substr ( Math::BigInt->from_hex ( '0x' . substr $hash, 0, 16), 0, 6);
my $y = '.' . substr ( Math::BigInt->from_hex ( '0x' . substr $hash, 16, 16), 0, 6);
my $addr = $begin_from || '100 S. Market St., Frederick MD';
$addr = uri_escape( $addr );
my $str = get( "http://rpc.geocoder.us/service/csv?address=$addr" );
my ( $start_lat, $start_long ) = split /,/, $str;
printf "START COORDS: %s %s\n", $start_lat, $start_long;

# Keep the integer part of the starting coords and append the
# hash-derived fractions.
$start_lat =~ /(.+?)\..+$/; my $new_lat_pref = $1;
$start_long =~ /(.+?)\..+$/; my $new_long_pref = $1;
my $geohash = $new_lat_pref . $x . ' ' . $new_long_pref . $y;
printf "GEOHASH: %s\n", $geohash;

In addition to getting lat-long coords, it also fetches real-time stock quotes and runs some fairly solid cryptographic hashing algorithms....
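And to answer your earlier question more directly: for your lat/long script you would mainly need to learn LWP (fetching URLs), a little regex work for the XML, and DBI. Here's a rough, untested sketch carrying over your table and variable names; I read the credentials from the environment, and the naive regexes stand in for xmlstarlet (a real XML module would be more robust):

#! /usr/bin/perl
use strict;
use warnings;

use LWP::Simple;
use DBI;

my $dbh = DBI->connect( 'dbi:mysql:database=somedb',
                        $ENV{USER_NAME}, $ENV{PASSWORD},
                        { RaiseError => 1 } );
my $sth = $dbh->prepare(
    'INSERT INTO somedb.sometable
         (id, ref_num, timestamp, lat, lon, city, country_code)
     VALUES (NULL, ?, NOW(), ?, ?, ?, ?)' );

my $ref_num = 0;
while (<>) {                        # reads the file(s) named on the command line
    chomp;
    my ( $lat, $lon ) = split /;/;  # the same ';' split your awk calls did
    my $xml = get( "http://ws.geonames.org/findNearbyPlaceName?lat=$lat&lng=$lon" );
    next unless defined $xml;

    my ( $city )         = $xml =~ m{<toponymName>(.*?)</toponymName>};
    my ( $country_code ) = $xml =~ m{<countryCode>(.*?)</countryCode>};

    $sth->execute( ++$ref_num, $lat, $lon, $city, $country_code );
}
$dbh->disconnect;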


Perl won't be usable as a login shell.

It's negative. The more you know, the more damage you do to your project's overall productivity :wink:

Depends on what power you measure.

Any other language is probably a better choice than Perl (ok, I'm too extreme here).

php is fine. python would be easier to build with than perl.

Write my geohash code in Python in fewer lines, and/or so that it executes more quickly, and post it please. No copying; the XKCD geohash is a well-known exercise by now. Then we will see. Where's your evidence for such grandiose and unfriendly statements?

You just say "Perl sucks" and repeat it five times, but don't give any examples or rebut any of the actual specific strong points that three other posters listed here.

Moreover, your general smugness towards Perl thumbs its nose at literally tens of thousands of people who have contributed extraordinary effort to the Perl community. PHP and Ruby would not exist were it not for Perl. Perl was the first platform-agnostic, run-your-code-on-anything language, and it pioneered so much of what we expect from a high-level language today.

By the way you can make a Perl program your login shell. Just look on CPAN at the Term:: space, for one. /bin/plsh :slight_smile:

One of the plusses I saw for perl was the depth and breadth of perl code out there in the world, which must, one can surmise, account for its usefulness. I'm just trying to pick from python/perl now that I know perl can do most of what I want it to do. It sounds like perl might be a little more versatile still (from what I'm hearing) for everything from scripting on up through web apps?

UncleCameron: Yes. This is an excellent point, and you stated it clearly. One of the great advantages of Perl is that it can go toe-to-toe with awk and sed for command-line work; toe-to-toe with BASH/KSH/etc. for system scripts; toe-to-toe with PHP for web development; and toe-to-toe with Python and even Java for full-fledged systems development.

That versatility is perhaps Perl's greatest strength. I can write any awk one-liner just as quickly and efficiently with perl. I can write a cronjob or the like just as quickly and efficiently as shell. I can write a web app just as quickly and efficiently as PHP or Ruby. And I can even write very large-scale systems stuff that will run with Java any day.

What other tool has such flexibility? None. Detractors of Perl will say it tries to be everything and fails at each; they never seem to back that up. Perl one-liners are often *better* than awk; Perl scripts are almost always better than their BASH equivalents; Perl web apps are often better than their PHP or Python equivalents; and so on. There is plenty of deployed, real-world code to back that up.

Moreover, Perl has the most dedicated and caring community of all programming languages. As I mentioned before, no language has had as many smart and talented people give so much of their time willingly.

Perl 6 has struggled, to say the least, and went off track. It really *did* bite off a bit too much and has lacked direction; in a perfect world we would all be running it by now. As a result, Perl fell behind in some important new areas of practical computing, like MVC. Too many people care passionately about Perl to let this go on, and they are working every day to fix it.

Frankly, I am extremely disappointed with the moderator's post, in which he engaged in straight-up "Linux vs. BSD vs. Windows" type discussion with no substance and plenty of misplaced condescension. I hope his colleagues will give him a warning for a clear violation of his own forum's rules :slight_smile:

(I'm just sick to death of hearing that Perl sucks and is useless when it runs a significant percentage of the entire world's financial transactions. Barclays, BofA, Credit Suisse, and others all use Perl. ;))

I'll have to step in here and ask everyone to please stop arguing over whether bash, Perl, Python, Ruby, PHP, Whitespace, or BF (sorry if I forgot to list your language of choice here) is better. The OP asked for opinions specific to Perl. If you can answer that, good. If you only answer to provoke someone thinking differently than you, stop it, or infractions will be handed out.

Starting now, I'll give out infractions, and remove every post not directly relating to the first post in this thread. If you want to argue, do it via PMs.

Okay, so let me explain my position again:

Why do you want to convert these scripts in the first place? Performance? Reliability? Portability? Curiosity? Fun?

Bash, like ksh93 (which I prefer) and other shells, is a scripting language that can be used both interactively and non-interactively. (Bourne) shell syntax was defined several decades ago and is still doing well; knowing it is all but mandatory when you work with Unix. Shell features have grown over the years, but shells still follow the Unix philosophy, where specialized programs communicating through stdin/stdout/stderr and error codes are combined to build complex tasks, instead of taking a one-size-fits-all approach.
Perl, on the other hand, isn't designed to be used interactively and certainly cannot be used as a login shell (of course you can use a program written in any language, including perl, as your login shell, but that's off topic; plsh is written in perl but isn't perl itself). Perl was initially designed to be an improvement over sed and awk, which is fine. IMHO, it derailed when it became a general-purpose programming language.
Unlike shells, which leverage them, Perl doesn't encourage reusing the available external utilities. Everything is simpler when using a dedicated Perl library, which explains the large span of perl modules available. This is an advantage when running in a Unix-hostile environment like Windows, but I don't see the point of building such a self-centric ecosystem on Unix / GNU/Linux.
Compared to competing general-purpose languages, perl is pretty famous for its cryptic syntax. Someone with a little background in programming can read and understand code written in a language he doesn't know with very little learning, or even none at all; perl is the big exception. It has many unique idioms you cannot guess or easily find in its documentation. While this might not be an issue for code you write and maintain yourself, it can prove disastrous for code that needs to be updated by someone else, years later, when the original author is no longer available. Almost anyone can pick up an old sysadmin shell script and adapt it to suit new needs; with perl, unless the code has been carefully written and commented to avoid that risk, the more productive way might be to rewrite the whole thing from scratch once nobody is able to understand it. That's what I meant by negative productivity; I wasn't referring to performance, which is a different point. If you really demand performance, just use C for the critical parts. If you want cross-platform portability, I would strongly prefer Java, which is much more readable and has an even larger community and industry support.

To summarize, I see perl as an addictive/elitist language which isn't worth converting existing scripts to. On the other hand, like most languages, it can be useful and efficient for some specific tasks and might still be a good choice for simple scripts called from the shell.

By the way, I'm not a mod here.


More and more of what I'm doing is advanced manipulation of data, so I'm using awk/sed/grep/whatever to bend my data and do things with it. I just have this nagging feeling with bash that I'm reaching the end of its capability (and what it was intended for) and may need to go some direction.

I learned C a little bit and also python. C seemed like you had to write a lot more code (definitely just my opinion, don't know whether it's based on anything) to get a given job done. Python was nice, but I have this nagging feeling it wasn't going to scale down to bash and also up to web apps, at least not yet, though I guess there's a pretty active community hacking on it right now. I like Java, but I'm not sure it has the flexibility of perl (another opinion of mine, possibly based on nothing).

So I thought I'd possibly learn and use perl for a couple of years, get some things done with it, and then look at python/java/whatever again. My theory is that the perl code I'd build could be made to "interface" with whatever language I may choose then, so I wouldn't lose anything, as long as I comment whatever I'm doing in my perl scripts. Feel free to correct any of my assumptions, I don't want to waste a couple of years. The comments so far have been VERY helpful; I appreciate both sides very much, and it helps me make a better choice.

You're not, really. A lot of the places you're using awk and so forth you're doing things that could be done with bash builtins, and you're still making some beginner mistakes like cat instead of redirection. If you don't mind, I'll improve your script a little. Give me a bit.


# Don't need cat here.  See http://www.partmaps.org/era/unix/award.html
# Don't need awk either.  Bash is perfectly capable of splitting its own input.
for FILE in "${@:-$ll}"
do
        while IFS=";" read lat lon
        do
                # You can also do operations more compactly like this
                ((ref_num++))

                url="http://ws.geonames.org/findNearbyPlaceName?lat=$lat&lng=$lon"
                curl -s "$url" > xmlfile
                # Could maybe have xmlstarlet print both arguments instead of 
                # running it twice?  Not completely sure.
                city=`/usr/bin/xmlstarlet sel -t -m //geoname -v toponymName xmlfile`
                country_code=`/usr/bin/xmlstarlet sel -t -m //geoname -v countryCode xmlfile`

                # We do not need to run mysql N files * M rows times!  One instance can run EVERY query.
                # We're using cat here so we can use a here document for easier formatting.
                # Also note the escaping of ' as \' in case of funny-named cities.
                city=${city//\'/\\\'}
                country_code=${country_code//\'/\\\'}
                cat <<EOF
INSERT INTO somedb.sometable (id, ref_num, timestamp, lat,
        lon, city, country_code)
        VALUES (NULL,'${ref_num}',NOW(),'${lat}','${lon}',
        '${city}','${country_code}');
EOF
        done < "$FILE"
done | mysql --batch -u $USER_NAME --password=$PASSWORD -D "somedb" 

You can do a lot with pipes. Bourne shell lets you put them wherever. The trick is not to use them for one-liners if you can help it (they're inefficient that way); they are excellent for manipulating arbitrary amounts of data: connect long-running processes or code sections with them, don't pipe a single word through sed. Here we're creating an entire list of insert operations and feeding it into one instance of mysql instead of running mysql N*M times. This is important since process creation times can be significant compared to other operations.

This is what I'd consider the biggest difference between shell scripting and perl, one of shell's most powerful features, and one of the last things people grasp about shells. To get a pipe in perl you have to make an explicit open call; it doesn't come naturally. Nor does perl understand redirection and pipes at the statement or code-block level, and so forth.
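For reference, the explicit Perl version looks something like this (a sketch only, with the same hypothetical credentials read from the environment):

#! /usr/bin/perl
# In Perl a pipe is an explicit open call, not something the syntax
# hands you at every statement the way the shell does.
use strict;
use warnings;

# Write end: feed SQL to one long-running mysql process.
open my $mysql, '|-', 'mysql', '--batch',
     '-u', $ENV{USER_NAME}, "--password=$ENV{PASSWORD}", '-D', 'somedb'
    or die "can't spawn mysql: $!";
print {$mysql} "INSERT INTO sometable (city) VALUES ('test');\n";
close $mysql or warn "mysql exited nonzero: $?";

# Read end: capture another program's output line by line.
open my $who, '-|', 'who' or die "can't spawn who: $!";
print "logged in: $_" while <$who>;
close $who;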

thanks for the tips, this is why I support the site, it's a great place to learn from the pros who've been there/done that :slight_smile: