ksh pattern matching

I am trying to use the pattern substitution operators as documented in O'Reilly's "Learning the Korn Shell", but they don't seem to work as advertised.

This works all right:

var='Regular expressions rules!'
$ echo ${var//e/#}
R#gular #xpr#ssions rul#s!

The docs say that !(expr) matches anything that doesn't match expr, but if I try to replace everything except the "e" character, it does not seem to work:

var='Regular expressions rules!'
$ echo ${var//!(e)/#}
#

Any idea?

Hi.

The newer shells, ksh and bash, have a lot of syntactical elements that are easily confused with one another.

The "Pattern Substitution Operators" syntax:

${variable_name}

can have a number of substitution operations with #, %, etc. They use the meta-characters, *, [], and ? -- page 123 ff, Learning the Korn Shell, 2nd Edition ("LTKS").

The "Patterns and Regular Expression" syntax uses:

*(exp), ?(exp), !(exp) ...

which correspond to the usual syntax we find in grep, etc:

grep "e*" ...

These patterns could be used within double brackets, for example:

if [[ $var == *!(e)* ]]

but not with string operator syntax (as far as I know) -- page 113 ff, 144 ff.

The ksh I use (pdksh, even on Solaris) notes a bad substitution for what I think is the right thing, but bash does it correctly in my opinion. Here's an example:

#!/bin/bash -
#!/bin/ksh -

# @(#) s1       Demonstrate string operators.

echo "(Versions displayed with local utility \"version\")"
version >/dev/null 2>&1 && version =o $(_eat $0 $1)

var='String operators rule!'
echo
echo " Replace e with _:"
echo ${var//e/_}

echo
echo " Replace everything except e with _:"
echo ${var//[^e]/_}

exit 0

Producing:

% ./s1
(Versions displayed with local utility "version")
Linux 2.6.11-x1
GNU bash 2.05b.0

 Replace e with _:
String op_rators rul_!

 Replace everything except e with _:
_________e__________e_

Perhaps someone will stop by with a better explanation or a better suggestion ... cheers, drl

They do partially work in my ksh version (1993-12-28 r):

var='jo mike and dave are good friends'

$ echo ${var//a?(re)/_}
# returns > jo mike _nd d_ve _ good friends

$ echo ${var//g*(o)/_}
# returns > jo mike and dave are _d friends

$ echo ${var//+(o)/_}
# returns > j_ mike and dave are g_d friends

$ echo ${var//@(jo|dave)/_}
# returns > _ mike and _ are good friends

All of these return what I expected, but I am trying to use !(exp) like the PCRE look-behind assertion (?<=exp). Still trying...

It matters which version of ksh you are using. ksh93 has the // syntax while ksh88 does not. I am not sure about pdksh. On Solaris, dtksh is a souped-up version of ksh93. With dtksh...

$ /usr/dt/bin/dtksh
$ set -o emacs
$
$ x=hello
$ echo ${x//l/X} ${x//[!l]/X}
heXXo XXllX
$
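
Since support for // depends on the shell, a script can probe for it before relying on it. This is only a rough sketch: the variable name has_subst is made up for illustration, and the probe runs in a subshell because some shells treat a bad substitution as a fatal error.

```shell
# Probe for ${var//pat/rep} support without risking the calling script.
# The subshell contains any fatal "bad substitution" error.
if (eval 'x=hello; [ "${x//l/X}" = heXXo ]') 2>/dev/null; then
    has_subst=yes
else
    has_subst=no
fi

echo "pattern substitution supported: $has_subst"
```

A script could then fall back to sed or tr when has_subst is no.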

I assume you are talking about section 4.3 (String Operators) of LTKS.

#!/usr/bin/ksh93

echo ${.sh.version}
var='A regular expressions test'

echo "1>  //e/#"
echo ${var//e/#}
echo "2>  //[^e]/#"
echo ${var//[^e]/#}
echo "3>  //+(e)/#"
echo ${var//+(e)/#}
echo "4>  //-(e)/#"
echo ${var//-(e)/#}
echo "5>  //?(e)/#"
echo ${var//?(e)/#}
echo "6>  //*(e)/#"
echo ${var//*(e)/#}
echo "7>  //!(e)/#"
echo ${var//!(e)/#}

This gives the following output:

Version M 1993-12-28 s+
1>  //e/#
A r#gular #xpr#ssions t#st
2>  //[^e]/#
###e######e###e########e##
3>  //+(e)/#
A r#gular #xpr#ssions t#st
4>  //-(e)/#
A regular expressions test
5>  //?(e)/#
###########################
6>  //*(e)/#
###########################
7>  //!(e)/#
#

Interesting! I am not sure what is going on.
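
One likely explanation for case 7, as far as I can tell: !(e) does not mean "one character that is not e". It matches any string other than "e", and the whole value is such a string, so the longest match starting at the first character swallows everything and is replaced by a single #. The bracket form [!e] matches exactly one character, which is why it gives the per-character result. A small sketch, assuming bash (with extglob turned on) or ksh93; the variable names whole and each are just for illustration:

```shell
# Assumes bash or ksh93. shopt exists only in bash, so a failure elsewhere is ignored.
shopt -s extglob 2>/dev/null

var='Regular expressions rules!'

# !(e) matches ANY string other than "e", including the entire value,
# so the longest match at the first position consumes everything.
whole=${var//!(e)/#}

# [!e] matches exactly one character that is not "e".
each=${var//[!e]/#}

echo "$whole"
echo "$each"
```

So for "everything except this character", the bracket expression is the tool to reach for; !(...) is more useful for excluding whole words or whole values.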

Hi.

That's quite an array of very different results. If I were aiming for portability (which I usually am), I'd probably use the old standby sed:

#!/usr/dt/bin/dtksh
#!/usr/bin/ksh -
#!/usr/xpg4/bin/sh -
#!/bin/ksh -
#!/bin/bash -

# @(#) s2       Demonstrate pattern matching in dtksh and sed.

echo "(Versions displayed with local utility \"version\")"
version >/dev/null 2>&1 && version =o $(_eat $0 $1)

var='Regular expressions rules!'
echo
echo " Replace e with _:"
echo ${var//e/_}

echo
echo " Replace everything except e with _:"
echo re ${var//!(e)/_}
echo fe ${var//[!e]/_}

echo
echo " Replace everything except e with _ using sed:"
echo "$var" | sed -e 's|[^e]|_|g'

exit 0

Producing:

$ ./s2
(Versions displayed with local utility "version")
SunOS 5.10
dtksh M-12/28/93d

 Replace e with _:
R_gular _xpr_ssions rul_s!

 Replace everything except e with _:
re _
fe _e______e___e__________e__

 Replace everything except e with _ using sed:
_e______e___e__________e__

The re and fe labels above stand for regular expressions and filename expressions. They are intended to show the different syntax, and how one works and the other does not.
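
For scripts that must run under shells without the // operator, the same masking can be done entirely with external tools. A minimal sketch using sed and, as an alternative I am adding here, tr with its -c (complement) option; the variable names via_sed and via_tr are only illustrative:

```shell
var='Regular expressions rules!'

# sed: replace every character that is not "e" with "_".
via_sed=$(printf '%s\n' "$var" | sed 's/[^e]/_/g')

# tr -c: complement the set "e" and map everything else to "_".
# printf '%s' sends no trailing newline, which tr would otherwise replace too.
via_tr=$(printf '%s' "$var" | tr -c 'e' '_')

echo "$via_sed"
echo "$via_tr"
```

Both forms avoid shell-specific parameter expansion, at the cost of spawning a process for each call.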

I got something out of this, namely dtksh, thanks to Perderabo. It's a bit tricky to find. I think it also carries a very large load of graphical baggage -- sort of like Tk (tcl/Tk). I dug through my very old pile of books, finding this:

and at almost 1,000 pages, you can tell there is a lot. The size on Solaris X86 is 620144, even larger than bash.

You pays your money and you takes your chances (quoting either the cartoon character Popeye or one of my previous bosses :-) ) ... cheers, drl