Regular expression to match multiple lines?

LessNux · January 18, 2014, 9:24am

Using a regular expression, I would like multiple lines to be matched.

By default, a period (.) matches any character except a newline. The (?s) and /s modifiers are supposed to make . match any character, including a newline.

However, the following two perl statements that use (?s) and /s failed to find a pattern spanning multiple lines.

perl -p -e 's/a(?s).*f/z/' srcfile > dstfile

and

perl -p -e 's/a.*f/z/s' srcfile > dstfile

where the content of srcfile is

abc
def
ghi
jkl

which can be created by

cat > srcfile << EOF
abc
def
ghi
jkl
EOF

I wanted the regular expression to match the string consisting of two lines that starts with "a" and ends with "f".

In other words, I wanted to replace

abc
def
with z.

So, I wanted dstfile to become

z
ghi
jkl

However, the above perl statements failed. The regular expressions in the above perl statements matched nothing, making dstfile identical to srcfile.

What went wrong?

What regular expression would match multiple lines?

How can a perl or bash command line find a pattern spanning multiple lines in srcfile, replace it with another, and save the modified text into dstfile?

Many thanks, in advance.

Scrutinizer · January 18, 2014, 9:54am

perl -pe operates line by line, so the pattern is only ever applied to one line at a time and cannot match across a newline.
You could try something like this (-00 is paragraph mode, so the match stays within a paragraph):

perl -00 -pe 's/a.*f/z/s' file

or (-0777 reads the whole file as one string, so the match can span the file):

perl -0777 -pe 's/a.*f/z/s' file

drl · January 18, 2014, 3:26pm

Hi.

Similarly, given file data1:

abc
def
ghi
jkl

and perl code p1:

#!/usr/bin/env perl

# @(#) p1	Demonstrate slurp and single-string match.

use strict;
use warnings;

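# $a is a special variable used by sort, so use a different name.
my $text = slurp();
$text =~ s/abc.*f/z/s;    # /s lets "." match newlines too
print $text;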

exit(0);

# Best practices, p213 for a file.
sub slurp {
  my $scalar = do { local $/; <> };
  return $scalar;
}

then:

$ ./p1 data1
z
ghi
jkl

See perldoc perlre for details.

Best wishes ... cheers, drl

LessNux · January 18, 2014, 6:57pm

Thank you for your replies, Scrutinizer and drl.

I do not understand what "@(#)", "p1" and "p213" mean in drl's script.

Does "@(#)" have anything to do with an array variable?

Do p1 and p213 mean page 1 and page 213 of a book or pdf document?

My copy of perlre does not have any page numbers printed.

Many thanks, in advance.

drl · January 18, 2014, 11:38pm

Hi.

The shell, perl, awk, etc. all ignore anything after an unquoted "#". The string "@(#)" is a special key that allows a one-line description of the script to be extracted. For example, with script p1 as input, the command what ./p1 will produce this on standard output:

p1	Demonstrate slurp and single-string match.

This is an old convention, but we have found it useful for generating local indices of scripts. We have written a script to do this, as well as to create the indices for our shop. You might find an Heirloom man page for the what command. See an example of the string in the Stack Overflow question "bash - shell script templates".
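If the what command is not installed, something similar can be approximated with sed. This is only a rough stand-in, not what itself: the file name p1_demo is made up, and the real command's output layout may differ.

```shell
# p1_demo is a hypothetical stand-in holding the same "@(#)" comment as p1.
printf '#!/usr/bin/env perl\n\n# @(#) p1\tDemonstrate slurp and single-string match.\n' > p1_demo

# Print whatever follows "@(#)" on any line, dropping leading blanks.
sed -n 's/.*@(#)[[:space:]]*//p' p1_demo
```

Looping that sed command over a directory of scripts is one simple way to build the kind of index described here.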

The string "p1" is the name of the file in which the perl script resides.

The string "p213" refers to page 213 of the book Perl Best Practices by Damian Conway.

Best wishes ... cheers, drl