How to combine lines within range of pattern

neg · September 5, 2009, 10:07am

I have a file containing, say:
line 1
line 2
(NP
line 3
line 4
line 5)
line 6

I want to combine the lines starting from (NP and ending with ) so that the result looks like:
line 1
line 2
(NP line 3 line 4 line 5)
line 6

I tried using

sed '/(NP/,/)$/ s/\n/ /'

but it's not working. Any help please?

Thanks in advance.

cfajohnson · September 5, 2009, 10:28am

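awk '
{printf "%s ", $0}       # print every line followed by a space, no newline yet
/\)$/ { print "" }       # a line ending in ")" closes the group: end the output line
/\(/,/\)$/ { next }      # still between "(" and ")": skip the newline below
{print ""}' file         # outside a group: end the output line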

ripat · September 5, 2009, 10:53am

Another way:

awk '/NP/ {ORS=" "} /\)$/ {ORS="\n"} {print}' file
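
For readers unfamiliar with ORS, here is a rough sketch of the same idea wrapped in a shell function. The name `join_ors` is made up for illustration. `print` appends the current output record separator, so switching ORS to a space on the opening line and back to a newline on the closing line joins the group:

```shell
# join_ors is a hypothetical wrapper name; it reads the named files or stdin.
join_ors() {
  awk '
    /NP/  { ORS = " " }    # from a line containing "NP" on, separate records with a space
    /\)$/ { ORS = "\n" }   # a line ending in ")" restores the newline before it is printed
    { print }              # print appends whatever ORS currently holds
  ' "$@"
}

printf '%s\n' 'line 1' 'line 2' '(NP' 'line 3' 'line 4' 'line 5)' 'line 6' | join_ors
```

Note that /NP/ matches "NP" anywhere on a line, so input that contains those letters outside a group would need a stricter pattern.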

neg · September 5, 2009, 10:56am

Thanks guys, it worked.
Cheers.

drl · September 5, 2009, 3:25pm

Hi.

If you like the /first/, /last/ notation of sed, there is a similar construct in perl, namely the range operator, "..". In the context of lines, it works the way you wanted the sed construct to work (it can also be used as a list generator). The perl script is a bit more verbose than the awk scripts, but it may be more readable:

#!/usr/bin/perl

# @(#) p2	Demonstrate line-join within specific range.

use warnings;
use strict;

my ($debug);
$debug = 0;
$debug = 1;

my ($in_sequence) = 0;
while (<>) {
  if ( /[(]NP/ .. /[)]$/ ) {
    $in_sequence++;
    chomp;
    print "$_ ";
    next;
  }
  elsif ($in_sequence) {
    $in_sequence = 0;
    print "\n";
  }
  print;
}

exit(0);

Assuming data is on file "data1", the script then produces:

% ./p2 data1
line 1
line 2
(NP line 3 line 4 line 5) 
line 6

The parentheses are special in the matching operation, so we escape them. One way is to precede them with a backslash, but some people prefer the readability of single characters in square brackets.

We go through the file and whenever we are in the appropriate range, we print the line without the newline and move on to the next line. If we are outside the range, we check whether we have just completed a join and, if so, print a newline; in either case we then print the current line and loop. There is also a triple-dot operator, research on which is left as an exercise for the reader.

Best wishes... cheers, drl
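
The flag-based logic described in this post can also be sketched in awk, for anyone who would rather stay in the shell. This is only an illustration under the thread's assumptions (a group opens on a line containing "(NP" and closes on a line ending in ")"). The function name `join_range` and the variables `inside` and `buf` are invented here. It buffers the group instead of printing piece by piece, so the joined line carries no trailing space:

```shell
# join_range is a hypothetical helper; it reads the named files or stdin.
join_range() {
  awk '
    /\(NP/ { inside = 1 }                      # group opens on a line containing "(NP"
    inside {
      buf = (buf == "" ? $0 : buf " " $0)      # append the line to the joined text
      if (/\)$/) {                             # group closes on a trailing ")"
        print buf
        buf = ""
        inside = 0
      }
      next
    }
    { print }                                  # lines outside a group pass through
    END { if (buf != "") print buf }           # flush a group that was never closed
  ' "$@"
}

join_range <<'EOF'
line 1
line 2
(NP
line 3
line 4
line 5)
line 6
EOF
```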

neg · September 6, 2009, 12:41am

Thanks drl, I appreciate your consideration. I've had enough replies from all of you.
Cheers.

durden_tyler · September 6, 2009, 2:36am

Yet another solution:

$ 
$ cat f1
line 1
line 2
(NP
line 3
line 4
line 5)
line 6
$ 
$ 
$ ##
$ perl -ne 'BEGIN{$x=0} chomp;
>           if (/^\(/){$x=1; $c=1}
>           elsif (/\)$/){$x=0; $c=0}
>           elsif ($x==1){$c=1}
>           else {$c=0}
>           if ($c == 1) {printf("%s ", $_)}
>           else {printf("%s\n", $_)}' f1
line 1
line 2
(NP line 3 line 4 line 5)
line 6
$ 
$

tyler_durden

summer_cherry · September 6, 2009, 10:44pm

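# Paragraph mode: DATA has no blank lines, so one read returns all of it.
local $/="";
my $str=<DATA>;
# Turn a newline into a space when a ")" follows with no "(" in between.
$str=~s/\n(?=([^\(\n]*\n*)*\))/ /g;
print $str;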
__DATA__
line 1
line 2
(NP
line 3
line 4
line 5)
line 6
(
line 7
line 8
)
line 9

mrtiller · September 7, 2009, 2:04pm

If you have your heart set on doing it in sed:

sed '
/(NP/ {
:loop
N
s/\n/ /
/)$/ !b loop
}
' file

I've always had problems using "\n" on the left side of "s" commands, but if it's embedded in the pattern space it seems to work. Go figure.
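
A likely explanation, sketched below: sed strips the trailing newline from each line before placing it in the pattern space, so `s/\n/ /` has nothing to match until a command such as N appends another line with an embedded newline. The variable names `sample`, `first_try` and `looped` are only for illustration:

```shell
# Sample input from the first post.
sample=$(printf '%s\n' 'line 1' 'line 2' '(NP' 'line 3' 'line 4' 'line 5)' 'line 6')

# The original attempt: each line reaches the pattern space without its
# newline, so s/\n/ / never matches and the text passes through unchanged.
first_try=$(printf '%s\n' "$sample" | sed '/(NP/,/)$/ s/\n/ /')

# The loop: N appends the next input line after an embedded newline,
# which s/\n/ / then replaces; repeat until the pattern space ends in ")".
looped=$(printf '%s\n' "$sample" | sed '/(NP/ {
:loop
N
s/\n/ /
/)$/ !b loop
}')

[ "$first_try" = "$sample" ] && echo "range version left the input unchanged"
printf '%s\n' "$looped"
```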