Multi-line filtering based on multi-line pattern in a file

Finja · May 16, 2012, 10:12am

I have a file with data records separated by multiple equals signs, as below.

==========
RECORD 1
==========
RECORD 2
DATA LINE
==========
RECORD 3
==========
RECORD 4
DATA LINE
==========
RECORD 5
DATA LINE

I need to filter out all data from this file where the record has no data lines. So in the example above, I would need RECORD 1 and RECORD 3 to be filtered out, plus the extra rows of equals signs separating them. So I would like the output to be:

==========
RECORD 2
DATA LINE
==========
RECORD 4
DATA LINE
==========
RECORD 5
DATA LINE

I can get what I need done on single lines, but I am stuck trying to match the pattern over multiple lines and then delete multiple lines.

neutronscott · May 16, 2012, 10:39am

awk works with records, and you can split them on a delimiter other than a newline. That's what the RS variable is for.

$ awk 'BEGIN{RS=ORS="==========\n";FS="\n"}NF>2' input
RECORD 2
DATA LINE
==========
RECORD 4
DATA LINE
==========
RECORD 5
DATA LINE
==========
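
Why NF>2 works: with FS set to a newline, a header-only record such as "RECORD 1\n" splits into the header plus one empty trailing field (NF is 2), while any record with a data line has at least three fields. The output above puts the separator after each record, whereas the question asked for it before each one. A minimal sketch of that variant follows. It assumes an awk that accepts a multi-character RS (gawk and mawk do; POSIX only specifies a single-character RS), and it recreates the sample data in a file named "input" purely for illustration:

```shell
# Recreate the sample data from the question; the file name "input" is an assumption.
cat > input <<'EOF'
==========
RECORD 1
==========
RECORD 2
DATA LINE
==========
RECORD 3
==========
RECORD 4
DATA LINE
==========
RECORD 5
DATA LINE
EOF

# Same filter as above, but print the separator before each kept record.
# $0 still ends with its own newline, so printf needs no extra "\n".
out=$(awk 'BEGIN { RS = "==========\n"; FS = "\n" }
           NF > 2 { printf "==========\n%s", $0 }' input)
printf '%s\n' "$out"
```

Using printf instead of ORS keeps the separator out of the end of the output, so the last record is not followed by a stray row of equals signs.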

bakunin · May 16, 2012, 10:42am

Use awk.

You can reassign the special variable "RS", which separates records and is a newline by default, to some arbitrary value, in your case the row of equals signs.
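
One caveat: POSIX only specifies the behaviour of a single-character RS, so a multi-character value relies on an extension (gawk and mawk both have it). If that is a concern, a line-by-line sketch along these lines avoids RS altogether. It assumes every separator line is exactly ten equals signs; the names emit, buf and n, and the file name "input", are illustrative and not from this thread:

```shell
# Recreate the sample data from the question; the file name "input" is an assumption.
cat > input <<'EOF'
==========
RECORD 1
==========
RECORD 2
DATA LINE
==========
RECORD 3
==========
RECORD 4
DATA LINE
==========
RECORD 5
DATA LINE
EOF

out=$(awk '
# Print the buffered record only if it holds a header plus at least one data line.
function emit() {
    if (n > 1) printf "==========\n%s", buf
    buf = ""; n = 0
}
/^==========$/ { emit(); next }        # separator line: close the current record
               { buf = buf $0 "\n"; n++ }  # any other line: append to the buffer
END            { emit() }              # the last record has no trailing separator
' input)
printf '%s\n' "$out"
```

This buffers one record at a time and decides at each separator whether to keep it, so it behaves the same in any POSIX awk.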

I hope this helps.

bakunin

/Edit: neutronscott beat me to it.
