Pattern matching in file and then display 10 lines above every time

namishtiwari · January 29, 2008, 6:00am

Hi,

I have to write a shell script like this:

I have a huge log file named abc.log. I have to search for a pattern named "pattern"; it may occur 1000 times in the log file. Every time it finds the pattern, it should display the 10 lines above the pattern.
I appreciate your help.

mirusnet · January 29, 2008, 6:24am

 grep -B 10 "pattern" file

namishtiwari · January 29, 2008, 6:28am

It reports "illegal option -B"; I am using HP-UX.

ennstate · January 29, 2008, 8:05am

Most Linux grep versions support that option; I guess others don't.

I have a script to do the job:

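#!/bin/ksh
# mygrep: print each matching line preceded by up to $Leading lines.
Usage() {
 echo "Usage: mygrep pat files.."
 exit 1
}

(( $# < 2 )) && Usage

Leading=10

PAT=$1
shift

for FileName in "$@" ; do
  # grep -n prefixes every hit with "lineno:"; keep only the number
  for LineNo in $( grep -n -e "$PAT" "$FileName" | cut -d":" -f1 )  ; do
     From=$((LineNo-Leading))
     (( From < 1 )) && From=1    # clamp for matches near the top of the file
     sed -n "${From},${LineNo}p" "$FileName"
  done
done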

Thanks
Nagarajan G

drl · January 30, 2008, 7:24pm

Hi.

If I understand the problem and the solution from ennstate, then sed will be loaded up 1000 times. Here is an alternate solution. The attachment is a perl script, pvg (perl version of grep, very limited edition). Assuming that you have perl, this will do simple matching and it will also do the "-B n" behavior of GNU grep. This can then be called once to process the log file. (Get the attachment, rename it to "pvg", then run the test script.) Execute "./pvg -h" for a help page.

Here's a test script:

#!/bin/bash -

# @(#) s3       Demonstrate printing of text before a matched line.

echo "(Versions displayed with local utility \"version\")"
version >/dev/null 2>&1 && version =o $(_eat $0 $1)

P=./pvg

echo
echo " Input file data3:"
cat data3

echo
echo " Looking for d, normal search:"
$P d data3

echo
echo " Looking for d, print previous 2:"
$P -B 2 d data3

echo
echo " Looking for d, print previous 2, separator:"
$P -s '     ----- \n' -B 2 d data3

echo
echo " Looking for ask, print previous 100, quiet:"
$P -q -B 100 ask data3

exit 0

Producing:

% ./s3
(Versions displayed with local utility "version")
Linux 2.6.11-x1
GNU bash 2.05b.0

 Input file data3:
     1  Alabama AL
     2  Alaska AK
     3  Arizona AZ
     4  Arkansas AR
     5  California CA
     6  Colorado CO
     7  Connecticut CT
     8  Delaware DE
     9  District of Columbia DC
    10  Florida FL

 Looking for d, normal search:
     6  Colorado CO
    10  Florida FL
 ( Lines read: 10; hits: 2 )

 Looking for d, print previous 2:
     4  Arkansas AR
     5  California CA
     6  Colorado CO
     8  Delaware DE
     9  District of Columbia DC
    10  Florida FL
 ( Lines read: 10; hits: 2 )

 Looking for d, print previous 2, separator:
     4  Arkansas AR
     5  California CA
     6  Colorado CO
     -----
     8  Delaware DE
     9  District of Columbia DC
    10  Florida FL
 ( Lines read: 10; hits: 2 )

 Looking for ask, print previous 100, quiet:
     1  Alabama AL
     2  Alaska AK

It can be a lot of work to make sure that perl and friends are installed, but once it's done, then you can use it for many things. Most systems have it already available.

There is also cgrep at freshmeat.net (Project details for cgrep), which has a wealth of features. You need to compile it, but that was fairly painless.

Best wishes ... cheers, drl

KevinADC · January 30, 2008, 7:59pm

The English module is known to create inefficiencies in perl programs. See the English man page for details: English - perldoc.perl.org

I am not sure if it affects the code you posted drl but you may want to take a look.

KevinADC · January 30, 2008, 8:05pm

If I was going to use perl I would just use Tie::File, then you can treat the file like an array and use array subscripting to backtrack 10 lines. But he seems to want a shell script so I can't help with that unless calling a perl script is OK.

ghostdog74 · January 30, 2008, 8:35pm

not tested for performance

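awk '
{ a[NR] = $0 }                       # remember the current line
/pattern/ {
   for (i = 10; i >= 1; i--)         # print up to 10 lines above the match
      if (NR - i >= 1) print a[NR-i]
}
{ delete a[NR-10] }                  # line NR-10 is never needed after line NR
' "file"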

drl · January 30, 2008, 8:59pm

Hi, KevinADC.

I looked over the reference, and added the sequence qw( -no_match_vars ) to my template. The warning is about a degradation in RE matching specifically. I added it to the short code I posted and it didn't seem to make a noticeable difference in a quick test (2 lines matched from the US Constitution from about 1000 lines). On the other hand, it's an easy thing to add.

Thanks for making me aware of that. Damian Conway does mention that in Perl Best Practices, but I haven't implemented all of that yet ... cheers, drl

summer_cherry · January 31, 2008, 12:50am

Hi
This one may be OK for you, just try it!

nawk '{
line[NR]=$0
if (index($0,"pattern")!=0)
for (i=(NR>10 ? NR-10 : 1);i<=NR;i++)
print line[i]
}' filename

Karthikeyan_113 · January 31, 2008, 1:24am

Just give it a try and see.

for x in $(grep -n "pattern" filename | cut -d":" -f1)
do
# the first $x lines end at the match; keep the match plus the 10 lines above it
head -n "$x" filename | tail -n 11
done

It works, but there may be a performance issue.

KevinADC · January 31, 2008, 1:57am

I think perl would be good for this; it certainly should be easy. Tie::File is good for working with large files since it does not read the whole file into memory, but any change made to the tied array is written back to the original file, so make sure to open the file in read-only mode. A preliminary script:

#!/usr/bin/perl
use strict;
use warnings;
use Tie::File;
use Fcntl 'O_RDONLY';

tie my @file, 'Tie::File', 'path/to/file', mode => O_RDONLY
    or die "Can't open path/to/file: $!";
foreach my $i (0 .. $#file) {
    if ($file[$i] =~ /pattern/) {
        my $from = $i < 10 ? 0 : $i - 10;   # clamp for matches near the top of the file
        print qq{"pattern" found on line },$i+1,"$/";
        print qq{previous ten lines:$/};
        print map{"$_$/"} @file[$from .. $i-1], "----------------------------";
    }
}

I am not sure how efficient/inefficient this is though.

ghostdog74 · January 31, 2008, 4:24am

I can see that if there are more than 1000 results, it will call head and tail on the file more than 1000 times.
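
The repeated head/tail (or sed) calls can be avoided by reading the log once and keeping only the most recent lines in a small buffer. A minimal sketch, assuming a POSIX awk is available (on HP-UX that may be awk or nawk) and that N is at least 1; the function name before_context and its argument order are made up for illustration and do not come from any post in this thread:

```shell
#!/bin/sh
# before_context N PATTERN FILE
# Hypothetical helper: print up to N lines above each line matching the
# awk regular expression PATTERN, then the matching line, in a single pass.
before_context() {
    awk -v n="$1" -v pat="$2" '
        $0 ~ pat {
            # Replay the buffered lines, oldest first.
            first = NR - n
            if (first < 1) first = 1
            for (i = first; i < NR; i++) print buf[i % n]
            print
        }
        { buf[NR % n] = $0 }   # reuse the slot of the line that just fell out of range
    ' "$3"
}

# Usage: before_context 10 pattern abc.log
```

Only N lines are held in memory at any time, so the size of the log should not matter much. Matches that sit closer together than N lines will print some lines more than once.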

ohagar · October 20, 2008, 8:48pm

GNU grep has the -B option (lines before) and a -A option (lines after) as well. You may find it on your box as ggrep or something like that; otherwise you could certainly get GNU grep from here:
Installing GCC: Binaries - GNU Project - Free Software Foundation (FSF)

hagar the horrible
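
One way to find out whether a context-capable grep is already installed is to try the option on an empty input and look at the exit status. A rough sketch; pick_grep is a hypothetical name, and "ggrep" is only the guess mentioned above, so the actual command name on a given box may differ:

```shell
#!/bin/sh
# pick_grep: print the name of the first candidate that accepts -B,
# or print nothing and return 1 if none does.
pick_grep() {
    for g in grep ggrep; do
        # grep exits 0 (match) or 1 (no match) when the option is accepted;
        # a status above 1 means an error such as "illegal option".
        if "$g" -B 1 x /dev/null >/dev/null 2>&1 || [ $? -eq 1 ]; then
            echo "$g"
            return 0
        fi
    done
    return 1
}

# Usage: g=$(pick_grep) && "$g" -B 10 pattern abc.log
```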

KevinADC · October 20, 2008, 9:38pm

After 9 months the thread is a bit stale but maybe your post will still be helpful.

summer_cherry · October 20, 2008, 10:44pm

The sed command below prints the 3 lines above each line that matches 'pat'.

sed -n '/pat/ !{
1{
   h
}
1 !{
   H
}
}
/pat/ {
H
x
s/\(.*\)\n\(.*\)\n\(.*\)\n\(.*\)\n\(pat\)/\2 \3 \4/
p
}' filename

sethcoop · October 20, 2008, 10:50pm

What about a simple little perl script? The 10 lines above the word "total" will be printed; it reads a file called file.txt.

#!/usr/local/bin/perl
@array;
open(F, "file.txt") or die "cannot read file";
while(<F>) {
  chomp;
  $my_line = "$_";
 
  if ("$my_line" =~ "total") {
    foreach(@array){
      print "$_\n";
    }
    print "========================================================\n"
  }
  push(@array,$my_line);
  if ("$#array" > "9") {
    shift(@array);
  }
};

digitalrg · April 8, 2009, 8:27am

Thank you sethcoop, but can you please tell me how to print the 10 lines below each occurrence of the word "total"? Please help.
Thanks

sethcoop · April 8, 2009, 10:56am

Here is the code to print the number of lines below your search... you can change the variable $lines to be how many lines you want to print after the word total.

Hope this helps!

#!/usr/bin/perl
$lines = 10;
$x = $lines;
open(F, "file.txt") or die "cannot read file";
while(<F>) {
  chomp;
  $my_line = "$_";
  if ("$x" < "$lines") {
    print "$my_line\n";
    $x++;
  }
  if ("$my_line" =~ "total") {
    $x = 0;
  }
};

KevinADC · April 8, 2009, 11:49am

I know sethcoop is trying to help but his perl code is not well written, here is what you want to do:

#!/usr/bin/perl
use strict;
use warnings;
my $lines = 10;
open(F, "file.txt") or die "cannot read file";
while(<F>) {
  if (/total/)  {
      for (1..$lines) {
          my $next = <F>;
          print $next if defined $next;   # nothing left to print at end of file
      }
      last;
  }
}
close(F);

A couple of things. If you need to find "total" more than once, remove the "last" command; otherwise it quits after the first match. Note that the lines printed after a match are read inside the "if" block, so they are not tested for "total" themselves. To match "total" only as a whole word, add the \b anchor to the regexp so you don't get false substring matches for words like "totals":

if (/\btotal\b/)  {

Add "i" to the regexp to make it case-insensitive if necessary, so "TOTAL" and "total" will both match:

if (/\btotal\b/i)  {
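
Since the original question asked for a shell script, the "lines below" case can also be handled with a small counter in awk. A sketch only, assuming a POSIX awk; after_context is a hypothetical name chosen for illustration:

```shell
#!/bin/sh
# after_context N PATTERN FILE
# Hypothetical helper: print the N lines that follow each line matching
# the awk regular expression PATTERN (the matching line is not printed
# unless it falls inside the window of an earlier match).
after_context() {
    awk -v n="$1" -v pat="$2" '
        left > 0 { print; left-- }   # still inside the window of an earlier match
        $0 ~ pat { left = n }        # (re)start the window after every match
    ' "$3"
}

# Usage: after_context 10 total file.txt
```

Every line is tested against the pattern here, so a second "total" inside the 10-line window simply restarts the count.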