Grep lines between two specific words after matching pattern

grep specific number of lines from file after matching pattern

I want to grep all the lines from the keyword 'start' to the keyword 'end' around a matching pattern/number, 12345.

Is it possible?
Thanks in advance

Assuming Linux:

a=3
b=3
grep -A$a -B$b  '12345'  somefile

This prints the 3 lines before and the 3 lines after each line where the keyword is found: 7 lines in total, including the line containing '12345'.
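To make the line count concrete, here is a small sketch run against a made-up 11-line sample file (the file contents and the `sample` and `context` variable names are assumptions for illustration only). It relies on the `-A` and `-B` options as provided by GNU grep.

```shell
# Hypothetical sample file: 11 lines, with the keyword on line 6.
sample=$(mktemp)
printf '%s\n' line1 line2 line3 line4 line5 12345 line7 line8 line9 line10 line11 > "$sample"

a=3   # lines of context after the match
b=3   # lines of context before the match

# Capture the match plus its surrounding context.
context=$(grep -A"$a" -B"$b" '12345' "$sample")
printf '%s\n' "$context"

rm -f "$sample"
```

If the match sits closer than 3 lines to the top or bottom of the file, fewer context lines are printed.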

Hello sagar_1986,

Could you please share the efforts you have made to solve your own problem?
We encourage users to learn coding on this forum, so please do share them.

Thanks,
R. Singh

Hello RavinderSingh13 ,

Actually, I know how to get a fixed number 'n' of lines before and after the matching pattern, but here the issue is that the positions of 'start' and 'end' are not fixed.
Here I want to grep from the first occurrence of 'start' before the matching pattern to the first occurrence of 'end' after the matching pattern, and I don't have any idea how to do this. Could you please help?

Hello sagar_1986,

Again, you are missing the point. The request is to add your efforts in the form of code, so kindly do so and let us know.

Thanks,
R. Singh

awk '/start/{flag=1} flag; /end/{flag=0}' sample.txt  
awk '/start/,/123456/,/end/'  sample.txt 
sed '/start.*123456.*end/!d' sample.txt
sed -n '/start.*123456.*end/p' sample.txt
 
sed -e '/./{H;$!d;}' -e 'x;/123456/!d;' sample.txt
sed -e '/./{H;$!d;}' -e 'x;/start/!d;/123456/!d;/end/!d' sample.txt 

So how do I get the lines that contain all three matching patterns?


You can see one way of doing it in post #2.

If it's not a big file, a simple-to-understand but clunky way is to use the output of grep -n "start" $filename and grep -n "end" $filename to get the record numbers to search between, and then perhaps a sed -n "$start_line,$end_line"p $filename

This would be slow with a very large file, though, because you would read it all three times.

Does this help, or is your file big enough to warrant a solution that just reads it once?
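A minimal sketch of that approach, assuming a file with a single start/end pair (the sample contents and the `block` variable are made up for illustration):

```shell
# Hypothetical sample file with one start/end pair.
filename=$(mktemp)
printf '%s\n' top start 12345 data end bottom > "$filename"

# grep -n prefixes each hit with "LINENUMBER:"; keep only the number
# of the first hit so each variable holds exactly one integer.
start_line=$(grep -n 'start' "$filename" | head -n 1 | cut -d: -f1)
end_line=$(grep -n 'end' "$filename" | head -n 1 | cut -d: -f1)

# Double quotes let the shell expand both variables before sed sees them.
block=$(sed -n "${start_line},${end_line}p" "$filename")
printf '%s\n' "$block"

rm -f "$filename"
```

If either variable ends up empty or holds more than one number, sed receives an invalid address and fails, so it is worth checking both before the final command.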

Kind regards,
Robin

Do you want the End pattern excluded? Try

awk '/Start/,/End/ {if (/12345/) P = 1; if (/End/) P = 0} P' file
12345
.
.

EDIT: included?

awk '/Start/,/End/ {if (/12345/) P = 1; if (P) print; if (/End/) P = 0}' file
12345
.
.
End

If there are more pattern pairs in the file (which is neither specified nor found in the sample input), we need to rethink.

Dear rbatte1,

Dear RudiC,

As per your suggestion, I have tried the solution on the sample file, and the output is like this:

grep -n "start" $filename
grep -n "stop" $filename

There will be multiple matching patterns, hence I need to grep with 3 matching patterns.

grep -zoP "(?s)Start.*12345.*End" file
awk '/Start/{p=$0; next} p{p=p ORS $0} /End/{if(p~/12345/)print p; p=x}' file

Or, using vertical real estate:

awk '
  /Start/ {
    buffer=$0
    next
  } 
  buffer {
    buffer=buffer ORS $0
  } 
  /End/ {
    if(buffer~/12345/) print buffer
    buffer=""
  }
' file 

The previous one did not work for me.
Perhaps that is because testing buffer as a condition makes assumptions about its contents that can go wrong.
It is better to have a separate state variable (here: buffer_on).

awk '
  buffer_on {
    buffer=buffer ORS $0
  }
  ! buffer_on && /Start/ {
    buffer_on=1
    buffer=$0
  }
  buffer_on && /End/ {
    if (buffer~/12345/) print buffer
    buffer_on=0
  }
' file
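To see what the state-variable version prints, here is a sketch run against a made-up input with three Start/End blocks, only one of which contains 12345 (the sample contents and the `sample` and `result` names are assumptions for illustration):

```shell
# Hypothetical sample: three Start/End blocks, the second holds 12345.
sample=$(mktemp)
printf '%s\n' Start aaa End Start bbb 12345 ccc End Start ddd End > "$sample"

result=$(awk '
  buffer_on {                      # inside a block: append the current line
    buffer = buffer ORS $0
  }
  ! buffer_on && /Start/ {         # block opens: begin buffering with this line
    buffer_on = 1
    buffer = $0
  }
  buffer_on && /End/ {             # block closes: print only if it holds 12345
    if (buffer ~ /12345/) print buffer
    buffer_on = 0
  }
' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

Only the middle block is printed, with both its Start and End lines, and the file is read once.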

--- Post updated at 12:07 ---

Introducing a separator variable:
one can include/exclude the Start and/or End pattern by simply changing the order of the 3 code blocks.

Start excluded, End included:

awk '
  buffer_on {
    buffer=buffer ors $0
    ors=ORS
  }
  ! buffer_on && /Start/ {
    buffer_on=1
    buffer=ors=""
  }
  buffer_on && /End/ {
    if (buffer~/12345/) print buffer
    buffer_on=0
  }
' file

Start and End both included:

awk '
  ! buffer_on && /Start/ {
    buffer_on=1
    buffer=ors=""
  }
  buffer_on {
    buffer=buffer ors $0
    ors=ORS
  }
  buffer_on && /End/ {
    if (buffer~/12345/) print buffer
    buffer_on=0
  }
' file

Start included, End excluded:

awk '
  ! buffer_on && /Start/ {
    buffer_on=1
    buffer=ors=""
  }
   buffer_on && /End/ {
    if (buffer~/12345/) print buffer
    buffer_on=0
  }
  buffer_on {
    buffer=buffer ors $0
    ors=ORS
  }
' file

Start and End both excluded:

awk '
   buffer_on && /End/ {
    if (buffer~/12345/) print buffer
    buffer_on=0
  }
  buffer_on {
    buffer=buffer ors $0
    ors=ORS
  }
  ! buffer_on && /Start/ {
    buffer_on=1
    buffer=ors=""
  }
' file

Dear Scrutinizer and MadeInGermany ,

Both solutions are working fine for me. Thank you so much.

Thanks rbatte1,

I have tried your approach as well. It works fine, but the issue is that sed -n "$start_line,$end_line"p $filename is not working.
Using variables inside sed is not working: sed -n "3,15"p $filename works fine, but sed -n "$start_line,$end_line"p does not. Is there any alternative solution?
As per your suggestion, I have tried this:


# Input in sample.gmf looks like this (without quotes): "DOCSTART_2 |" for start
# and "DOCEND |" for end.
# The third matching pattern is "343".
# After grep -n the output looks like this: "1:DOCSTART_2 |" or "520:DOCEND |"


grep -n "DOCSTART_2" /home/testing/sagar/sample.GMF | awk -F ":" '{print $1}'  >  cat /home/testing/sagar/DOCSTART_2    ## start line numbers for entire file 

grep -n "DOCEND" /home/testing/sagar/sample.GMF | awk -F ":" '{print $1}'  >  cat /home/testing/sagar/DOCEND   ## end line numbers for entire file

input=`grep -n "12345" /home/testing/sagar/sample.GMF | awk -F ":" '{print $1}'`     # matching pattern (343)
  
> /home/clarity/sagar/less_DOCSTART_2
> /home/clarity/sagar/great_DOCEND

for file in `cat /home/testing/sagar/DOCSTART_2`
do
a=`echo $file`
if [ $a -lt $input ]
then
echo $a >> /home/clarity/sagar/less_DOCSTART_2
else
echo hi >> /dev/null
fi
done
 DOCSTART=`sort -n  /home/clarity/sagar/less_DOCSTART_2 | tail -1`  ## greatest start number


 for file1 in `cat /home/clarity/sagar/DOCEND`
do
b=`echo $file1 | awk -F ":" '{print $1}'`
if [ $b -gt $input ]
then
echo $a >> /home/clarity/sagar/great_DOCEND
else
echo hi >> /dev/null
fi
done
DOCEND=`sort -n  /home/clarity/sagar/great_DOCEND | head -1`   ## lowest end number
cat /home/testing/sagar/sample.GMF | sed -n "$DOCSTART,$DOCEND"'p  > /home/testing/sagar/sample.GMF_new   ##### not working

    

Any suggestions, or any changes in approach?
Thanks in advance.
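A few things in the script above look like they could cause the failure. In `> cat /home/...`, the output goes to a file named cat and the path becomes a file operand for awk. The intermediate files are written under /home/testing but read under /home/clarity. The second loop echoes $a instead of $b. The final sed has a stray single quote before p. The following is only a sketch of the same idea without temporary files; the function name `print_doc` and its variable names are made up for illustration, and the patterns are the ones quoted in the post.

```shell
# print_doc FILE KEY  (hypothetical helper name)
# Prints the DOCSTART_2 ... DOCEND block around the first line matching KEY.
print_doc() {
    gmf=$1 key=$2

    # First line number matching the key; head guards against several hits,
    # which would otherwise break the numeric tests below.
    input=$(grep -n -- "$key" "$gmf" | head -n 1 | cut -d: -f1)
    [ -n "$input" ] || return 1

    DOCSTART=   # greatest start line number at or before the match
    for n in $(grep -n 'DOCSTART_2' "$gmf" | cut -d: -f1); do
        if [ "$n" -le "$input" ]; then DOCSTART=$n; fi
    done

    DOCEND=     # lowest end line number at or after the match
    for n in $(grep -n 'DOCEND' "$gmf" | cut -d: -f1); do
        if [ "$n" -ge "$input" ]; then DOCEND=$n; break; fi
    done
    [ -n "$DOCSTART" ] && [ -n "$DOCEND" ] || return 1

    # Both variables sit inside one pair of double quotes, so the shell
    # expands them and sed receives a plain address such as 4,6p.
    sed -n "${DOCSTART},${DOCEND}p" "$gmf"
}
```

It could be called as print_doc /home/testing/sagar/sample.GMF 343 > /home/testing/sagar/sample.GMF_new. Because it still runs grep three times plus sed, the single-pass awk solutions earlier in the thread remain the better fit for a large file.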