Grep three consecutive lines if each line contains a certain string

Say we have:

2914 | REQUEST | whatever
2914 | RESPONSE | whatever
2914 | SUCCESS | whatever
2985 | RESPONSE | whatever
2986 | REQUEST | whatever
2990 | REQUEST | whatever
2985 | RESPONSE | whatever
2996 | REQUEST | whatever
2010 | SUCCESS | whatever
2013 | REQUEST | whatever
2013 | RESPONSE | whatever
2013 | SUCCESS | whatever
2076 | REQUEST | whatever
2078 | RESPONSE | whatever

It should print:

2914 | REQUEST | whatever
2914 | RESPONSE | whatever
2914 | SUCCESS | whatever
2013 | REQUEST | whatever
2013 | RESPONSE | whatever
2013 | SUCCESS | whatever

One of the threads here already uses the command below for two consecutive lines. I want a similar command for three consecutive lines:

awk -F"|" '$2 ~ "REQUEST" {s=$0;f=1;next} f && $2 ~ "RESPONSE" {print s RS $0;f=0}' file

Also, please explain how the command works. I have very basic knowledge about awk.

That is a very vague description. Please be way more detailed and specific. Why should we select 2914 | REQUEST and not 2986 | REQUEST ? Your code snippet would print a "REQUEST" line and the very next "RESPONSE" line coming up, independent of contents.

Sonu Pandy , right ? :smiley:

Something like this (untested)

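awk -F"|" '
# f counts how many lines of the sequence have matched so far; s holds them
f==2 && $2~"SUCCESS"  {print s RS $0; f=0; next}
f==1 && $2~"RESPONSE" {s=s RS $0; f=2; next}
$2~"REQUEST"          {s=$0; f=1; next}
{f=0}   # any other line breaks the sequence
' file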
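
For readers who find the awk terse, the same state machine can be written out in Python. This is only a sketch: the function name `consecutive_triples` is made up for illustration, and it assumes the action is the second `|`-separated field and tests it for an exact match (the awk uses a substring match). `state` plays the role of `f` and `buffer` the role of `s`. In this sketch any line that breaks the sequence resets the state, so only strictly consecutive lines are reported.

```python
def consecutive_triples(lines):
    """Return REQUEST/RESPONSE/SUCCESS runs found on three consecutive lines."""
    found = []
    buffer = []   # lines collected so far (the awk variable s)
    state = 0     # 0 = nothing, 1 = REQUEST seen, 2 = RESPONSE seen (the awk variable f)
    for line in lines:
        fields = line.split("|")
        action = fields[1].strip() if len(fields) > 1 else ""
        if state == 2 and action == "SUCCESS":
            found.extend(buffer + [line])     # complete run: keep all three lines
            buffer, state = [], 0
        elif state == 1 and action == "RESPONSE":
            buffer.append(line)
            state = 2
        elif action == "REQUEST":
            buffer, state = [line], 1         # (re)start a candidate run
        else:
            buffer, state = [], 0             # sequence broken
    return found


sample = [
    "2914 | REQUEST | whatever",
    "2914 | RESPONSE | whatever",
    "2914 | SUCCESS | whatever",
    "2985 | RESPONSE | whatever",
    "2986 | REQUEST | whatever",
    "2990 | REQUEST | whatever",
    "2985 | RESPONSE | whatever",
    "2996 | REQUEST | whatever",
    "2010 | SUCCESS | whatever",
    "2013 | REQUEST | whatever",
    "2013 | RESPONSE | whatever",
    "2013 | SUCCESS | whatever",
]
print("\n".join(consecutive_triples(sample)))
```

On the sample data this keeps only the 2914 and 2013 groups, because every other REQUEST is followed by something other than RESPONSE and then SUCCESS.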

I want to print the lines only when REQUEST, RESPONSE and SUCCESS appear in that order on consecutive lines, and skip everything else.

Like the pattern below, irrespective of the number at the beginning:

2914 | REQUEST | whatever
2914 | RESPONSE | whatever
2914 | SUCCESS | whatever

Hello Saumitra,

If I understood your requirement correctly, the following may help. It prints the REQUEST, RESPONSE and SUCCESS lines, in that order, only for those column 1 values that have all three. If you have some other requirement, please let us know with sample input, all conditions, and the expected output.

awk -F' +| +' 'FNR==NR{A[$1 OFS $3]=$0;next} (($1 OFS "REQUEST") in A){O=A[$1 OFS "REQUEST"];B[$1]++} (($1 OFS "RESPONSE") in A){O=O ORS A[$1 OFS "RESPONSE"];;B[$1]++} (($1 OFS "SUCCESS") in A){O=O ORS A[$1 OFS "SUCCESS"];;B[$1]++}{if(B[$1]==3 && O !~ /^$/){print O;O="";}}'  Input_file Input_file
 

Output will be as follows.

2914 | REQUEST | whatever
2914 | RESPONSE | whatever
2914 | SUCCESS | whatever
2013 | REQUEST | whatever
2013 | RESPONSE | whatever
2013 | SUCCESS | whatever
 

EDIT: Adding a non-one-liner form of the solution here too.

awk -F' +| +' 'FNR==NR {
    A[$1 OFS $3]=$0
    next
}
(($1 OFS "REQUEST") in A) {
    O=A[$1 OFS "REQUEST"]
    B[$1]++
}
(($1 OFS "RESPONSE") in A) {
    O=O ORS A[$1 OFS "RESPONSE"]
    B[$1]++
}
(($1 OFS "SUCCESS") in A) {
    O=O ORS A[$1 OFS "SUCCESS"]
    B[$1]++
}
{
    if (B[$1]==3 && O !~ /^$/) {
        print O
        O=""
    }
}' Input_file Input_file
 

Thanks,
R. Singh

Thanks buddy, this worked. :slight_smile:

---------- Post updated at 07:17 PM ---------- Previous update was at 06:03 PM ----------

Hey Ravindra,

Thanks for your solution. The code looks a bit complicated to me. It works, but it skips some lines that it should not:

[abc]$ cat text
2985 | RESPONSE | whatever
2990 | REQUEST | whatever
2985 | RESPONSE | whatever
2996 | REQUEST | whatever
2010 | SUCCESS | whatever
asdgahna
2013 | REQUEST | whatever
2013 | RESPONSE | whatever
2013 | SUCCESS | whatever
afbb
2076 | REQUEST | whatever
2078 | RESPONSE | whatever
2452 | SUCCESS
[abc]$ ./script.sh
2013 | REQUEST | whatever
2013 | RESPONSE | whatever
2013 | SUCCESS | whatever

It didn't print:

2076 | REQUEST | whatever
2078 | RESPONSE | whatever
2452 | SUCCESS

I guess it's my mistake: I didn't explain my query completely.
Anyway, thanks a lot.
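
Since the number in the first column turns out not to matter, one way to sketch the requirement is a three-line sliding window that looks only at the second field. This is a Python sketch rather than a drop-in replacement for the awk; the function names `action_of` and `triples_ignoring_key` and the exact-match test on the action field are assumptions made for illustration.

```python
def action_of(line):
    """Second '|'-separated field with surrounding blanks removed ('' if absent)."""
    fields = line.split("|")
    return fields[1].strip() if len(fields) > 1 else ""


def triples_ignoring_key(lines):
    """Return every run of three consecutive REQUEST, RESPONSE, SUCCESS lines."""
    wanted = ["REQUEST", "RESPONSE", "SUCCESS"]
    found = []
    for i in range(len(lines) - 2):
        window = lines[i:i + 3]
        # compare the actions of the three lines with the wanted order
        if [action_of(entry) for entry in window] == wanted:
            found.extend(window)
    return found


text = """2985 | RESPONSE | whatever
2990 | REQUEST | whatever
2985 | RESPONSE | whatever
2996 | REQUEST | whatever
2010 | SUCCESS | whatever
asdgahna
2013 | REQUEST | whatever
2013 | RESPONSE | whatever
2013 | SUCCESS | whatever
afbb
2076 | REQUEST | whatever
2078 | RESPONSE | whatever
2452 | SUCCESS""".splitlines()
print("\n".join(triples_ignoring_key(text)))
```

With the test file above, this reports both the 2013 group and the 2076/2078/2452 group, because the first column is never compared.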

If you are sure that none of REQUEST/RESPONSE/SUCCESS can appear more than once for a given key in column 1, then the awk below should work for you.

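awk 'BEGIN { FS = " *[|] *" }      # split on "|" with optional surrounding spaces
{
    # collect every line under its column 1 key and count the lines per key
    arr[$1] = (($1 in arr) ? arr[$1] "\n" : "") $0
    brr[$1]++
}
END {
    # print the keys seen exactly three times (the order of "for in" is unspecified)
    for (i in arr)
        if (brr[i] == 3)
            print arr[i]
}' a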

Python:

import re

SEP = re.compile(r"\s*\|\s*")   # a literal "|" with optional surrounding whitespace
lines = {}      # key -> all lines seen for that key, concatenated
actions = {}    # key -> set of actions seen for that key
with open("a.txt") as file:
    for line in file:
        items = SEP.split(line.strip())
        if len(items) < 2:
            continue    # skip lines that are not in "key | ACTION" form
        key, action = items[0], items[1]
        lines[key] = lines.get(key, "") + line
        actions.setdefault(key, set()).add(action)
for key in lines:
    # print a key only if all three actions were seen for it
    if {"REQUEST", "RESPONSE", "SUCCESS"} <= actions[key]:
        print(lines[key], end="")