How to grep, then scan the lines below the grep pattern

umarsatti · October 3, 2013, 2:31pm

Hello Colleagues,

I have a file that looks like this:

6-12731913-12731913
9230760143480
410018547148230
20131002193434+0500
20131002193434+0500
;20131002T161031000-10.50.241.21-21912131-1419034760, ver: 0
20131009
92220056296730
CC0P
abc
Core_Context_R1A
SMS
6-12726796-12726796
9230349094420
410016350040971
20131002193434+0500
20131002193434+0500
;20131002T161031000-10.50.241.21-21912131-1419034760, ver: 0
20131008
92775313215350
92DDD060
CC0P
abc
Core_Context_R1A
Voice
6-12725266-12725266
9230172005830
410018898077989
20131002193434+0500
20131002193434+0500
920131002T164612000-10.50.241.21-17667023-92612900, ver: 0
20131009
0P(h3
92780065437090
0P(h3
CC0P
abc
Core_Context_R1A
GPRS
6-12726796-12726796
9230349094420
410016350040971
20131002193439+0500
20131002193439+0500
;20131002T161031000-10.50.241.21-21912131-1419034760, ver: 0
20131008
92775313215350
92DDD060
CC0P
abc
Core_Context_R1A
Voice

Now I want to grep for "20131002". When the pattern is found, the script should scan the lines below it, and if the word "Voice" comes up after the "20131002" match, it should print that occurrence. It should do this for all such occurrences in the file. The script output would look like this:

20131002193434+0500 Voice
20131002193439+0500 Voice

Can anyone please help me implement this in a shell script?

Regards,
Umar

joeyg · October 3, 2013, 2:41pm

What about:

cat -n myfile | grep "20131002" | cut -f1

tells you the line numbers, and so does

cat -n myfile | grep "Voice" | cut -f1

So you could assign each of these to a variable and then compare the two variables?
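
A rough sketch of that idea, in case it helps. The function name pair_by_line_number and the small sample file are made up for illustration, and it uses grep -n rather than cat -n to get the line numbers. It assumes the date always starts its line and that "Voice" is a whole line. Note that it pairs each Voice with the nearest date line above it, even if that date belongs to an earlier record, so it only fits data where every record carries a date.

```shell
# pair_by_line_number FILE DATE_PREFIX WORD   (hypothetical helper)
# Prints "<date line> <WORD>" for each WORD line, using the closest
# DATE_PREFIX line above it.
pair_by_line_number() {
    file=$1 prefix=$2 word=$3
    # grep -n prints "lineno:text"; keep only the line numbers
    dates=$(grep -n "^$prefix" "$file" | cut -d: -f1)
    words=$(grep -n "^$word\$" "$file" | cut -d: -f1)
    for w in $words; do
        best=
        for d in $dates; do
            # remember the last date line that sits above this WORD line
            if [ "$d" -lt "$w" ]; then best=$d; fi
        done
        if [ -n "$best" ]; then
            printf '%s %s\n' "$(sed -n "${best}p" "$file")" "$word"
        fi
    done
}

# Made-up sample: one SMS record followed by one Voice record
sample=$(mktemp)
printf '%s\n' 20131002193434+0500 abc SMS 20131002193439+0500 abc Voice > "$sample"
pair_by_line_number "$sample" 20131002 Voice
rm -f "$sample"
```

On that sample it should print the second timestamp followed by Voice. The nested loop is slow on big files; it is only meant to show the line-number comparison.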

Corona688 · October 3, 2013, 2:43pm

If the very first line of the file has 20131002, and the very last line of the file has Voice, with no other hits in between -- is that a valid match? If not, why not?

In short, what tells it where to stop?

umarsatti · October 3, 2013, 2:46pm

No, the first line is not "20131002" and the last line is not "Voice". The script first greps for "20131002" and then scans the lines below it. As soon as it comes across the first "Voice", it prints the pair, then goes on looking for the next "20131002" and scans the lines below that for "Voice". In the end it prints all such pairs found.

joeyg · October 3, 2013, 2:52pm

So, there could be more than one pair of data?
20131002 and Voice

umarsatti · October 3, 2013, 3:04pm

Yes, sorry for the confusion, let me put it more simply.

Example:

20130210132030
A
B
C
Banana
20130210142320
D
E
F
Mango
20130210154634
G
H
I
Apple
20130210163415
J
K
L
Mango
20130210171829
M
N
O
Apple

So my script first greps for the pattern "20130210". Once it is found, the script should scan the lines below it for Apple, and if Apple is found it should print. The output would look like this:
"20130210154634 Apple"
"20130210171829 Apple"

Each Apple is printed only with its corresponding "20130210" line, which is unique.

But if the pattern "20130210" is found and the scan of the lines below it hits "Banana" or "Mango", the script should give up on that block and look for the next "20130210". Whenever it finds the first "Apple" below a match, it should print the pair and move on to the next such pattern.

I hope it's clear now?

BR/Umar

CarloM · October 3, 2013, 5:36pm

If your awk supports multi-character record separators, you could do something like:

$ awk '/Apple/ {printf RS $1 OFS "Apple\n"}' RS="20130210" fruit
20130210154634 Apple
20130210171829 Apple

jethrow · October 3, 2013, 9:31pm

while read line || [ -n "$line" ]; do
	case $line in
		\;*)		continue;;
		*20131002*)	stored=$line;;
		*Voice*)	echo "$stored $line";;
	esac
done < file

Jotne · October 4, 2013, 7:33am

Another awk version

awk '/20130210/ {p=$0} /Apple/ {print p,$0}' file
20130210154634 Apple
20130210171829 Apple

Akshay_Hegde · October 4, 2013, 7:40am

One more approach

$ cat t.txt
20130210132030
A
B
C
Banana
20130210142320
D
E
F
Mango
20130210154634
G
H
I
Apple
20130210163415
J
K
L
Mango
20130210171829
M
N
O
Apple

awk '{p=$0;getline;getline;getline;getline;if($0~/Apple/)printf p"\t"$0"\n";p=$0}' t.txt
20130210154634    Apple
20130210171829    Apple

disedorgue · October 4, 2013, 8:51am

Hi,
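Just for fun (works under Linux, needs GNU uniq and tac). The tac on each side of uniq keeps the last timestamp before each Apple instead of the first:

$ grep '20130210\|Apple' file | tac | uniq -w8 | tac | paste -d' ' - - | grep Apple
20130210154634 Apple
20130210171829 Apple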

Regards.

RudiC · October 4, 2013, 12:59pm

Coming back to the original request, this might fulfill it:

awk '/^20131002/ {TMP=$0} /^Voice/ {print TMP" "$0}' file
20131002193434+0500 Voice
20131002193439+0500 Voice

disedorgue · October 4, 2013, 1:49pm

If the date is not found but Voice is, it still prints Voice...

Regards.

RudiC · October 4, 2013, 1:56pm

Rats, you're right! Try (untested):

awk '/^20131002/ {TMP=$0} /^Voice/ && TMP {print TMP" "$0; TMP=""}' file
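
A quick way to check it, as a sketch only: the sample below is made up (a stray Voice with no date before it, one matching record, then a record with a different date) and simply runs both versions side by side.

```shell
sample=$(mktemp)
# Made-up input: stray Voice, one 20131002 record, one record for another date
printf '%s\n' Voice 20131002193434+0500 abc Voice 20131003101010+0500 abc Voice > "$sample"

# First version: TMP is never cleared
first=$(awk '/^20131002/ {TMP=$0} /^Voice/ {print TMP" "$0}' "$sample")

# Second version: print only when a date is pending, then clear it
second=$(awk '/^20131002/ {TMP=$0} /^Voice/ && TMP {print TMP" "$0; TMP=""}' "$sample")

printf 'first version:\n%s\n' "$first"
printf 'second version:\n%s\n' "$second"
rm -f "$sample"
```

On this input the first version should print three lines (a bare " Voice", the real pair, and the same date again for the last record), while the second should print only the one real pair.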

disedorgue · October 4, 2013, 2:30pm

It works.
My sed contribution:

sed -n '/^20131002/{x;b;};/^Voice/{H;x;/^20131002/s/\n/ /p;}' file
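
The same script spread over several lines with comments, for anyone not used to the hold space. The sample file is made up (it includes a second Voice with no date of its own), and this is just my reading of the one-liner laid out step by step.

```shell
sample=$(mktemp)
# Made-up input: date, filler, Voice, stray Voice, date, Voice
printf '%s\n' 20131002193434+0500 abc Voice Voice 20131002193439+0500 Voice > "$sample"

result=$(sed -n '
# Date line: stash it in the hold space and go straight to the next line
/^20131002/ {
    x
    b
}
# Voice line: append it to the held text, then swap that into the pattern space
/^Voice/ {
    H
    x
    # Print only if the held text really started with a date;
    # the hold space now contains just "Voice", so a stray Voice prints nothing
    /^20131002/ s/\n/ /p
}
' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

On this sample it should print both timestamps, each followed by Voice, and skip the stray Voice in the middle.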

Regards.