Hi all
I've been working on a bash script that parses debug/trace files and extracts all lines related to a search string. So far it works pretty well, but one requirement is still open.
What I want to do:
1) parse through a file and identify all packet numbers (PXXX:) that match my search, hereafter called "interesting packets"
2) parse the same file again, this time searching for packets that relate to the packets identified in step 1)
See note around P3712451 in the example below!
3) what I would also like to get are related log messages that may appear just underneath an interesting packet. Any other log message should be ignored.
4) output all log file lines that somehow relate to the searched string into another file.
Example trace file (simplified):
12/14/2009 21:16:03: P3712446: Packet received from 10.10.10.1
12/14/2009 21:16:03: P3712446: Trace of Accounting-Request packet
12/14/2009 21:16:03: P3712446: identifier = 33
12/14/2009 21:16:03: P3712446: length = 435
12/14/2009 21:16:03: P3712446: NAS-Port = 1
12/14/2009 21:16:03: P3712446: Service-Type = Framed
12/14/2009 21:16:03: P3712446: Framed-Protocol = PPP
12/14/2009 21:16:03: P3712446: NAS-Port-Type = Virtual
12/14/2009 21:16:03: P3712446: User-Name = testuser
12/14/2009 21:16:03: P3712446-2: Creating proxy request P3712451 to send to RemoteServer rsAAA1 (11.11.11.11) <==== P3712451 is related to P3712446
12/14/2009 21:16:03: P3712451: Trace of Accounting-Request packet
12/14/2009 21:16:03: P3712451: identifier = 33
12/14/2009 21:16:03: P3712451: length = 435
12/14/2009 21:16:03: P3712451: NAS-Port = 1
12/14/2009 21:16:03: P3712451: Service-Type = Framed
12/14/2009 21:16:03: P3712451: Framed-Protocol = PPP
12/14/2009 21:16:03: P3712451: NAS-Port-Type = Virtual
12/14/2009 21:16:03: P3712451: User-Name = testuser
12/14/2009 21:16:04: P3712460: Packet received from 11.11.11.11
12/14/2009 21:16:04: Log: Positive response received from 11.11.11.11 <===== log message that should be captured as well
12/14/2009 21:16:04: P3712446-2: Creating response from proxy response P3712460
12/14/2009 21:16:04: P3712446-2: Sub-service REMOTEAAA accepted request
12/14/2009 21:16:04: P3712446: All sub-services accepted the request
12/14/2009 21:16:04: P3712446: Trace of Accounting-Response packet
12/14/2009 21:16:04: P3712446: identifier = 33
12/14/2009 21:16:04: P3712446: length = 20
12/14/2009 21:16:04: P3712446: Sending response to 10.10.10.1
Steps 1), 2) and 4) are already working using egrep.
Step 1)
PACKETS=$(egrep -i "$QUERYSTRING" "$TRACEFILE" | grep -v ": Log:" | sed -e "s/^[^P]*P/P/;s/:.*//" | sort | uniq | tr '\n' '|')
PACKETS=$(echo "$PACKETS" | sed -e "s/|$//")
The above fills $PACKETS with the interesting packets matching $QUERYSTRING (e.g. testuser) as a "|"-separated list such as "P3712446|P3712451"; the enclosing parentheses are added in the egrep patterns below.
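To make the shape of the list concrete, here is a small self-contained sketch of step 1 run against a few lines from the example trace. The function name extract_packets and the temporary sample file are only for illustration. It uses sort -u and paste in place of sort | uniq | tr | sed, which should produce the same list without a trailing "|".

```shell
# Sketch only: extract_packets is a hypothetical helper name.
# $1 = query string, $2 = trace file
extract_packets() {
    grep -Ei -- "$1" "$2" \
        | grep -v ': Log:' \
        | sed -e 's/^[^P]*P/P/' -e 's/:.*//' \
        | sort -u \
        | paste -sd'|' -
}

# Sample lines copied from the example trace above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
12/14/2009 21:16:03: P3712446: User-Name = testuser
12/14/2009 21:16:03: P3712451: User-Name = testuser
12/14/2009 21:16:04: P3712460: Packet received from 11.11.11.11
12/14/2009 21:16:04: Log: Positive response received from 11.11.11.11
EOF

PACKETS=$(extract_packets testuser "$sample")
echo "$PACKETS"    # prints P3712446|P3712451
rm -f "$sample"
```

Only the two packets whose lines contain the query string end up in the list; P3712460 would have to be picked up by a later pass.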
Step 2)
PACKETS=$(egrep "($PACKETS)( |:|$)" "$TRACEFILE" | grep -v ": Log:" | sed -e "s/^[^P]*P/P/;s/:.*//" | sort | uniq | tr '\n' '|')
PACKETS=$(echo "$PACKETS" | sed -e "s/|$//")
Step 4)
Finally, I write all lines belonging to the interesting packets into a new file using the following:
egrep "($PACKETS)( |:|$)" "$TRACEFILE" >> "$RESULTFILE"
I've got 2 questions now:
Q1) How can I catch Log lines like...
12/14/2009 21:16:04: Log: Positive response received from 11.11.11.11
...if it follows an interesting packet and ignore any other Log line?
I've looked at multi-line matching examples, but I haven't been able to apply them in combination with the sometimes huge list of interesting packets I've got.
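One possible direction for Q1, as a rough sketch rather than a tested solution: a single awk pass that remembers whether the previous non-Log line was kept, and prints a Log line only in that case. The function name keep_with_logs is hypothetical, and the sketch assumes the layout shown in the example (date, time, then either "Log:" or a packet ID as the third whitespace-separated field). Packet IDs are looked up in an array instead of a long alternation, and a word counts as a match only if it equals an ID after removing one trailing colon, which is slightly stricter than the egrep pattern.

```shell
# Sketch only: keep_with_logs is a hypothetical helper name.
# $1 = "|"-separated packet list, $2 = trace file
keep_with_logs() {
    awk -v list="$1" '
    BEGIN {
        # Turn the "|"-separated list into a lookup table.
        n = split(list, ids, "|")
        for (i = 1; i <= n; i++) want[ids[i]] = 1
    }
    $3 == "Log:" {
        # Keep a Log line only if the last packet line was kept.
        # prev is left unchanged, so several Log lines in a row are all kept.
        if (prev) print
        next
    }
    {
        hit = 0
        for (i = 3; i <= NF; i++) {
            tok = $i
            sub(/:$/, "", tok)          # "P3712446:" becomes "P3712446"
            if (tok in want) { hit = 1; break }
        }
        if (hit) print
        prev = hit
    }' "$2"
}

# Sample: three lines from the example trace plus two made-up lines
# (P9999999 and the "Unrelated message" Log line are invented).
sample=$(mktemp)
cat > "$sample" <<'EOF'
12/14/2009 21:16:04: P3712460: Packet received from 11.11.11.11
12/14/2009 21:16:04: Log: Positive response received from 11.11.11.11
12/14/2009 21:16:04: P9999999: Packet received from 12.12.12.12
12/14/2009 21:16:04: Log: Unrelated message
12/14/2009 21:16:04: P3712446-2: Creating response from proxy response P3712460
EOF

# Keeps the first, second and fifth sample lines.
keep_with_logs 'P3712446|P3712460' "$sample"
rm -f "$sample"
```

Whether a Log line should also be kept when it follows another kept Log line is a choice; the sketch keeps it.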
Q2) Any obvious and easy way to simplify what I've done already?
I started out parsing the file line by line, but that was far too time-consuming (1h+). The above still takes 2-3 minutes for a 130MB file, which is OK, but maybe someone has something even faster in mind.
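One small thing that might be worth trying for Q2, offered as an unmeasured sketch: running the final grep in the C locale, which can be noticeably faster than a multibyte locale for large files. The function name write_matches is hypothetical, and grep -E is used as the standard spelling of egrep. No timing is claimed here for the 130MB file.

```shell
# Sketch only: write_matches is a hypothetical helper name.
# $1 = "|"-separated packet list, $2 = trace file, $3 = result file
write_matches() {
    # LC_ALL=C makes grep treat the input as plain bytes.
    LC_ALL=C grep -E "($1)( |:|$)" "$2" >> "$3"
}

# Sample lines copied from the example trace above.
sample=$(mktemp)
result=$(mktemp)
cat > "$sample" <<'EOF'
12/14/2009 21:16:03: P3712446: User-Name = testuser
12/14/2009 21:16:03: P3712451: User-Name = testuser
12/14/2009 21:16:04: P3712460: Packet received from 11.11.11.11
EOF

write_matches 'P3712446|P3712451' "$sample" "$result"
cat "$result"      # the two testuser lines
rm -f "$sample" "$result"
```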
Many thanks,
René