Extract & Manipulate continuous data stream -- tcpdump

Hello;

I have a rather tricky problem to solve (tricky to me, anyway).

I am running the following tcpdump one-liner:

 tcpdump -i T3501 -A ether host 00:1e:49:29:fc:c9 or ether host 00:1b:2b:86:ec:1b or ether host 00:21:1c:98:a4:08 and net 149.83.6.0/24 | grep --line-buffered -B 20 IBM-32 | awk '/IBM-32/ || ( /IP/ && /Flags \[\.\]\, ack/ )'

The output looks as follows:

11:55:33.824133 IP 167.26.199.44.playsta2-app > 149.83.6.1.8023: Flags [.], ack 2792, win 63227, length 0
11:55:33.825247 IP 167.26.227.168.4693 > 149.83.6.1.8023: Flags [.], ack 5307, win 64512, length 0
11:55:33.826140 IP 168.108.221.122.57406 > 149.83.6.64.8023: Flags [.], ack 1274289, win 513, length 0
11:55:33.826355 IP 168.108.220.104.50909 > 149.83.6.64.8023: Flags [.], ack 1531837, win 256, length 0
11:55:33.829913 IP 199.198.231.57.58935 > 149.83.6.64.8023: Flags [.], ack 111302, win 64512, length 0
E@.D.!@.t........S.@.$.W79p....dP.........(..IBM-3278-2-E.CC218085..
11:55:33.845867 IP 199.198.231.57.34945 > 149.83.6.128.8023: Flags [.], ack 1064, win 63449, length 0
E..D....9....S.@.....W.$...d79p,P.........(..IBM-3278-2-E.CC218085..
11:55:53.395263 IP 199.198.231.57.10464 > 149.83.6.64.8023: Flags [.], ack 16186, win 64512, length 0
11:55:53.400435 IP 168.108.220.104.50909 > 149.83.6.64.8023: Flags [.], ack 2096906, win 256, length 0
E@.D..@.t..R.....S.....Wi!.4.$.8P.........(..IBM-3278-2-E.CC210147..
11:55:53.417919 IP 167.26.104.157.stat-scanner > 149.83.6.64.8023: Flags [.], ack 15970, win 64512, length 0
11:55:53.418914 IP 168.108.221.122.57407 > 149.83.6.64.8023: Flags [.], ack 40988, win 509, length 0
11:55:53.425586 IP 199.198.231.57.10498 > 149.83.6.64.8023: Flags [.], ack 274360, win 63452, length 0
11:55:53.431282 IP 168.108.221.122.57406 > 149.83.6.64.8023: Flags [.], ack 1739414, win 513, length 0
E..DhC..9..e.S.......W...$.8i!.PP.........(..IBM-3278-2-E.CC210147..

I need to extract, for each unique IBM-3278 expression, the source addresses on the lines preceding it (the ip_addr before the ">" sign), so that I hopefully end up with, e.g.:

IBM-3278-2-E.CC210147,  IP 168.108.221.122.57407, IP 199.198.231.57.10498, IP 168.108.221.122.57406

So I tried reversing the output with "tac" at the end of the pipeline, but nothing happened.

Then I thought of using csplit with IBM as the delimiter, but it's complaining:

 tcpdump -i T3501 -A ether host 00:1e:49:29:fc:c9 or ether host 00:1b:2b:86:ec:1b or ether host 00:21:1c:98:a4:08 and net 149.83.6.0/24 | grep --line-buffered -B 20 IBM-32 | awk '/IBM-32/ || ( /IP/ && /Flags \[\.\]\, ack/ )'|xargs csplit /IBM/
xargs: unmatched single quote; by default quotes are special to xargs unless you use the -0 option
csplit: cannot open `/IBM/' for reading: No such file or directory

The idea was to process the split files using a cron job or a daemon.

Any ideas are appreciated. Thank you

Let's just save the source from each "host > host" line to a variable, and print it when we find IBM-32.
I'm assuming we can find those lines because they start with a timestamp rather than whitespace.

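tcpdump | awk '
  # timestamped header line: remember the most recent source address
  /^[0-9]*:[0-9]*/ {src=$3}
  # payload line: pull out the terminal ID and prepend the saved source
  match($0, /IBM-32[^.]*\.[^.]*/) {
    str=substr($0,RSTART,RLENGTH)
    ibm[str]=src FS ibm[str]
  }
  END {
    for (i in ibm) print i, ibm[i]
  }
'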

This uses END, so you don't get any output until the input is finished, which wouldn't really work for real-time output from tcpdump.
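For output while the capture is still running, one option is to print as soon as a terminal ID shows up instead of waiting for END. The following is only a sketch: the function name term_sources is made up for illustration, and it assumes every header line starts with an hh:mm:ss timestamp and that the sources of interest are the header lines seen since the previous terminal ID.

```shell
# Hypothetical helper: reads "tcpdump -A" style text on stdin.
term_sources() {
  awk '
    # header line: append its source address (field 3) to the pending list
    /^[0-9]+:[0-9]+:[0-9]+/ { srcs = srcs (srcs == "" ? "" : ", ") $3; next }
    # payload line with a terminal ID: print it with the pending sources
    match($0, /IBM-32[^.]*\.[^.]*/) {
      print substr($0, RSTART, RLENGTH) ", " srcs
      fflush()      # push the line out now rather than when the buffer fills
      srcs = ""     # start a fresh list for the next terminal ID
    }
  '
}

# Possible usage (tcpdump -l makes its stdout line buffered):
#   tcpdump -l -i T3501 -A ... | term_sources
```

Unlike the END version, this prints one line per occurrence rather than one line per unique terminal ID, so repeated IDs show up more than once.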

$ cat tcpdump | awk '/^[0-9]*:[0-9]*/ {src=$3} match($0, /BM-32[^.]*\.[^.]*/) { str=substr($0,RSTART,RLENGTH); ibm[str]=src FS ibm[str] } END { for (i in ibm)  { print i, ibm[i] } }'
BM-3278-2-E.CC218085 199.198.231.57.34945 199.198.231.57.58935
BM-3278-2-E.CC210147 168.108.221.122.57406 168.108.220.104.50909

some of them were missing the I in your output here

Thank you very much but I am not getting any output..

 tcpdump | awk '/^[0-9]*:[0-9]*/ {src=$3} match($0, /IBM-32[^.]*\.[^.]*/) { str=substr($0,RSTART,RLENGTH);\
>  ibm[str]=src FS ibm[str] } END { for (i in ibm)  { print i, ibm[i] } }'
tcpdump: WARNING: Shared: no IPv4 address assigned
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on Shared, link-type EN10MB (Ethernet), capture size 65535 bytes

Just as a matter of interest, there are 4 occurrences of "BM-3278", of which 2 are "IBM-3278".

Do we take it that the other two "BM-3278" parts are not required?
OR...
Are all 4 required?

Sorry, that was my typo. All should be "IBM-3278".

Thanks

---------- Post updated 04-15-14 at 10:38 AM ---------- Previous update was 04-14-14 at 07:04 PM ----------

OK, so I made some headway. I added a bit more awk filtering:

tcpdump -i T3501 -A ether host 00:1e:49:29:fc:c9 or ether host 00:1b:2b:86:ec:1b or ether host 00:21:1c:98:a4:08 and net 149.83.6.0/24 | grep --line-buffered -B 20 IBM-32 | awk '/IBM-32/ || ( /IP/ && /Flags \[\.\]\, ack/ )' | awk '/IBM/ {print $0} /IP/ {print "Source-IP= "$3}'

and the output is like:

Source-IP= 199.198.231.57.59033
E@.D..@.p.g......S.@.G.W$.....W{P....^....(..IBM-3278-2-E.CDC13117..
Source-IP= 168.108.167.244.50411
Source-IP= 168.108.167.244.50413
E..DL0..9....S.@.....W.G..W{$...P....9....(..IBM-3278-2-E.CDC13117..
Source-IP= 199.198.231.57.34947
Source-IP= 168.108.167.244.50411

So my next task is how to filter further so that I end up with, e.g.:

Source-IP= 199.198.231.57.59033
IBM-3278-2-E.CDC13117..
Source-IP= 168.108.167.244.50411
Source-IP= 168.108.167.244.50413
IBM-3278-2-E.CDC13117..
Source-IP= 199.198.231.57.34947
Source-IP= 168.108.167.244.50411

Thnx
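One way to get that shape is to cut each payload line down to the part starting at "IBM-" and pass everything else through untouched. This is a sketch only: trim_term_id is a made-up name, and it assumes the terminal ID always runs to the end of the payload line, as it does in the samples above.

```shell
# Hypothetical helper: reads the "Source-IP=" / payload output on stdin.
trim_term_id() {
  awk '
    # payload line: keep only the text from "IBM-" to the end of the line
    match($0, /IBM-.*/) { print substr($0, RSTART); fflush(); next }
    # anything else (the Source-IP= lines) is printed unchanged
    { print; fflush() }
  '
}

# Possible usage: append "| trim_term_id" to the pipeline above.
```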

Finally solved it:

tcpdump -i T3501 -A ether host 00:1e:49:29:fc:c9 or ether host 00:1b:2b:86:ec:1b or ether host 00:21:1c:98:a4:08 and net 149.83.6.0/24 \
| grep --line-buffered -B 20 IBM-32 | awk '/IBM-32/ || ( /IP/ && /Flags \[\.\]\, ack/ )' \
| awk '/IBM/{ split($0,A,"IBM");   system("date");  print "Term-ID= IBM-"A[2] }   /IP/{ print "Source-IP= "$3 }'
Wed Apr 16 11:19:58 EDT 2014
Term-ID= IBM--3278-2-E.CC214070..
Source-IP= 199.198.231.57.12596
Source-IP= 168.108.167.244.60976
Source-IP= 199.198.231.57.59263
Wed Apr 16 11:19:58 EDT 2014
Term-ID= IBM--3278-2-E.CC214070..
Source-IP= 168.108.220.104.57107
Source-IP= 168.108.167.244.60976
Source-IP= 168.108.221.122.49326
Source-IP= 167.26.185.245.krb5gatekeeper
Wed Apr 16 11:19:58 EDT 2014
Term-ID= IBM--3278-2-E.CDC06151..
Wed Apr 16 11:19:58 EDT 2014
Term-ID= IBM--3278-2-E.CDC06151..
Source-IP= 168.108.167.244.60976
Wed Apr 16 11:19:58 EDT 2014
Term-ID= IBM--3278-2-E.cdc18155..
Source-IP= 168.108.220.104.57107
Source-IP= 168.108.167.244.60976
Source-IP= 168.108.167.244.60980
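The doubled hyphen in "IBM--3278" comes from splitting on "IBM": the hyphen stays at the start of A[2], and the print then adds a second one. A possible variant splits on "IBM-" instead. This is a sketch, not a tested replacement for the command above: label_lines is a made-up name, the date call is left out so the output is deterministic, and the added "next" keeps a payload line that happens to contain "IP" from also being printed as a source.

```shell
# Hypothetical helper: reads the filtered tcpdump lines on stdin.
label_lines() {
  awk '
    # payload line: A[2] is the text after the first "IBM-"
    /IBM/ { split($0, A, "IBM-"); print "Term-ID= IBM-" A[2]; next }
    # header line: field 3 is the source address
    /IP/  { print "Source-IP= " $3 }
  '
}

# Possible usage: use "| label_lines" in place of the last awk stage.
```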