Pulling out entries from file based on IP - awk

i have several lines in a file that look like this:

2013-03-23 19:02:33,122 DiscoverManager [Discover-15] WeblogicApplication-10.111.112.119-7400 FAILURE 

i'm monitoring this file for strings that contain FAILURE. but i've been getting a lot of alerts that just aren't actionable. so i need another way around this.

each line has an IP and a port number (10.111.112.119-7400 in the sample line above). i'm wondering, is there a way to do something like this:

  • if 10 or more entries for any specific IP are found in the log, AND each of the entries is for a different port number, then alert?

my problem is, there isn't a list of known IPs or ports. any IP can be thrown in the file. so i'm curious if what i'm thinking is possible? i'm guessing awk can be used for this?

i was using a variation of this command to get a count:

awk '/FAILURE/ && /FAILURE/ {++c}c>=10 -1{o=$0 RS $0 RS $0; print o; c=0}' file

and i was using this to show me the actual offending lines from the file:

awk '/FAILURE/ && /FAILURE/' file

You could do something like:

awk ' /FAILURE/ {
        match( $0, /[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+\-[0-9]+/ )
        IP = sprintf ("%s", substr($0, RSTART, RLENGTH))
        if (!(IP in A ))
        {
                A[IP]++
                ++c
        }
        if ( c == 10 )
        {
                print c " unique IPs found in FAILURE"
                for ( i in A )
                        print i
                exit 1
        }

} ' file

May I ask what you expect from testing against the same condition twice?

Try:

awk -F- '/FAILURE/{if(!seen[$5,$6]){cnt[$5]++; seen[$5,$6]++; }  if(cnt[$5]>=10) print $5,$6}' logfile

Note that this assumes your log file is consistent with respect to the number of dashes.
If not, then you can match, as yoda showed, or you can pre-process, something along these lines:

awk '/FAILURE/{print $(NF-1), $NF) logfile | awk -F- '{if(!seen[$2,$3]){cnt[$2]++; seen[$2,$3]++; }  if(cnt[$2]>10) print $2,$3}' > IPs.list

And then post-process the captured IPs.list to grep for them in the orig file.
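
For the post-processing step, something like the following might do. It is only a sketch: sample.log and ips.only are made-up names, and IPs.list is written by hand here to stand in for what the pipeline above would capture (lines holding the IP, the port and the word FAILURE).

```shell
tmp=$(mktemp -d)   # scratch directory, just for this sketch

cat > "$tmp/sample.log" <<'EOF'
2013-03-23 19:02:33,122 DiscoverManager [Discover-15] WeblogicApplication-10.111.112.119-7400 FAILURE
2013-03-23 19:02:34,301 DiscoverManager [Discover-15] WeblogicApplication-10.111.112.119-7401 FAILURE
2013-03-23 19:02:35,410 DiscoverManager [Discover-16] WeblogicApplication-10.20.30.40-7400 FAILURE
EOF

# hand-written stand-in for the captured list
cat > "$tmp/IPs.list" <<'EOF'
10.111.112.119 7400 FAILURE
10.111.112.119 7401 FAILURE
EOF

# keep only the IP column, de-duplicate, and append "-" so the
# pattern ends where the address ends in the log
awk '{print $1 "-"}' "$tmp/IPs.list" | sort -u > "$tmp/ips.only"

# -F treats each pattern as a fixed string (dots are literal),
# -f reads the patterns from a file
offending=$(grep -F -f "$tmp/ips.only" "$tmp/sample.log")
printf '%s\n' "$offending"

rm -rf "$tmp"
```

With the sample data this prints the two 10.111.112.119 lines and leaves the 10.20.30.40 line out.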

I tried your second awk, and it appears that it is assuming the IPs in the log file will always be in a specific column. that's not accurate, and it's my fault for not pointing that out.

the ips can be in any column. so i'm not sure how you'd tweak your awk statement to do that. btw, there was a missing "}" in it, so i changed it to this:

awk '/FAILURE/{print $(NF-1), $NF}' lfile | awk -F- '{if(seen[$2,$3]){cnt[$2]++; seen[$2,$3]++; }  if(cnt[$2]>1) print $2,$3}'

as you can see, i dropped it from 10 to 1, but still, nothing was spat out from the log even though there should have been.
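
A likely reason nothing comes out: compared with the command it was adapted from, the "!" in front of seen[$2,$3] is gone. seen[...] starts out empty, so the test is never true and cnt[...] is never incremented. A small side-by-side sketch, assuming a three-line sample log (the file name and the variable names broken and fixed are made up here):

```shell
tmp=$(mktemp -d)   # scratch directory, just for this sketch

cat > "$tmp/sample.log" <<'EOF'
2013-03-23 19:02:33,122 DiscoverManager [Discover-15] WeblogicApplication-10.1.1.1-7400 FAILURE
2013-03-23 19:02:34,301 DiscoverManager [Discover-15] WeblogicApplication-10.1.1.1-7401 FAILURE
2013-03-23 19:02:35,410 DiscoverManager [Discover-16] WeblogicApplication-10.2.2.2-7400 FAILURE
EOF

# as edited: seen[...] is never true, so cnt[...] stays at zero
broken=$(awk '/FAILURE/{print $(NF-1), $NF}' "$tmp/sample.log" |
  awk -F- '{if(seen[$2,$3]){cnt[$2]++; seen[$2,$3]++} if(cnt[$2]>1) print $2,$3}')

# with the "!" restored: each IP/port pair is counted once, and a line
# is printed as soon as an IP has been seen with more than one port
fixed=$(awk '/FAILURE/{print $(NF-1), $NF}' "$tmp/sample.log" |
  awk -F- '{if(!seen[$2,$3]){cnt[$2]++; seen[$2,$3]++} if(cnt[$2]>1) print $2,$3}')

printf 'broken: [%s]\nfixed:  [%s]\n' "$broken" "$fixed"

rm -rf "$tmp"
```

On this sample the edited form prints nothing, while the restored form prints "10.1.1.1 7401 FAILURE". Both still rely on the IP sitting in the second-to-last column, so this only explains the empty output; it does not handle an IP that can appear in any column.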

---------- Post updated at 09:07 AM ---------- Previous update was at 09:03 AM ----------

looks like the first script (the one using match) could work. when i ran it i got something like this:

10 unique IPs found in FAILURE
10.51.65.16-1521

10.50.44.188-1521
10.51.111.22-7443
10.51.101.29-1443
10.58.60.21-1521
10.51.19.12-7443
10.11.34.23-1443
10.51.19.15-80
10.51.65.17-1521

the script is supposed to only count IPs that occur at least 10 times in the log file. looks like this is counting all IPs. and i'm not sure where it is applying the threshold of 10. but it looks like this could very well work.

also, notice the blank line. not sure what that means.

I misread your requirement! To find an IP that has 10 or more occurrences:

awk ' /FAILURE/ {
        # skip FAILURE lines that carry no IP-PORT token
        if ( match( $0, /[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+-[0-9]+/ ) )
                A[substr($0, RSTART, RLENGTH)]++        # count per IP-PORT
} END {
        for ( ip in A )
        {
                if ( A[ip] >= 10 )
                        print "IP: " ip " found " A[ip] " times in the log"
        }
} ' file

BTW that blank line might be due to a record with pattern: FAILURE but no IP address in it.
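
To illustrate where an empty entry can come from: when match() finds nothing it returns 0 and sets RSTART to 0 and RLENGTH to -1, so substr($0, RSTART, RLENGTH) yields an empty string. A tiny sketch with two made-up input lines (the variable name out is also just for illustration):

```shell
out=$(printf '%s\n' 'job FAILURE with no address' 'App-10.1.1.1-80 FAILURE' |
awk '/FAILURE/ {
    # match() returns 0 when the pattern is absent, so it can guard the substr()
    if (match($0, /[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+-[0-9]+/))
        print "matched: " substr($0, RSTART, RLENGTH)
    else
        print "no address: RSTART=" RSTART " RLENGTH=" RLENGTH
}')
printf '%s\n' "$out"
```

The first input line reports RSTART=0 and RLENGTH=-1, the second reports the matched 10.1.1.1-80 token.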

perfect. this works as expected. i have one more need. the above tells you how many times an ip is found in the file if the port number of the IP is the same.

but i'd also like to alert if an IP is found in the log with different port numbers.

for instance, your code currently alerts on this type of scenario:

2013-03-23 13:05:55,987  [-46] OracleSensor-10.19.20.111-1207 ERROR util.URIUtils - [PLATFORM.UTIL.E.15] URI syntax
2013-03-23 13:05:55,987  [-46] OracleSensor-10.19.20.111-1207 ERROR util.URIUtils - [PLATFORM.UTIL.E.15] URI syntax
2013-03-23 13:05:55,987  [-46] OracleSensor-10.19.20.111-1207 ERROR util.URIUtils - [PLATFORM.UTIL.E.15] URI syntax
2013-03-23 13:05:55,987  [-46] OracleSensor-10.19.20.111-1207 ERROR util.URIUtils - [PLATFORM.UTIL.E.15] URI syntax
2013-03-23 13:05:55,987  [-46] OracleSensor-10.19.20.111-1207 ERROR util.URIUtils - [PLATFORM.UTIL.E.15] URI syntax
2013-03-23 13:05:55,987  [-46] OracleSensor-10.19.20.111-1207 ERROR util.URIUtils - [PLATFORM.UTIL.E.15] URI syntax
2013-03-23 13:05:55,987  [-46] OracleSensor-10.19.20.111-1207 ERROR util.URIUtils - [PLATFORM.UTIL.E.15] URI syntax
2013-03-23 13:05:55,987  [-46] OracleSensor-10.19.20.111-1207 ERROR util.URIUtils - [PLATFORM.UTIL.E.15] URI syntax
2013-03-23 13:05:55,987  [-46] OracleSensor-10.19.20.111-1207 ERROR util.URIUtils - [PLATFORM.UTIL.E.15] URI syntax
2013-03-23 13:05:55,987  [-46] OracleSensor-10.19.20.111-1207 ERROR util.URIUtils - [PLATFORM.UTIL.E.15] URI syntax

notice the port numbers are all the same.

can it be tweaked to also alert when the ports are different for any of the IPs and the number of all these occurrences is 15 or more?

2013-03-23 13:05:55,987  [-46] OracleSensor-10.19.24.164-1920 ERROR util.URIUtils - [PLATFORM.UTIL.E.15] URI syntax
2013-03-23 13:05:55,987  [-46] OracleSensor-10.19.24.164-1321 ERROR util.URIUtils - [PLATFORM.UTIL.E.15] URI syntax
2013-03-23 13:05:55,987  [-46] OracleSensor-10.19.24.164-1822 ERROR util.URIUtils - [PLATFORM.UTIL.E.15] URI syntax
2013-03-23 13:05:55,987  [-46] OracleSensor-10.19.24.164-1960 ERROR util.URIUtils - [PLATFORM.UTIL.E.15] URI syntax
2013-03-23 13:05:55,987  [-46] OracleSensor-10.19.24.164-1023 ERROR util.URIUtils - [PLATFORM.UTIL.E.15] URI syntax
2013-03-23 13:05:55,987  [-46] OracleSensor-10.19.24.164-1420 ERROR util.URIUtils - [PLATFORM.UTIL.E.15] URI syntax
2013-03-23 13:05:55,987  [-46] OracleSensor-10.19.24.164-1930 ERROR util.URIUtils - [PLATFORM.UTIL.E.15] URI syntax
2013-03-23 13:05:55,987  [-46] OracleSensor-10.19.24.164-1920 ERROR util.URIUtils - [PLATFORM.UTIL.E.15] URI syntax
2013-03-23 13:05:55,987  [-46] OracleSensor-10.19.24.164-1290 ERROR util.URIUtils - [PLATFORM.UTIL.E.15] URI syntax
2013-03-23 13:05:55,987  [-46] OracleSensor-10.19.24.164-1207 ERROR util.URIUtils - [PLATFORM.UTIL.E.15] URI syntax

awk ' /FAILURE/ {
        # grab the first IP-PORT token, wherever it sits on the line
        if ( match( $0, /[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+-[0-9]+/ ) )
        {
                IP_PO = substr($0, RSTART, RLENGTH)
                split(IP_PO, V, "-")            # V[1] = IP, V[2] = port
                A_IP[V[1]]++                    # entries per IP, any port
                A_IP_PO[IP_PO]++                # entries per IP and port
        }
} END {
        for ( ip in A_IP )
        {
                if ( A_IP[ip] >= 10 )
                {
                        print "IP: " ip " found " A_IP[ip] " times in the log"
                        # break the total down by port
                        for ( ip_po in A_IP_PO )
                        {
                                split(ip_po, V, "-")
                                if ( ip == V[1] )
                                        print "IP & PORT: " ip_po " Number of occurrences: " A_IP_PO[ip_po]
                        }
                }
        }
} ' file
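
If the alert should fire only when an IP shows up with several different ports, one possible variation is to count distinct ports per IP next to the total. This is a sketch under assumptions: min_total and min_ports are hypothetical knobs (set low here so the small sample triggers; the request above would use 15 for the total), the sample lines use FAILURE so the filter matches, and sample.log is a made-up file name.

```shell
tmp=$(mktemp -d)   # scratch directory, just for this sketch

cat > "$tmp/sample.log" <<'EOF'
2013-03-23 13:05:55,987  [-46] OracleSensor-10.19.24.164-1920 FAILURE
2013-03-23 13:05:56,101  [-46] OracleSensor-10.19.24.164-1321 FAILURE
2013-03-23 13:05:57,240  [-46] OracleSensor-10.19.24.164-1920 FAILURE
2013-03-23 13:05:58,019  [-47] OracleSensor-10.19.20.111-1207 FAILURE
2013-03-23 13:05:59,530  [-47] OracleSensor-10.19.20.111-1207 FAILURE
2013-03-23 13:06:00,777  [-47] OracleSensor-10.19.20.111-1207 FAILURE
EOF

alerts=$(awk -v min_total=3 -v min_ports=2 '
/FAILURE/ && match($0, /[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+-[0-9]+/) {
    split(substr($0, RSTART, RLENGTH), part, "-")   # part[1] = IP, part[2] = port
    ip = part[1]; port = part[2]
    total[ip]++                          # every entry for this IP
    if (!seen[ip, port]++) ports[ip]++   # count each IP/port pair only once
}
END {
    for (ip in total)
        if (total[ip] >= min_total && ports[ip] >= min_ports)
            print "IP: " ip " found " total[ip] " times on " ports[ip] " different ports"
}' "$tmp/sample.log")
printf '%s\n' "$alerts"

rm -rf "$tmp"
```

With the sample data only 10.19.24.164 is reported (3 entries on 2 ports); 10.19.20.111 also has 3 entries but all on the same port, so it stays quiet.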