Grep patterns and group counts

Hi,

I have a continuous log file which has the following format:-

02/Sep/2015: IP 11.151.108.166 error occurred etc
03/Sep/2015: IP 11.151.108.188 error occurred etc
03/Sep/2015: IP 11.152.178.250 error occurred etc
03/Sep/2015: IP 11.188.108.176 error occurred etc
03/Sep/2015: IP 11.14.108.146 error occurred etc
03/Sep/2015: IP 11.188.178.1 error occurred etc
03/Sep/2015: IP 11.151.188.142 error occurred etc
03/Sep/2015: IP 11.151.21.188 error occurred etc

I want to create a daily report of total counts of IPs, grouped by the first two octets of the IP, e.g. 11.151. So, for the above it would show:-

IP total counts for 03/Sep/2015:-

11.151 = 3 (entries)
11.152 = 1
11.188 = 2
11.14 = 1
etc

I think I'm okay doing the count by date, but I'm not sure how to grep the pattern of the IP.
I do know which IP ranges are in the log, but I would rather not hard-code each individual pattern (e.g. grep 11.151*, grep 11.152*, etc.) in case new ones appear.

Thanks

Hello finn,

The following may help you, if the order of the output doesn't matter to you.

1st:
awk '{split($3, A,".");B[A[1]"."A[2]]++} END{for(i in B){print i OFS B}}' OFS=" = "  Input_file
OR, for a specific date like the one you mentioned in your post:
2nd:
awk '{if($1 ~ /^03\/Sep\/2015:$/){split($3, A,".");B[A[1]"."A[2]]++}} END{for(i in B){print i OFS B[i]}}' OFS=" = "  Input_file
 

The output of the 1st command above (the one without the date) is as follows:

11.14 = 1
11.188 = 2
11.151 = 4
11.152 = 1

The output of the 2nd command above (with the date) is as follows:

11.14 = 1
11.188 = 2
11.151 = 3
11.152 = 1
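
If it helps readability, the same logic as the 2nd command can be spelled out as a multi-line script (a sketch only; Input_file and the hard-coded date are placeholders, exactly as in the one-liners above):

awk '
    # count only the lines for the requested date (the 1st field ends with ":")
    $1 == "03/Sep/2015:" {
        split($3, A, ".")            # A[1]=11, A[2]=151, ...
        B[A[1] "." A[2]]++           # count per first-two-octet prefix
    }
    # print one "prefix = count" line per prefix seen
    END {
        for (i in B)
            print i " = " B[i]
    }
' Input_file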
 

Thanks,
R. Singh

[akshay@localhost tmp]$ cat file
02/Sep/2015: IP 11.151.108.166 error occurred etc
03/Sep/2015: IP 11.151.108.188 error occurred etc
03/Sep/2015: IP 11.152.178.250 error occurred etc
03/Sep/2015: IP 11.188.108.176 error occurred etc
03/Sep/2015: IP 11.14.108.146 error occurred etc
03/Sep/2015: IP 11.188.178.1 error occurred etc
03/Sep/2015: IP 11.151.188.142 error occurred etc
03/Sep/2015: IP 11.151.21.188 error occurred etc
[akshay@localhost tmp]$ awk -F'[ .]' '$1==dt":"{A[$3"."$4]++}END{for(i in A)print i" = "A[i]}' dt="03/Sep/2015" file
11.14 = 1
11.188 = 2
11.151 = 3
11.152 = 1
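
Since the original question mentions grep: if the exact "prefix = count" layout is not required, the prefixes can also be pulled out with grep itself and counted with sort/uniq. This is only a sketch and assumes a grep that supports -o and -E (e.g. GNU or BSD grep); the output is "count prefix" rather than "prefix = count".

# keep the lines for one date, extract "IP x.y", then count the distinct prefixes
grep '^03/Sep/2015:' file | grep -oE 'IP [0-9]+\.[0-9]+' | sort | uniq -c

For the sample data above this prints lines such as "3 IP 11.151" (uniq -c left-pads the counts with spaces).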

Perhaps a bit of Perl?

#!/usr/bin/perl
# finn.report.pl

# these two pragmas help detect bugs
use strict;
use warnings;

my %dates; # to keep a record by dates

# read the file given at the command line
while(<>) {
    # extract date and ip, disregard rest 
    my ($day, undef, $nbits, undef) = split; 
    # extract ip bits wanted
    my ($netbits) = $nbits =~ /^(\d+\.\d+)/;
    # increase count of ip bits
    $dates{$day}{$netbits}++;
}

# report back to stdout
for my $d (sort keys %dates){
    print "IP total counts for $d\n";
    for my $ipp (sort keys %{$dates{$d}}){
        printf "%-8s = %s\n", $ipp, $dates{$d}{$ipp};
    }
    print "\n";
}

Run as

$ perl finn.report.pl finn.file 
IP total counts for 02/Sep/2015:
11.151   = 1

IP total counts for 03/Sep/2015:
11.14    = 1
11.151   = 3
11.152   = 1
11.188   = 2

Thanks all - some great solutions there that work. That AWK is a really powerful tool.