Extract various information from a log file

Hi ShamRock,

If you can help me with this task, which I find difficult, it will save my day.

Logs :

==================================================================================================================

============================================================================

These are ModSecurity logs. I need to add the rule IDs mentioned in the logs for the particular domain and URL, but I am not able to write the script. I am sure awk will help me here too: it should give me the domain name, the URL, and the ID of the rule that blocked the request.

Thanks in advance.

So what is the output you expect?

I need
Domain : URL : and ID

Like
DOMAIN :domainname.com
URL : /im/qs_menu.php
ID : 950013

These are the values shown in red, e.g. [id "950013"]

:slight_smile: thanks for looking into it

How should it know to tie the URL from an error three things up with the URL-less error three things down?

---------- Post updated at 01:08 PM ---------- Previous update was at 12:52 PM ----------

Here's something that sort of does it:

$ cat get.awk
BEGIN { RS="";  FS="\n" }

{
        split($1, L, "[-]*");

        if(L[2] != LAST)
        {
                if(ID)
                {
                        print "dom:", DOM;
                        print "url:", URL;
                        print "id:", ID;
                        printf("\n");
                }
                DOM=""; URL=""; ID=""

                LAST=L[2];
        }


        for(N=1; N<=NF; N++)
        {
                if($N ~ /referer: http:/)       NEWDOM=$N
                if($N ~ /\[id /)                NEWID=$N
                if($N ~ /^GET/)                 NEWURL=$N
        }

        if(NEWURL)
        {
                split(NEWURL, a, "[ ?]");
                NEWURL=a[2];
                URL=NEWURL
                NEWURL=""
        }

        if(NEWID)
        {
                # Id string will be in a[2]
                split(NEWID, a, "\\[id ");
                # Split on ], ", ' ' chars.
                split(a[2], a, "[\"\\] ]");
                NEWID=a[2];
                ID=NEWID;
                NEWID=""
        }

        if(NEWDOM)
        {
                # Extract everything after 'referer:'
                split(NEWDOM, a, "referer: ");  NEWDOM=a[2];
                # Reduce http://whatever/ to whatever
                sub(/http:\/\//, "", NEWDOM);
                sub(/\/$/, "", NEWDOM);
                # Turn www.whatever.com into www, whatever, com.
                N=split(NEWDOM, a, ".");
                # Paste the last two together.
                NEWDOM=a[N-1];  NEWDOM=NEWDOM "." a[N];
                DOM=NEWDOM
                NEWDOM=""
        }
}

END {
        if(ID)
        {
                print "dom:", DOM;
                print "url:", URL;
                print "id:", ID;
        }
}
$ awk -f get.awk < data
dom: domainname.com
url: /im/qs_menu.php
id: 950013

dom: domainname.com
url: /im/qs_menu.php
id: 950013

$

I'm not sure how to remove the duplicates, since I don't know what criteria should decide whether two entries count as duplicates.
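If exact repeats of the same dom/url/id block are the ones to drop (an assumption; the thread never settles on a criterion), one option is to post-filter the script's output in awk paragraph mode. A minimal sketch, where `dedupe_blocks` is a hypothetical helper name and not part of get.awk:

```shell
# Hypothetical helper: keep only the first occurrence of each
# blank-line-separated block read from stdin.
dedupe_blocks() {
    # RS="" makes each blank-line-separated block one record;
    # !seen[$0]++ is true only the first time a given block appears.
    awk 'BEGIN { RS = ""; ORS = "\n\n" } !seen[$0]++'
}

# Possible usage: awk -f get.awk < data | dedupe_blocks
```

This only removes blocks that are identical line for line; blocks that differ in any field are all kept.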

This is fabulous work :smiley:

THANK YOU :smiley:

Just one more thing, please: my log is filling up with POST requests as well as GET.

The perfect example to try the script on is

Once this is done, I am yours :slight_smile:

I know it will take you only minutes to do it; you are a champ.

Thanks

---------- Post updated at 08:49 PM ---------- Previous update was at 07:59 PM ----------

I managed to get it working with a "|" pipe.
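If the "|" here means regex alternation (an assumption; it could also mean a shell pipe), the change would amount to matching `/^(GET|POST) /` where get.awk matches `/^GET/`. A small standalone sketch of that idea, with `request_path` as a hypothetical helper name and made-up request lines as input:

```shell
# Hypothetical helper: print the path of every request line on stdin
# that starts with GET or POST, dropping any query string.
request_path() {
    awk '/^(GET|POST) / {
        # Split on spaces and "?" so a[2] is the bare path.
        split($0, a, "[ ?]")
        print a[2]
    }'
}

# Possible usage with illustrative input:
# printf 'POST /im/qs_menu.php?x=1 HTTP/1.1\n' | request_path
```

The parentheses matter: without them, `^GET|POST` would anchor only the GET branch.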

Now I need to know whether I can get just the URL and ID (multiple IDs if the same domain has multiple blocked IDs) for a specific domain name, which I will provide on the command line.

Maybe like:
[#]awk -f get.awk /usr/local/apache/logs/modsec_audit.log domain1.co.uk
URL : /wonder/all.php
ID : 910011 , 910023

Is this possible? If yes, how?
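One way this might be approached, sketched under assumptions: rather than changing get.awk, filter its dom/url/id output for one domain and join the IDs per URL. `ids_for_domain` is a hypothetical helper name, and the input is assumed to be the blank-line-separated blocks that get.awk prints.

```shell
# Hypothetical helper: ids_for_domain DOMAIN
# Reads dom:/url:/id: blocks on stdin and prints, for the given domain,
# each URL once followed by its distinct IDs.
ids_for_domain() {
    awk -v want="$1" '
        BEGIN { RS = ""; FS = "\n" }
        {
            dom = url = id = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^dom: /) dom = substr($i, 6)
                if ($i ~ /^url: /) url = substr($i, 6)
                if ($i ~ /^id: /)  id  = substr($i, 5)
            }
            # Skip other domains, incomplete blocks, and repeated pairs.
            if (dom != want || url == "" || id == "") next
            if ((url, id) in seen) next
            seen[url, id] = 1
            if (url in ids) {
                ids[url] = ids[url] " , " id
            } else {
                order[++n] = url        # remember first-seen order
                ids[url] = id
            }
        }
        END {
            for (k = 1; k <= n; k++) {
                print "URL : " order[k]
                print "ID : " ids[order[k]]
            }
        }'
}

# Possible usage: awk -f get.awk < data | ids_for_domain domainname.com
```

Note that get.awk as posted keeps only the last two labels of the host name, so a name such as domain1.co.uk would come out as co.uk and would not match; that part of the script would need adjusting first.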

---------- Post updated 09-27-11 at 11:00 AM ---------- Previous update was 09-26-11 at 08:49 PM ----------

Please, can anyone from the Unix team help me with this?