Extract various information from a log file

SilvesterJ · September 26, 2011, 12:44pm

Hi ShamRock,

If you can help me with this task, which is difficult for me, it will save my day.

Logs:

==================================================================================================================

--f42e2544-A--
[26/Sep/2011:16:03:13 +0100] ToCUMdXlTpYAACTqNMsAAAAO 80.33.86.223 53424 91.186.30.249 80
--f42e2544-B--
GET /im/qs_menu.php?text=Contact%20Us&bt_img=bt_contact HTTP/1.1
Accept: */*
Referer: http://www.domainname.com/
Accept-Language: en-GB
User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; GTB7.1; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; InfoPath.1)
Accept-Encoding: gzip, deflate
Host: www.domainname.com
Connection: Keep-Alive
Cookie: PHPSESSID=f933fb642e1c3e258b7c9787b49d2408; lang=en

--f42e2544-F--
HTTP/1.1 406 Not Acceptable
Content-Length: 384
Keep-Alive: timeout=5, max=96
Connection: Keep-Alive
Content-Type: text/html; charset=iso-8859-1

--f42e2544-H--
Message: Access denied with code 406 (phase 2). Pattern match "_img|amature-big-titties|amature-big-titties|avril-laveign-porn|breast-touch-video|gingers-having-sex|naked-indian-models" at REQUEST_URI. [file "/usr/local/apache/conf/modsec2.user.conf"] [line "109"] [id "950013"] [msg "PHP/FTP Injection Attack. Matched signature <_img>"] [severity "CRITICAL"]
Apache-Error: [file "core.c"] [line 3650] [level 3] File does not exist: /home/costadel/domains/domainname.com/public_html/406.shtml, referer: http://www.domainname.com/
Action: Intercepted (phase 2)
Stopwatch: 1317049393646593 1950 (402 1648 -)
Producer: ModSecurity for Apache/2.5.13 (http://www.modsecurity.org/).

--f42e2544-Z--

--2ed66772-A--
[26/Sep/2011:16:03:14 +0100] ToCUMtXlTpYAACTqNMwAAAAO 80.33.86.223 53424 91.186.30.249 80
--2ed66772-B--
GET /im/qs_menu.php?text=Map&bt_img=bt_map HTTP/1.1
Accept: */*
Referer: http://www.domainname.com/
Accept-Language: en-GB
User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; GTB7.1; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; InfoPath.1)
Accept-Encoding: gzip, deflate
Host: www.domainname.com
Connection: Keep-Alive
Cookie: PHPSESSID=f933fb642e1c3e258b7c9787b49d2408; lang=en

--2ed66772-F--
HTTP/1.1 406 Not Acceptable
Content-Length: 384
Keep-Alive: timeout=5, max=95
Connection: Keep-Alive
Content-Type: text/html; charset=iso-8859-1

--2ed66772-H--
Message: Access denied with code 406 (phase 2). Pattern match "_img|amature-big-titties|amature-big-titties|avril-laveign-porn|breast-touch-video|gingers-having-sex|naked-indian-models" at REQUEST_URI. [file "/usr/local/apache/conf/modsec2.user.conf"] [line "109"] [id "950013"] [msg "PHP/FTP Injection Attack. Matched signature <_img>"] [severity "CRITICAL"]
Apache-Error: [file "core.c"] [line 3650] [level 3] File does not exist: /home/costadel/domains/domainname.com/public_html/406.shtml, referer: http://www.domainname.com/
Action: Intercepted (phase 2)
Stopwatch: 1317049394307033 2032 (448 1733 -)
Producer: ModSecurity for Apache/2.5.13 (http://www.modsecurity.org/).
Server: Apache

============================================================================

These are ModSecurity audit log entries. I need to add the rule IDs mentioned in the logs for the particular domain and URL, but I am not able to write the script. I am sure awk can help here too: it should give me the domain name, the URL, and the ID of the rule that blocked the request.

Thanks in advance.

shamrock · September 26, 2011, 1:00pm

So what is the output you expect?

SilvesterJ · September 26, 2011, 1:48pm

I need the domain, the URL and the ID.

For example:
DOMAIN : domainname.com
URL : /im/qs_menu.php
ID : 950013

The ID is the value shown in the log as [id "950013"].

Thanks for looking into it.

Corona688 · September 26, 2011, 3:08pm

How should it know to tie the URL from a section three blocks up to the URL-less error three blocks down?

---------- Post updated at 01:08 PM ---------- Previous update was at 12:52 PM ----------

Here's something that sort of does it:

$ cat get.awk
BEGIN { RS="";  FS="\n" }

{
        # Split a marker such as --f42e2544-A-- on runs of dashes; L[2] is the transaction ID.
        split($1, L, "-+");

        if(L[2] != LAST)
        {
                if(ID)
                {
                        print "dom:", DOM;
                        print "url:", URL;
                        print "id:", ID;
                        printf("\n");
                }
                DOM=""; URL=""; ID=""

                LAST=L[2];
        }


        for(N=1; N<=NF; N++)
        {
                if($N ~ /referer: http:/)       NEWDOM=$N
                if($N ~ /\[id /)                NEWID=$N
                if($N ~ /^GET/)                 NEWURL=$N
        }

        if(NEWURL)
        {
                split(NEWURL, a, "[ ?]");
                NEWURL=a[2];
                URL=NEWURL
                NEWURL=""
        }

        if(NEWID)
        {
                # Id string will be in a[2]
                split(NEWID, a, "\\[id ");
                # Split on ], ", ' ' chars.
                split(a[2], a, "[\"\\] ]");
                NEWID=a[2];
                ID=NEWID;
                NEWID=""
        }

        if(NEWDOM)
        {
                # Extract everything after 'referer:'
                split(NEWDOM, a, "referer: ");  NEWDOM=a[2];
                # Reduce http://whatever/ to whatever
                sub(/http:\/\//, "", NEWDOM);
                sub(/\/$/, "", NEWDOM);
                # Turn www.whatever.com into www, whatever, com.
                N=split(NEWDOM, a, ".");
                # Paste the last two together.
                NEWDOM=a[N-1];  NEWDOM=NEWDOM "." a[N];
                DOM=NEWDOM
                NEWDOM=""
        }
}

END {
        if(ID)
        {
                print "dom:", DOM;
                print "url:", URL;
                print "id:", ID;
        }
}
$ awk -f get.awk < data
dom: domainname.com
url: /im/qs_menu.php
id: 950013

dom: domainname.com
url: /im/qs_menu.php
id: 950013

$

I'm not sure how to remove the doubles, since I don't know what criteria they should or shouldn't be duplicated on.
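
If the criterion were simply "the same domain, URL and ID have already been printed", which is only an assumption about what is wanted, the output of get.awk could be filtered as in the sketch below. The function name dedup_blocks is made up for illustration.

```shell
# Hypothetical helper: drop repeated dom/url/id blocks from get.awk output.
# In paragraph mode (RS="") every blank-line-separated block is one record,
# so seen[$0]++ is zero only the first time a given block appears.
dedup_blocks() {
    awk 'BEGIN { RS=""; ORS="\n\n" } !seen[$0]++'
}

# Usage: awk -f get.awk < data | dedup_blocks
```

With the sample data above this would print the domainname.com block once instead of twice.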

SilvesterJ · September 27, 2011, 12:00pm

This is fabulous work.

THANK YOU

Just one thing, please: my log fills up with POST requests as well as GET requests.

A good example to try the script on is:

--fecb387d-A--
[27/Sep/2011:01:04:14 +0100] ToES-dXlfQYAAGD-UgsAAAAn 209.172.61.41 58098 109.75.170.170 80
--fecb387d-B--
POST /xmlrpc.php HTTP/1.0
User-Agent: The Incutio XML-RPC PHP Library -- WordPress/3.2.1
Host: www.domainname.co.uk
Accept: */*
Content-Type: text/xml
Accept-Encoding: deflate;q=1.0, compress;q=0.5
Content-Length: 359

--fecb387d-C--
<?xml version="1.0"?>
<methodCall>
<methodName>pingback.ping</methodName>
<params>
<param><value><string>http://www.domain.com/relationships/relationships-weddings/trinkets-perfect-presents-for-a-wedding/</string></value></param>
<param><value><string>domainname.co.uk - This website is for sale! - domainname Resources and Information.;
</params></methodCall>
--fecb387d-F--
HTTP/1.1 404 Not Found
X-Powered-By: PHP/5.2.17
X-Pingback: domainname.co.uk - This website is for sale! - domainname Resources and Information.
Expires: Wed, 11 Jan 1984 05:00:00 GMT
Cache-Control: no-cache, must-revalidate, max-age=0
Pragma: no-cache
Last-Modified: Tue, 27 Sep 2011 00:04:14 GMT
Connection: close
Content-Type: text/html; charset=UTF-8

--fecb387d-H--
Message: Access denied with code 406 (phase 2). Match of "rx (^application/x-www-form-urlencoded|^multipart/form-data;).$" against "REQUEST_HEADERS:Content-Type" required. [file "/usr/local/apache/conf/modsec2.user.conf"] [line "14"] [id "90111"]
Action: Intercepted (phase 2)
Stopwatch: 1317081853500200 736240 (2276 2408 -)
Producer: ModSecurity for Apache/2.5.13 (http://www.modsecurity.org/).

--f42e2544-A--
[26/Sep/2011:16:03:13 +0100] ToCUMdXlTpYAACTqNMsAAAAO 80.33.86.223 53424 91.186.30.249 80
--f42e2544-B--
GET /im/qs_menu.php?text=Contact%20Us&bt_img=bt_contact HTTP/1.1
Accept: */*
Referer: http://www.domainname.com/
Accept-Language: en-GB
User-Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0; GTB7.1; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; InfoPath.1)
Accept-Encoding: gzip, deflate
Host: www.domainname.com
Connection: Keep-Alive
Cookie: PHPSESSID=f933fb642e1c3e258b7c9787b49d2408; lang=en

--f42e2544-F--
HTTP/1.1 406 Not Acceptable
Content-Length: 384
Keep-Alive: timeout=5, max=96
Connection: Keep-Alive
Content-Type: text/html; charset=iso-8859-1

--f42e2544-H--
Message: Access denied with code 406 (phase 2). Pattern match "_img|amature-big-titties|amature-big-titties|avril-laveign-porn|breast-touch-video|gingers-having-sex|naked-indian-models" at REQUEST_URI. [file "/usr/local/apache/conf/modsec2.user.conf"] [line "109"] [id "950013"] [msg "PHP/FTP Injection Attack. Matched signature <_img>"] [severity "CRITICAL"]
Apache-Error: [file "core.c"] [line 3650] [level 3] File does not exist: /home/costadel/domains/domainname.com/public_html/406.shtml, referer: http://www.domainname.com/
Action: Intercepted (phase 2)
Stopwatch: 1317049393646593 1950 (402 1648 -)
Producer: ModSecurity for Apache/2.5.13 (http://www.modsecurity.org/).

This is what I need done, and I will be very grateful.

I know it will take you only minutes; you are a champ.

Thanks

---------- Post updated at 08:49 PM ---------- Previous update was at 07:59 PM ----------

I managed to get that working with the "|" (pipe) operator.
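
Presumably the change was an alternation in the request-line test, so that POST lines are picked up as well as GET lines. A sketch of that test on its own follows; the function name request_paths is hypothetical, and it assumes only GET and POST matter.

```shell
# Hypothetical helper: print the path of every GET or POST request line.
request_paths() {
    awk '/^(GET|POST) / {
        # $2 is the request target; keep the part before any query string.
        split($2, a, "[?]")
        print a[1]
    }'
}

# Usage: request_paths < /usr/local/apache/logs/modsec_audit.log
```

In get.awk the equivalent change would be to test $N against /^(GET|POST) / instead of /^GET/.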

Now I need to know whether I can get just the URL and ID (or multiple IDs, if several rule IDs were blocked for the same domain) for a specific domain name that I provide on the command line.

Maybe like this:
[#]awk -f get.awk /usr/local/apache/logs/modsec_audit.log domain1.co.uk
URL : /wonder/all.php
ID : 910011 , 910023

Is this possible? If yes, how?

---------- Post updated 09-27-11 at 11:00 AM ---------- Previous update was 09-26-11 at 08:49 PM ----------

Please, can anyone from the Unix team help me with this?
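
One way this might be approached is sketched below. It is a sketch under assumptions, not a drop-in replacement for get.awk: it reads the log line by line, takes the domain from the Host header (with a leading "www." removed) rather than from the referer, and gets the wanted domain as a shell function argument instead of a second file argument. The function name blocked_rules is made up for illustration.

```shell
# Hypothetical helper: list blocked URLs and their rule IDs for one domain.
blocked_rules() {
    awk -v want="$1" '
    # Section markers look like --fecb387d-B--; an -A- section opens a new transaction.
    /^--[0-9a-zA-Z]+-A--$/ { host = ""; url = "" }

    # Request line: keep the path without the query string.
    /^(GET|POST) / { split($2, a, "[?]"); url = a[1] }

    # Host header: strip a leading www. so www.example.co.uk matches example.co.uk.
    /^Host: / { host = $2; sub(/^www\./, "", host) }

    # Rule hit for the wanted domain: record each ID once per URL.
    /^Message: / && host == want {
        line = $0
        while (match(line, /\[id "[0-9]+"\]/)) {
            # Skip the 5 characters of [id " and drop the 2 characters of "]
            id = substr(line, RSTART + 5, RLENGTH - 7)
            if (!((url, id) in seen)) {
                seen[url, id] = 1
                if (url in ids) {
                    ids[url] = ids[url] " , " id
                } else {
                    order[++n] = url      # remember first-seen order of URLs
                    ids[url] = id
                }
            }
            line = substr(line, RSTART + RLENGTH)
        }
    }

    END {
        for (i = 1; i <= n; i++) {
            print "URL : " order[i]
            print "ID : " ids[order[i]]
        }
    }'
}

# Usage: blocked_rules domain1.co.uk < /usr/local/apache/logs/modsec_audit.log
```

Comparing the whole Host value avoids the problem that pasting together the last two labels of a name such as domainname.co.uk yields only co.uk. A request body (the -C- section) that happens to contain a line starting with GET, POST or Host: would confuse this sketch.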