Parse log files

tandrei · March 17, 2015, 7:58pm

Hi all,

We have a sample log like this:

....
test.log:2015.03.17 06:16:24 >> ABC.generateMethod() MethodAException while processing Request! DataForm: Header ---  dtd: template.dtd, titleName: berger, requestId: 1503170032131, documentName: invoice123, hostName: acme.net, userName: userABC -- Media ---  mediaType: XML, serverName: ABCServer1 -- data section --- ...
.....

We would like to create a bash script that selects the lines containing the string documentName and, for each such line, prints the values of titleName, documentName and serverName.

Don_Cragun · March 17, 2015, 8:51pm

Please show us explicitly the output you want to generate from this line in your log file.

What operating system are you using?

What shell are you using?

What have you tried to solve this problem on your own?

tandrei · March 18, 2015, 3:23am

Hi Don,

The output desired would be

titleName: berger
documentName: invoice123
serverName: ABCServer1

The code we used to start selecting lines from the log:

#!/bin/bash
fname=test.log
pattern=documentName
if [ -f "$fname" ]
then
    result=$(grep -i "$pattern" "$fname")
    echo "$result"
fi

Don_Cragun · March 18, 2015, 5:15am

You could try something like:

#!/bin/bash
awk -F', | -{2,3} {1,2}' '
/documentName:/ {
	for(i = 1; i <= NF; i++)
		if($i ~ /^titleName:/)		tn = $i
		else if($i ~ /^documentName:/)	dc = $i
		else if($i ~ /^serverName:/)	sn = $i
	printf("%s\n%s\n%s\n", tn, dc, sn)
}' logfile

which, with your sample log file produces the output:

titleName: berger
documentName: invoice123
serverName: ABCServer1

You didn't answer my question about what OS you're using. If you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk.
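
To see how a field separator of this kind cuts up a line, here is a minimal sketch that prints each field with its number. The helper name show_fields and the shortened input (modelled on the sample line) are assumptions for illustration only. The separator is written as ' ---?  ?', which matches the same text as ' -{2,3} {1,2}' but avoids interval expressions, since some awk implementations may not accept them.

```shell
#!/bin/bash
# show_fields is a hypothetical helper name used only for this illustration.
show_fields() {
    # Separators: ", " or a space, two or three dashes, then one or two spaces.
    # ' ---?  ?' is equivalent to ' -{2,3} {1,2}' without interval expressions.
    awk -F', | ---?  ?' '{
        for (i = 1; i <= NF; i++)
            printf("%d: [%s]\n", i, $i)
    }' "$@"
}

# Shortened fragment modelled on the sample log line
printf '%s\n' 'Header ---  dtd: template.dtd, titleName: berger -- Media ---  mediaType: XML' |
    show_fields
```

With that input the sketch should list five fields: Header, dtd: template.dtd, titleName: berger, Media and mediaType: XML. Each "name: value" pair ends up in a field of its own, which is why the script can test fields with patterns such as /^titleName:/.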

RudiC · March 18, 2015, 7:39am

Try also

awk     'BEGIN                  {n=split("titleName documentName serverName", SEAR)}
         /documentName/         {for (i=1; i<=n; i++)   {match ($0, SEAR[i]": [^, ]*")
                                                         print substr($0, RSTART, RLENGTH)}
                                }
        ' file
titleName: berger
documentName: invoice123
serverName: ABCServer1

tandrei · March 18, 2015, 8:42am

Thanks, guys! Both worked fine.


One more thing I just realized: I also need to retrieve the file name and timestamp, like this:

test.log:2015.03.17 06:16:24
titleName: berger
documentName: invoice123
serverName: ABCServer1

How can I modify the script? It is a Linux box.

RudiC · March 18, 2015, 8:47am

I'd propose we leave this up to you, as an exercise... you may want to print out $1 in awk.

tandrei · March 18, 2015, 9:25am

Hi Rudi,

That was the obvious choice, you are right. But in this case we need to stick to the content itself, not to the file variable.
The request is that the file and timestamp information (the part before >>) is also included in the output.

Regards,
Andrei

sea · March 18, 2015, 9:45am

An awk $1 is not the same as a script (bash, ksh, csh...) $1.
In a shell script, $1 is the first positional parameter; in awk, $1 is the first field of the input line currently being parsed.

hth
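
A minimal sketch of that difference, assuming a hypothetical helper called show_both (not part of anyone's script in this thread):

```shell
#!/bin/bash
# show_both is a hypothetical helper used only for this illustration.
show_both() {
    # In double quotes the shell expands $1: the first positional parameter.
    echo "shell \$1: $1"
    # Inside the single-quoted awk program the shell leaves $1 alone,
    # so awk sees it and treats it as the first field of each input line.
    echo "alpha beta gamma" | awk '{ print "awk $1: " $1 }'
}

show_both first-argument
```

The first line printed shows the argument passed to the function (first-argument), the second shows the first whitespace-separated field of the input line (alpha).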

Don_Cragun · March 18, 2015, 6:03pm

Hi sea,
Just so we don't confuse tandrei too much, I think you have a typo above. In awk, $1 is the first field of the current input line, not the first column in the output.

With RudiC's script, tandrei will probably need to look at $1 and $2 (or use a different substr(...) after matching the file and timestamp terminator ( > )). With my script, $1 will be sufficient, but the FS ERE will need to be updated to add >> as another field separator.
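
One way the second suggestion might look, as a sketch rather than a definitive version: " >> " is added to the field separator so that $1 holds the file name and timestamp. The function name extract_with_prefix is hypothetical, the separator is written as ' ---?  ?' instead of ' -{2,3} {1,2}' on the assumption that some awk implementations lack interval expressions, and resetting the three variables for each line is an addition that the earlier script does not make.

```shell
#!/bin/bash
# Sketch: print the "file:timestamp" prefix plus the three values of interest.
# extract_with_prefix is a hypothetical helper name, not from the thread.
extract_with_prefix() {
    # Separators: ", " | space, 2-3 dashes, 1-2 spaces | " >> "
    awk -F', | ---?  ?| >> ' '
    /documentName:/ {
        tn = dc = sn = ""    # reset so values never carry over between lines
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^titleName:/)         tn = $i
            else if ($i ~ /^documentName:/) dc = $i
            else if ($i ~ /^serverName:/)   sn = $i
        }
        # $1 is everything before " >> ": file name and timestamp
        printf("%s\n%s\n%s\n%s\n", $1, tn, dc, sn)
    }' "$@"
}

# Example run on the sample line from the first post
printf '%s\n' 'test.log:2015.03.17 06:16:24 >> ABC.generateMethod() MethodAException while processing Request! DataForm: Header ---  dtd: template.dtd, titleName: berger, requestId: 1503170032131, documentName: invoice123, hostName: acme.net, userName: userABC -- Media ---  mediaType: XML, serverName: ABCServer1 -- data section --- ...' |
    extract_with_prefix
```

For the sample line this should print the test.log:2015.03.17 06:16:24 prefix followed by the titleName, documentName and serverName lines. The function reads standard input when called without arguments and accepts file names otherwise, e.g. extract_with_prefix logfile.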

sea · March 18, 2015, 6:51pm

...lost in translation... post updated