Help creating a script to parse log files

Hello everybody,

I need some help creating a script to parse a log file. Here is a sample of the log file:

[0720 16:00:00.646257] 0x42258940 (Debug) Cache SUMMARY [s1.meta] attrs now/668 min/668 max/668.
[0720 16:00:00.646262] 0x42258940 (Debug) RSVD SUMMARY [s1.meta] reserved space max requested/128 MB accounted now/0 MB
[0720 16:00:00.646273] 0x42258940 (Debug) VOP SUMMARY [s1.meta] Setattr cnt/14 avg/53+54369 min/13+42 max/334+259525.
[0720 16:00:00.646283] 0x42258940 (Debug) VOP SUMMARY [s1.meta] Open cnt/7 avg/24+187 min/19+175 max/34+231.
[0720 16:00:00.646291] 0x42258940 (Debug) VOP SUMMARY [s1.meta] Close cnt/7 avg/20+37 min/16+29 max/28+54.
[0720 16:00:00.646300] 0x42258940 (Debug) VOP SUMMARY [s1.meta] GetResyncAttr cnt/1 avg/20+25 min/20+25 max/20+25.
[0720 16:00:00.646305] 0x42258940 (Debug) Cache SUMMARY [s3.meta] attrs now/4 min/4 max/4.
[0720 16:00:00.646309] 0x42258940 (Debug) RSVD SUMMARY [s3.meta] reserved space max requested/32 MB accounted now/0 MB
[0720 16:00:00.646320] 0x42258940 (Debug) VOP SUMMARY [s3.meta] Setattr cnt/12 avg/31+63182 min/17+39 max/150+334001.
[0720 16:00:00.646329] 0x42258940 (Debug) VOP SUMMARY [s3.meta] Open cnt/6 avg/22+180 min/18+159 max/31+210.
[0720 16:00:00.646337] 0x42258940 (Debug) VOP SUMMARY [s3.meta] Close cnt/6 avg/19+31 min/17+25 max/29+37.
[0720 16:00:00.646346] 0x42258940 (Debug) VOP SUMMARY [s3.meta] GetResyncAttr cnt/1 avg/19+24 min/19+24 max/19+24.
[0720 16:00:00.646351] 0x42258940 (Debug) Cache SUMMARY [s4.meta] attrs now/387 min/85 max/387.
[0720 16:00:00.646355] 0x42258940 (Debug) RSVD SUMMARY [s4.meta] reserved space max requested/128 MB accounted now/128 MB
[0720 16:00:00.646366] 0x42258940 (Debug) VOP SUMMARY [s4.meta] Setattr cnt/7838 avg/23+41116 min/10+22 max/1515+525846.
[0720 16:00:00.646376] 0x42258940 (Debug) VOP SUMMARY [s4.meta] Open cnt/321 avg/20+35 min/11+15 max/297+343.

I would like the script to take two arguments: the log file name and the summary type. With these arguments, it should retrieve the following data from the whole file: summary type, server name, action type, and counter value.

For example, if I call it like this: ./parser.sh log1 VOP

I would like to get, for example:
VOP SUMMARY s3 GetResyncAttr 1
VOP SUMMARY s3 Open 6
VOP SUMMARY s4 Setattr 7838
VOP SUMMARY s4 Open 321

I hope this is clear enough and that someone will be able to help me, as I haven't managed to do this myself.

Thanks.

Please provide the full expected output for the input you posted, because I don't understand why you have left out s1 and the "Close" action type.

Hi pravin,

OK, let's say I have a log file called log1 with this content:

[0720 16:00:00.646257] 0x42258940 (Debug) Cache SUMMARY [s1.meta] attrs now/668 min/668 max/668.
[0720 16:00:00.646262] 0x42258940 (Debug) RSVD SUMMARY [s1.meta] reserved space max requested/128 MB accounted now/0 MB
[0720 16:00:00.646273] 0x42258940 (Debug) VOP SUMMARY [s1.meta] Setattr cnt/14 avg/53+54369 min/13+42 max/334+259525.
[0720 16:00:00.646283] 0x42258940 (Debug) VOP SUMMARY [s1.meta] Open cnt/7 avg/24+187 min/19+175 max/34+231.
[0720 16:00:00.646291] 0x42258940 (Debug) VOP SUMMARY [s1.meta] Close cnt/7 avg/20+37 min/16+29 max/28+54.
[0720 16:00:00.646300] 0x42258940 (Debug) VOP SUMMARY [s1.meta] GetResyncAttr cnt/1 avg/20+25 min/20+25 max/20+25.
[0720 16:00:00.646305] 0x42258940 (Debug) Cache SUMMARY [s3.meta] attrs now/4 min/4 max/4.
[0720 16:00:00.646309] 0x42258940 (Debug) RSVD SUMMARY [s3.meta] reserved space max requested/32 MB accounted now/0 MB
[0720 16:00:00.646320] 0x42258940 (Debug) VOP SUMMARY [s3.meta] Setattr cnt/12 avg/31+63182 min/17+39 max/150+334001.
[0720 16:00:00.646329] 0x42258940 (Debug) VOP SUMMARY [s3.meta] Open cnt/6 avg/22+180 min/18+159 max/31+210.
[0720 16:00:00.646337] 0x42258940 (Debug) VOP SUMMARY [s3.meta] Close cnt/6 avg/19+31 min/17+25 max/29+37.
[0720 16:00:00.646346] 0x42258940 (Debug) VOP SUMMARY [s3.meta] GetResyncAttr cnt/1 avg/19+24 min/19+24 max/19+24.
[0720 16:00:00.646351] 0x42258940 (Debug) Cache SUMMARY [s4.meta] attrs now/387 min/85 max/387.
[0720 16:00:00.646355] 0x42258940 (Debug) RSVD SUMMARY [s4.meta] reserved space max requested/128 MB accounted now/128 MB
[0720 16:00:00.646366] 0x42258940 (Debug) VOP SUMMARY [s4.meta] Setattr cnt/7838 avg/23+41116 min/10+22 max/1515+525846.
[0720 16:00:00.646376] 0x42258940 (Debug) VOP SUMMARY [s4.meta] Open cnt/321 avg/20+35 min/11+15 max/297+343.

Let's say I call the script with two arguments (the name of the log file and the type of summary I want): ./parser.sh log1 VOP

Because the summary type argument is VOP, the output I should get is:

VOP SUMMARY s1 Setattr 14
VOP SUMMARY s1 Open 7
VOP SUMMARY s1 Close 7
VOP SUMMARY s1 GetResyncAttr 1
VOP SUMMARY s3 Setattr 12
VOP SUMMARY s3 Open 6
VOP SUMMARY s3 Close 6
VOP SUMMARY s3 GetResyncAttr 1
VOP SUMMARY s4 Setattr 7838
VOP SUMMARY s4 Open 321

Hi, try this:

parser.pl

#!/usr/bin/perl

$filename=shift;
$sum_type=shift;
chomp($sum_type,$filename);

open(FH,"<","$filename") || die "cannot open file \n";

while (<FH>) {
    chomp;
    if (/\s$sum_type\sSUMMARY\s\[(\w+)\.\w+\]\s(\w+)\scnt\/(\d+)\s/) {
        #print $_,"\n";
        print "$sum_type SUMMARY $1 $2 $3\n";
    }
}
close(FH);

Run it like this:

perl parser.pl logfilename summary_type

Hey Pravin,

I tried your code, but I don't get any output when I launch the script.
Does it work for you?
Another question: is it the same code if I want to use this as a bash script?

Thanks.

It's running fine at my end.

This is a Perl script.
1) Create a file named "parser.pl".
2) Make it executable (chmod +x parser.pl).
3) Run the script, giving the log file name with its path. In my case the log file is in the /tmp directory and is named testfile.log
(e.g. ./parser.pl /tmp/testfile.log VOP)

Thanks for the help, pravin. I had an error in the script, which is why it did not work. I don't want to take advantage, but could you explain a little of what the script does? I don't understand everything, especially all those symbols you have put in there.

Thanks.

Please post the error as well as the script.

There is no error in the script you provided; I made the mistake when I typed it in. I had written num_type at the beginning and then sum_type in the print, which is why I had no output.
Can you explain what the script does exactly?

Thanks.

Hi,

#!/usr/bin/perl

$filename=shift;   ## first command-line argument (log file name) goes into $filename
$sum_type=shift;   ## second command-line argument (summary type) goes into $sum_type
chomp($sum_type,$filename);  ## chomp() removes a trailing newline from the end of a string, if there is one
open(FH,"<","$filename") || die "cannot open file \n";  ## open the file for reading, or exit the script

while (<FH>) {
    chomp;
    ## match the pattern against the current log file record
    if (/\s$sum_type\sSUMMARY\s\[(\w+)\.\w+\]\s(\w+)\scnt\/(\d+)\s/) {

        # \w  matches a "word" character (alphanumeric plus "_")
        # \W  matches a non-word character
        # \s  matches a whitespace character
        # \S  matches a non-whitespace character
        # \d  matches a digit character
        # \D  matches a non-digit character

        # *   matches 0 or more times
        # +   matches 1 or more times
        # ?   matches 1 or 0 times

        print "$sum_type SUMMARY $1 $2 $3\n";

        # $1 - if the record matches, whatever the first "()" captured (i.e. (\w+), the server name)
        # $2 - if the record matches, whatever the second "()" captured (i.e. (\w+), the action type)
        # $3 - if the record matches, whatever the third "()" captured (i.e. (\d+), the counter)
    }
}
close(FH);

A big thanks to you, pravin. It is really clear now.

Hi Samb95

I can try to explain what pravin has done here.

Command-line arguments are stored in the array @ARGV in Perl.

Each call to shift removes the first element of @ARGV and returns it, so he stores the two arguments in their respective variables.

$filename = shift;
$sum_type = shift;

Might I add, it's good practice to enable the "use strict" and "use warnings" pragmas and to declare variables with the "my" keyword.
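As a small illustration of why this matters: the num_type/sum_type mix-up mentioned earlier in this thread is the kind of mistake "use strict" reports when the code is compiled. The sketch below is only a demonstration. The snippet inside the string is made up for the example, and string eval is used just so the compile error can be caught and printed instead of stopping the script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical snippet containing the typo: the variable is declared
# as $num_type but used later as $sum_type.
my $snippet = 'my $num_type = "VOP"; print "$sum_type SUMMARY\n"; 1;';

# String eval compiles the snippet under the enclosing "use strict",
# so the undeclared $sum_type is reported instead of silently being empty.
my $ok = eval $snippet;

if ($ok) {
    print "snippet compiled and ran\n";
} else {
    print "strict caught the typo: $@";
}
```

Without "use strict", the same snippet would run and print an empty summary type, which is the "no output" symptom described above.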

He then opens the file in read mode ("<") and associates it with the filehandle FH.

He loops through the file using FH. chomp removes the newline at the end of each line read from the file.
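The read loop can be tried in isolation with a sketch like the one below. It is a variation, not pravin's exact code: it uses a lexical filehandle ($fh) instead of FH, and it reads from an in-memory string with made-up text so that it runs without a log file on disk.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Two lines of made-up text stand in for the log file; opening a
# reference to a string lets the sketch run without a file on disk.
my $text = "first line\nsecond line\n";
open(my $fh, '<', \$text) or die "cannot open in-memory text: $!\n";

my @lines;
while (my $line = <$fh>) {
    chomp $line;              # drop the trailing newline
    push @lines, $line;
    print "read: [$line]\n";
}
close($fh);
```

With a real log file, the open line would take the file name instead of \$text; the rest of the loop stays the same.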

Next you encounter the regex. It looks formidable but is simple. (You should have a look at the regex tutorials.)

Since no variable is named, the regex acts on $_, which contains the current input line of the file.

What he is doing is:
1: \s$sum_type\sSUMMARY\s : looking for a whitespace character followed by your keyword, another whitespace character, the word SUMMARY, and one more whitespace character.

2: \[(\w+)\.\w+\]\s(\w+)\scnt\/(\d+)\s : the square brackets are escaped so that they match a literal "[" and "]".
After the opening bracket he looks for one or more word characters (letters, digits, and underscore) = \w+
Notice he has enclosed them in ( ), meaning the matched text will be assigned to the Perl special variable $1.

3: next he has matched a ".". Since "." is a special character, it has been escaped with a backslash. The rest follows the same logic.
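To see the three captures on a concrete record, here is a minimal sketch that applies the same pattern to one VOP line copied from the sample log in this thread. The variable names $line, $server, $action and $count are chosen for the example only; in list context the match returns what would otherwise be $1, $2 and $3.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One VOP line copied from the sample log in this thread.
my $line = '[0720 16:00:00.646329] 0x42258940 (Debug) VOP SUMMARY [s3.meta] Open cnt/6 avg/22+180 min/18+159 max/31+210.';
my $sum_type = 'VOP';

# Same pattern as in the script, matched against $line explicitly
# (with =~) rather than against the default variable $_.
# The list on the left receives the contents of $1, $2 and $3.
my ($server, $action, $count) =
    $line =~ /\s$sum_type\sSUMMARY\s\[(\w+)\.\w+\]\s(\w+)\scnt\/(\d+)\s/;

if (defined $server) {
    print "server  (\$1): $server\n";   # text between "[" and the dot
    print "action  (\$2): $action\n";   # word after the closing "]"
    print "counter (\$3): $count\n";    # digits after "cnt/"
} else {
    print "no match\n";
}
```

For this line the captures are s3, Open and 6, which corresponds to the expected output line "VOP SUMMARY s3 Open 6".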

Hope this helps.


Just to be safe:

Whenever you read command-line arguments, your program should check that the user has entered the correct number of arguments. If not, report an error and quit the program.
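One possible way to do that check is sketched below. The helper name check_args and the wording of the usage message are assumptions made for this example, not part of pravin's script. So that the sketch runs on its own, it tries two made-up argument lists instead of reading @ARGV.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# check_args is a hypothetical helper, not part of the original script.
# It returns (log file name, summary type) or dies with a usage message.
sub check_args {
    my @args = @_;
    die "Usage: $0 logfile summary_type\n" unless @args == 2;
    return @args;
}

# In the real script this would be:
#   my ($filename, $sum_type) = check_args(@ARGV);
# Here two made-up argument lists are tried instead.
for my $try ( [ 'log1', 'VOP' ], [ 'log1' ] ) {
    my ($filename, $sum_type) = eval { check_args(@$try) };
    if ($@) {
        print "rejected (@$try): $@";
    } else {
        print "accepted: file=$filename type=$sum_type\n";
    }
}
```

Calling die with a message that ends in a newline prints just the usage text, without the "at ... line ..." suffix, and exits the script with a non-zero status when it is not wrapped in eval.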


Thanks for the advice and explanation, abhijithtk. I will add them to the code.