awk script to detect specific string in a log file and count it

Hello, can someone guide me on this?

I don't know which approach is best (an awk script or a shell script).

I am using Red Hat Linux version 6.5. There is a third-party application deployed on that server. By default, this app generates 5 log files of 20MB each. These logs roll over automatically, so at any given time there are always 5 log files, like below:

appname.log
appname.log.1
appname.log.2
appname.log.3
appname.log.4

When the app runs, it creates the following entries in the log file. For simplicity, I have removed the other information.

The requirement is: whenever the string RESTful/BR4/Dispatchers/Drop Down Dispatch Process.process/Log appears in the log file, count it. At the end of the day, we want to know how many times that entry appeared in the log file on, say, Oct 20, 2018. Basically, we need to collect statistics about this app every day. The script should run automatically and write the stats to a file like this:

appname date count

BW.AMI_Services-Dropdown-AMI_Services-Dropdown-ProcessArchive Oct 20, 2018 3

-----log files entries------

2017 Oct 20 13:44:03:359 GMT -0600 BW.AMI_Services-Dropdown-AMI_Services-Dropdown-ProcessArchive User [BW-User] - Job-20540 [Services - RESTful/BR4/Dispatchers/Drop Down Dispatch Process.process/Log]:
 
2017 Oct 20 13:45:02:170 GMT -0600 BW.AMI_Services-Dropdown-AMI_Services-Dropdown-ProcessArchive User [BW-User] - Job-20546 [Services - RESTful/BR4/Dispatchers/Drop Down Dispatch Process.process/Log]:
 
2017 Oct 20 13:45:10:605 GMT -0600 BW.AMI_Services-Dropdown-AMI_Services-Dropdown-ProcessArchive User [BW-User] - Job-20547 [Services - RESTful/BR4/Dispatchers/Drop Down Dispatch Process.process/Log]:

Hello ktisbest,

Welcome to the forums, and thank you for providing the details (though your question still leaves some doubts). Please use code tags for the input file, samples, output, code, and commands in your posts. Could you please try the following and let me know if it helps?

EDIT: Apologies to the forum, I had misread the post and posted a wrong answer, so I am adding an answer that could be taken as a starting point. THANKS to Scrutinizer for letting me know.

awk -v day="$(date +%d)" -v month="$(date +%b)" -v year="$(date +%Y)" '$1 == year && $2 == month && $3 == day && /Services - RESTful\/BR4\/Dispatchers\/Drop Down Dispatch Process\.process\/Log/ {count++} END {print count+0}' Input_file

@Ktisbest: Please reply to Scrutinizer's questions and do let us know.

Thanks,
R. Singh

@Ravinder, I am unsure how this is supposed to work.
The OP is looking for a way to get a daily count of log occurrences in a rotating set of log files into a new log file.

  • What you propose has the date fixed in both the search string and the print result. So should he create a new script every day?
  • The closing braces in the FNR==1 section do not align with the if statements, which is confusing.
  • I do not see why you need to close files when there are at most 5.
  • Also, he does not appear to care in which file exactly the data occurred, only how many times, so I do not see why a total needs to be printed per file. Why not create only an END section?
  • Also, the END section is missing: if there are matches in the last file, their total will not get printed.
  • Why do you also include the filename in the result?
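
To illustrate the points above, here is a minimal sketch with no fixed date and a single END section. It assumes the layout of the sample lines (fields 1 to 3 are year, month and day, field 7 is the application name); the function name count_per_day is made up for this example.

```shell
# Sketch: count matching entries per application and per date across all
# log files given as arguments. Assumes fields 1-3 are year, month, day and
# field 7 is the application name, as in the sample lines.
count_per_day() {
    awk -v pat='RESTful/BR4/Dispatchers/Drop Down Dispatch Process.process/Log' '
        index($0, pat) {                      # literal substring match, no regex escaping needed
            key = $7 " " $1 " " $2 " " $3     # appname year month day
            if (!(key in count)) order[++n] = key   # remember first-seen order
            count[key]++
        }
        END {
            # one total per appname and date, regardless of which file it came from
            for (i = 1; i <= n; i++) print order[i], count[order[i]]
        }
    ' "$@"
}
```

It could be called as count_per_day appname.log.4 appname.log.3 appname.log.2 appname.log.1 appname.log, which prints one line per application and date with the count at the end.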

--

@OP:

  • A solution like that will only work if the logs rotate fewer than 5 times per day, so maybe a solution should test for this?
  • You want the result for 2018; I assume that is a typo?
  • Is the date format change really necessary, or can it be any date format?
  • The script would need to record or discover only the differences since the last time it was run, to avoid double or missing entries in the statistics log.

A simple question: is there a timestamp on each line of the logfile, or at least on the lines of interest? That would eliminate the issues Scrutinizer mentioned, I think.
e.g.,

10/28/2017 14:52:13

First of all thank you for looking into this. Apologies if I was not clear in my post and I will try to follow the forum rules.

Here is the simplified version of the requirements

  1. In real time, a script should monitor the appname.log file (no need to monitor .log.1, .log.2, .log.3 or .log.4).
  2. When a new line in the log file matches RESTful/BR4/Dispatchers/Drop Down Dispatch Process.process/Log, add the appname, the date (any format) and the count to a new file (appnamestats.txt).
  3. For the next entry on the same day, we already have the appname and date in appnamestats.txt, so we just need to increment the count. The same appnamestats.txt file can be used for the next day and so on.

There is a typo in the date: it should be the current year, 2017, and not 2018.

The date can be in any format, for example 2017 Oct 20.

There will always be a complete timestamp like the one below in the lines of interest:
2017 Oct 20 13:44:03:359 GMT -0600
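
As a rough sketch of these requirements, not a finished solution: the function below reads log lines on standard input and keeps a stats file with lines of the form "appname year month day count". It assumes field 7 is the application name, as in the sample lines, and it rewrites the whole stats file on every match, which is not atomic. The function name update_stats is invented for this example.

```shell
# Sketch: read log lines on stdin and keep per-day counts in a stats file.
# Each stats line is: appname year month day count
update_stats() {
    stats=$1
    awk -v stats="$stats" \
        -v pat='RESTful/BR4/Dispatchers/Drop Down Dispatch Process.process/Log' '
        BEGIN {
            # Load counts from an existing stats file, if there is one
            while ((getline line < stats) > 0) {
                if (split(line, f, " ") == 5) {
                    key = f[1] " " f[2] " " f[3] " " f[4]
                    if (!(key in count)) order[++k] = key
                    count[key] = f[5]
                }
            }
            close(stats)
        }
        index($0, pat) {
            key = $7 " " $1 " " $2 " " $3
            if (!(key in count)) order[++k] = key   # first entry for this day
            count[key]++                            # later entries only increment
            # Rewrite the stats file so it is current after every match
            for (i = 1; i <= k; i++) print order[i], count[order[i]] > stats
            close(stats)
        }
    '
}
```

With GNU tail, something like tail -n 0 -F appname.log | update_stats appnamestats.txt would follow the file across rotations; entries written while the pipeline is not running would be missed.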

By the way, I ran the provided solution and it only returned 16. There are 288 entries for that string in the log file, so it should return 288.

Why and how, then, does BW.AMI_Services-Dropdown-AMI_Services-Dropdown-ProcessArchive come into play?