Hello,
I have about 800 log files, and I'm trying to write a script that counts the lines per second (i.e., "requests per second") in each log file. For each file, the output should include the timestamp of the busiest second, the log file name, and the highest per-second count from that file. The script should then go on to the next log file, do the same thing, and append the results to the same output file.
I'm currently using this command to get the results from each log, but running it by hand for 800 log files is not practical:
grep "2017-16-02" logfile1 | cut -c2-18 | uniq -c
This command reports a list of line counts per 10 seconds, but it's not very efficient.
So the output file should look like this:
Date/time "1st LogFileName" "highest requests per second for this log"
Date/time "2nd LogFileName" "highest requests per second for this log"
You have given us no indication of where the Date/time is in an input line in your log files. You have given us no indication of where the "requests per second for this log" appears in an input line in your log files, nor what a line in your input log files represents. Your command line seems to only be looking for entries that occur on a certain date, but your description says nothing about looking for a specific date. You talk about maximum requests/second, but nothing in your pipeline seems to be making any attempt to find a maximum value in any of the individual input files nor in the combined aggregation of input files.
I could make lots of guesses about what you might be trying to do and what your data format(s) is(are), but it would be MUCH better if you would clearly describe your input file format(s), show us a couple of sample input files, and show us the exact outputs that should be produced from those sample inputs.
Sorry for not posting some of the log files. A sample is below. What I'm looking for is the highest requests per second (mostly this will be lines per second) for each log file, with the output appended to a file.
The output needs to include the log file name and the highest requests-per-second count in that log file.
I am very disappointed in your response. You said:
which makes it sound like your script gives you the results you want, but is too slow to use to process 800 log files. But, with your sample input, it produces no output. If we convert the date in the grep in your code:
grep "2017-16-02" logfile1 | cut -c2-18 | uniq -c
which is in the format YYYY-DD-MM to the date found in your sample file:
2017-02-15 17:49:06 ... ... ...
which is in the format YYYY-MM-DD, we get the output:
5 017-02-15 17:49:0
which, in addition to truncating the year, also truncates the time, so it will give you counts for ten-second intervals instead of one-second intervals. It also makes no attempt to find the maximum count and makes no attempt to include the filename in the output.
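To make those gaps concrete, here is a minimal sketch of one way they might be closed. It is a guess at the requirements, not a confirmed solution: it assumes every record starts with a timestamp in the form YYYY-MM-DD HH:MM:SS (as in the sample line above), that lines without such a prefix should be ignored, and that the maximum is wanted across all dates in a file. The function name max_per_second is made up for illustration.

```shell
#!/bin/sh
# Sketch: report the busiest one-second interval in each log file.
# Assumption: records begin with "YYYY-MM-DD HH:MM:SS" (19 characters);
# lines that do not start with such a timestamp are skipped.

max_per_second() {
    # For each file argument, print:  timestamp "filename" count
    for f in "$@"; do
        awk -v name="$f" '
            /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9]/ {
                ts = substr($0, 1, 19)   # use 18 here for ten-second buckets
                n = ++count[ts]          # running count for this interval
                if (n > max) { max = n; best = ts }
            }
            END {
                # Print nothing for files with no timestamped lines.
                if (max > 0) printf "%s \"%s\" %d\n", best, name, max
            }
        ' "$f"
    done
}

# Example usage (hypothetical file names):
# max_per_second logfile* >> summary.txt
```

Counting inside awk avoids relying on the input being sorted, which uniq -c needs, and it handles any number of files in one call. If two intervals tie for the maximum, this sketch reports the one that reached that count first.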
Are you looking for counts for each one-second interval or for ten-second intervals?
Are you looking for counts only on a specific date, or are you looking for the maximum count in each file no matter what the date might be for that count?
You have not answered most of the questions I asked in post #2 in this thread:
You have given us no indication of where the "requests per second for this log" appears in an input line in your log files, nor what a line in your input log files represents.
Your command line seems to only be looking for entries that occur on a certain date, but your description says nothing about looking for a specific date.
You talk about maximum requests/second, but nothing in your pipeline seems to be making any attempt to find a maximum value in any of the individual input files nor in the combined aggregation of input files.
Please clearly describe your input file format(s). (What you have shown us seems to be a bunch of random text that is chopped into lines that are a little less than 150 characters per line, but no guaranteed way to detect the start of a record.)
Show us a couple of sample input files.
And, show us the exact outputs that should be produced from those sample inputs.
If you can't provide a clear specification of what it is that needs to be done, it will be very hard for us to help you find a solution to your problem!
Sorry to disappoint you, my friend.
Getting the counts for every 10-second interval is fine. The issue is that I want this to be done by a script and not manually, one file at a time. My apologies if I'm not explaining it well.
Repeatedly saying that you want to perform some unspecified task instead of doing it manually is getting us nowhere. Please clearly answer the questions I asked in post #4, or I will close this thread.