Script (ksh) to get data in 30-minute intervals for a given date

Hello,

Since I'm new to shell scripting, I had a hard time sorting out this problem.

I have a log file from a utility that records, with a timestamp, when batch files complete successfully. Below is an excerpt from the log file.

2013/03/07 00:13:50 [notice] Apache/1.3.29 (Unix) configured -- resuming normal operations
2013/03/07 00:23:50 [info] Server built: Feb 27 2004 13:56:37
2013/03/07 00:39:40 [notice] Accept mutex: sysvsem (Default: sysvsem)
2013/03/07 01:05:49 [info] [client 64.242.88.10] Batch 126 was successful
2013/03/07 01:15:19 [info] [client 64.242.88.10] Batch 272 was successful
2013/03/07 01:55:29 [info] [client 64.242.88.10] Batch 353 was successful
2013/03/07 02:05:09 [info] [client 64.242.88.10] Batch 241 was successful
2013/03/07 02:35:49 statistics: Use of uninitialized value in concatenation (.) or string at /home/httpd/twiki/lib/TWiki.pm line 528.
2013/03/07 04:45:41 statistics: Can't create file /home/httpd/twiki/data/Main/WebStatistics.txt - Permission denied
2013/03/07 05:05:49 [info] [client 64.242.88.10] Batch 671 was successful
2013/03/07 05:46:41 [info] [client 64.242.88.10] Batch 251 was successful
2013/03/07 06:35:26 [info] [client 64.242.88.10] Batch 181 was successful
2013/03/07 08:05:49 [info] [client 64.242.88.10] Batch 389 was successful
2013/03/07 10:05:29 [info] [client 64.242.88.10] Batch 911 was successful
2013/03/07 10:13:42 [info] [client 64.242.88.10] Batch 681 was successful
2013/03/07 10:45:33 [info] [client 64.242.88.10] Batch 451 was successful
2013/03/07 10:49:51 [info] [client 64.242.88.10] Batch 675 was successful
2013/03/07 11:05:29 [info] [client 64.242.88.10] Batch 439 was successful
2013/03/07 12:55:19 [info] [client 64.242.88.10] Batch 678 was successful
2013/03/07 13:05:33 [info] [client 64.242.88.10] Batch 557 was successful
2013/03/07 13:47:12 [info] [client 64.242.88.10] Batch 881 was successful
2013/03/07 14:09:16 [info] [client 64.242.88.10] Batch 115 was successful
2013/03/07 14:15:31 [info] [client 64.242.88.10] Batch 612 was successful
2013/03/07 14:29:19 [info] [client 64.242.88.10] Batch 111 was successful
2013/03/07 14:35:50 [info] [client 64.242.88.10] Batch 971 was successful
2013/03/07 14:57:49 [info] [client 64.242.88.10] Batch 347 was successful
2013/03/07 15:19:55 [info] [client 64.242.88.10] Batch 824 was successful
2013/03/07 15:28:51 [info] [client 64.242.88.10] Batch 908 was successful
2013/03/07 15:31:44 [info] [client 64.242.88.10] Batch 113 was successful
2013/03/07 15:47:41 [info] [client 64.242.88.10] Batch 990 was successful
2013/03/07 15:57:41 [info] [client 64.242.88.10] Batch 290 was successful
2013/03/07 16:05:49 [info] [client 64.242.88.10] Batch 120 was successful
2013/03/07 16:22:18 [error] [client 24.70.56.49] File does not exist: /home/httpd/twiki/view/Main/WebHome
2013/03/07 16:25:39 [info] [client 64.242.88.10] Batch 150 was successful
2013/03/07 16:29:49 [info] [client 64.242.88.10] Batch 145 was successful
2013/03/07 16:47:41 [info] [client 64.242.88.10] Batch 131 was successful
2013/03/07 16:51:19 [info] [client 64.242.88.10] Batch 481 was successful
2013/03/07 17:01:42 [info] [client 64.242.88.10] Batch 676 was successful
2013/03/07 17:23:14 [info] [client 64.242.88.10] Batch 121 was successful
2013/03/07 17:37:12 [info] [client 64.242.88.10] Batch 439 was successful
2013/03/07 17:43:39 [info] [client 64.242.88.10] Batch 336 was successful
2013/03/07 18:21:42 [info] [client 64.242.88.10] Batch 772 was successful
2013/03/07 18:25:11 [info] [client 64.242.88.10] Batch 154 was successful
2013/03/07 18:26:26 [info] [client 64.242.88.10] Batch 189 was successful
2013/03/07 19:01:09 [info] [client 64.242.88.10] Batch 346 was successful
2013/03/07 19:11:28 [info] [client 64.242.88.10] Batch 678 was successful
2013/03/07 19:19:29 [info] [client 64.242.88.10] Batch 814 was successful
2013/03/07 19:31:41 [info] [client 64.242.88.10] Batch 114 was successful
2013/03/07 19:33:16 [info] [client 64.242.88.10] Batch 561 was successful
2013/03/07 19:41:19 [info] [client 64.242.88.10] Batch 881 was successful
2013/03/07 19:53:17 [info] [client 64.242.88.10] Batch 445 was successful
2013/03/07 19:57:49 [info] [client 64.242.88.10] Batch 321 was successful
2013/03/07 20:01:56 [info] [client 64.242.88.10] Batch 890 was successful
2013/03/07 21:11:17 [info] [client 64.242.88.10] Batch 665 was successful
2013/03/07 21:19:55 [info] [client 64.242.88.10] Batch 340 was successful
2013/03/07 21:29:29 [info] [client 64.242.88.10] Batch 149 was successful
2013/03/07 21:31:49 [info] [client 64.242.88.10] Batch 213 was successful
2013/03/07 21:43:19 [info] [client 64.242.88.10] Batch 522 was successful
2013/03/07 21:49:46 [info] [client 64.242.88.10] Batch 450 was successful
2013/03/07 22:37:31 [info] [client 64.242.88.10] Batch 661 was successful
2013/03/07 22:39:11 [info] [client 64.242.88.10] Batch 542 was successful
2013/03/07 23:05:49 [info] [client 64.242.88.10] Batch 598 was successful
2013/03/07 23:41:14 [info] [client 64.242.88.10] Batch 811 was successful
2013/03/07 24:12:42 [info] [client 64.242.88.10] Batch 429 was successful
2013/03/07 24:22:09 [info] [client 64.242.88.10] Batch 238 was successful
2013/03/07 24:29:01 [info] [client 64.242.88.10] Batch 987 was successful
2013/03/07 24:44:43 [info] [client 64.242.88.10] Batch 144 was successful
.
.
.
2013/03/08 ------------------------------
2013/03/09 ------------------------------
2013/03/10 ------------------------------

Kindly help me with a ksh script to get data from this log file in 30-minute intervals, i.e., for a given date I need to know how many batches were successful in each 30-minute interval.

Desired output:

Enter the date:
2013/03/07

Batches that were successful on 2013/03/07 between 00:00:00 and 00:29:59 : 4
Batches that were successful on 2013/03/07 between 00:30:00 and 00:59:59 : 2
.
.
.
.
.
Batches that were successful on 2013/03/07 between 23:29:59 and 23:59:59 :4
#!/bin/ksh
RECIPIENTS="your email"
while true
do
        count=$(grep -c "was successful" filename)
        Date=$(awk 'NR == 1 {print $1}' filename)
        start_time=$(awk 'NR == 1 {print $2}' filename)
        End_time=$(awk 'END {print $2}' filename)
        echo "Batches that were successful on $Date between $start_time and $End_time : $count" > file.txt
        mailx -s "Current status of successful batches" "$RECIPIENTS" < file.txt
        sleep 1800
done

Run the script in the background (nohup ksh scriptname &) and it will run every 30 minutes and send you an email.

Could you please check the timestamps in your file?

2013/03/07 24:12:42 [info] [client 64.242.88.10] Batch 429 was successful
2013/03/07 24:22:09 [info] [client 64.242.88.10] Batch 238 was successful
2013/03/07 24:29:01 [info] [client 64.242.88.10] Batch 987 was successful
2013/03/07 24:44:43 [info] [client 64.242.88.10] Batch 144 was successful

I think hour 24 is invalid, since your start hour is 0.


I agree with pravin27 that your input data has out-of-range timestamps. It is also odd that you want the final entry in your output to have timestamps 23:29:59 and 23:59:59 rather than 23:30:00 and 23:59:59. But, following your general pattern and ignoring the out-of-range data, the following script seems to do what you want:

#!/bin/ksh
printf "Enter the date (YYYY/MM/DD): "
read dd
awk -F '[ :]' -v dd="$dd" '
BEGIN { fmt = "Batches that were successful on %s between %02d:%02d:00 " \
                "and %02d:%02d:59 : %d\n"
}
$1 == dd && $NF == "successful" { s[$2 + 0, $3 > 29]++ }
END {   for(h = 0; h < 24; h++)
                 for(m = 0; m < 2; m++)
                        printf(fmt, dd, h, m * 30, h, m * 30 + 29, s[h, m])
}' log

producing the output:

Enter the date (YYYY/MM/DD): 2013/03/07
Batches that were successful on 2013/03/07 between 00:00:00 and 00:29:59 : 0
Batches that were successful on 2013/03/07 between 00:30:00 and 00:59:59 : 0
Batches that were successful on 2013/03/07 between 01:00:00 and 01:29:59 : 2
Batches that were successful on 2013/03/07 between 01:30:00 and 01:59:59 : 1
Batches that were successful on 2013/03/07 between 02:00:00 and 02:29:59 : 1
Batches that were successful on 2013/03/07 between 02:30:00 and 02:59:59 : 0
Batches that were successful on 2013/03/07 between 03:00:00 and 03:29:59 : 0
Batches that were successful on 2013/03/07 between 03:30:00 and 03:59:59 : 0
Batches that were successful on 2013/03/07 between 04:00:00 and 04:29:59 : 0
Batches that were successful on 2013/03/07 between 04:30:00 and 04:59:59 : 0
Batches that were successful on 2013/03/07 between 05:00:00 and 05:29:59 : 1
Batches that were successful on 2013/03/07 between 05:30:00 and 05:59:59 : 1
Batches that were successful on 2013/03/07 between 06:00:00 and 06:29:59 : 0
Batches that were successful on 2013/03/07 between 06:30:00 and 06:59:59 : 1
Batches that were successful on 2013/03/07 between 07:00:00 and 07:29:59 : 0
Batches that were successful on 2013/03/07 between 07:30:00 and 07:59:59 : 0
Batches that were successful on 2013/03/07 between 08:00:00 and 08:29:59 : 1
Batches that were successful on 2013/03/07 between 08:30:00 and 08:59:59 : 0
Batches that were successful on 2013/03/07 between 09:00:00 and 09:29:59 : 0
Batches that were successful on 2013/03/07 between 09:30:00 and 09:59:59 : 0
Batches that were successful on 2013/03/07 between 10:00:00 and 10:29:59 : 2
Batches that were successful on 2013/03/07 between 10:30:00 and 10:59:59 : 2
Batches that were successful on 2013/03/07 between 11:00:00 and 11:29:59 : 1
Batches that were successful on 2013/03/07 between 11:30:00 and 11:59:59 : 0
Batches that were successful on 2013/03/07 between 12:00:00 and 12:29:59 : 0
Batches that were successful on 2013/03/07 between 12:30:00 and 12:59:59 : 1
Batches that were successful on 2013/03/07 between 13:00:00 and 13:29:59 : 1
Batches that were successful on 2013/03/07 between 13:30:00 and 13:59:59 : 1
Batches that were successful on 2013/03/07 between 14:00:00 and 14:29:59 : 3
Batches that were successful on 2013/03/07 between 14:30:00 and 14:59:59 : 2
Batches that were successful on 2013/03/07 between 15:00:00 and 15:29:59 : 2
Batches that were successful on 2013/03/07 between 15:30:00 and 15:59:59 : 3
Batches that were successful on 2013/03/07 between 16:00:00 and 16:29:59 : 3
Batches that were successful on 2013/03/07 between 16:30:00 and 16:59:59 : 2
Batches that were successful on 2013/03/07 between 17:00:00 and 17:29:59 : 2
Batches that were successful on 2013/03/07 between 17:30:00 and 17:59:59 : 2
Batches that were successful on 2013/03/07 between 18:00:00 and 18:29:59 : 3
Batches that were successful on 2013/03/07 between 18:30:00 and 18:59:59 : 0
Batches that were successful on 2013/03/07 between 19:00:00 and 19:29:59 : 3
Batches that were successful on 2013/03/07 between 19:30:00 and 19:59:59 : 5
Batches that were successful on 2013/03/07 between 20:00:00 and 20:29:59 : 1
Batches that were successful on 2013/03/07 between 20:30:00 and 20:59:59 : 0
Batches that were successful on 2013/03/07 between 21:00:00 and 21:29:59 : 3
Batches that were successful on 2013/03/07 between 21:30:00 and 21:59:59 : 3
Batches that were successful on 2013/03/07 between 22:00:00 and 22:29:59 : 0
Batches that were successful on 2013/03/07 between 22:30:00 and 22:59:59 : 2
Batches that were successful on 2013/03/07 between 23:00:00 and 23:29:59 : 1
Batches that were successful on 2013/03/07 between 23:30:00 and 23:59:59 : 1

when the user types 2013/03/07 in response to the prompt for the date and the file named log contains your sample input data.


Thanks a lot for your solution! My apologies for the typo.

Can you please explain the code so that I can build on it, for example if I want to match different patterns or strings?

#!/bin/ksh
printf "Enter the date (YYYY/MM/DD): "
read dd
awk -F '[ :]' -v dd="$dd" '
BEGIN { fmt = "Batches that were successful on %s between %02d:%02d:00 " \
                "and %02d:%02d:59 : %d\n"
}
$1 == dd && $NF == "successful" { s[$2 + 0, $3 > 29]++ }
END {   for(h = 0; h < 24; h++)
                 for(m = 0; m < 2; m++)
                        printf(fmt, dd, h, m * 30, h, m * 30 + 29, s[h, m])
}' log

1st line: Use the Korn shell to interpret this script
2nd line: Print a prompt asking the user to input a date.
3rd line: Read the date the user enters and save it in the shell variable named dd.
4th line: Invoke the awk utility telling it to use space and colon characters as field delimiters and define the awk variable dd to have the same value as the shell variable dd.
5th, 6th, and 7th lines: Before any input is read by the awk script, define the awk variable fmt to be the format string we will use to print results in the end.
8th line: If the 1st field on an input line is the same string as the dd awk variable and the last field on the line is the string "successful", increment the element of the array s[] indexed by two subscripts: the 2nd field on the line (the hour) with any leading zero removed, and 0 if the 3rd field (the minute) is less than or equal to 29 or 1 if the minute is 30 or larger.
9th, 10th, and 11th lines: After all input lines have been read, loop through every half hour with h set to the hour of the day and m set to the half hour within the hour, and print the user-entered date (dd), the hour as a two-digit string with leading-zero fill, the starting minute as a two-digit string with leading-zero fill, the hour again, the ending minute, and the number of times the patterns matched on the 8th line were found on input lines within that half-hour period.
12th line: Terminate the awk script and specify that the input to be processed is contained in a file named log.


Great, thanks for your explanation!

I was trying out this piece of code below

awk -v rd="2013/03/07" -v st="00:00:00" -v et="00:29:59" '$1==rd && $2>=st && $2<=et' log | grep -c "successful"

I wanted to make the script generic so that wherever the pattern "successful" appears on a line it is counted, since I want to use the same script for two more different logs in which the "successful" message can appear anywhere on the line, not necessarily in the last field ($NF == "successful").

But the problem I was facing was that I could not increment the timestamp values "st" and "et" by 30 minutes, and I could not work out the logic to repeat that piece of code for every 30-minute window.

So can you please suggest a solution for that?
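
For what it's worth, the 30-minute increments can be generated in the shell itself. This is only a sketch of that approach (st and et here would be fed to the awk one-liner above; as discussed later in this thread, a single awk pass over the log is far more efficient):

```shell
#!/bin/ksh
# Print the start and end timestamps of each of the 48 half-hour
# windows in a day; each pair could be passed as st/et to the one-liner.
h=0
while [ "$h" -lt 24 ]
do
        st=$(printf "%02d:00:00" "$h")
        et=$(printf "%02d:29:59" "$h")
        echo "$st $et"
        st=$(printf "%02d:30:00" "$h")
        et=$(printf "%02d:59:59" "$h")
        echo "$st $et"
        h=$((h + 1))
done
```

The loop prints 48 lines, from "00:00:00 00:29:59" through "23:30:00 23:59:59".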

Assuming you still want the same output format, I would just change the:

$1 == dd && $NF == "successful" { s[$2 + 0, $3 > 29]++ }

in the script I gave you to:

$1 == dd && /successful/ { s[$2 + 0, $3 > 29]++ }

Adding the boilerplate around the piece of code you suggested to produce the same output (calling awk 48 times, calling grep 48 times, and reading your log file 48 times) instead of calling awk once and reading your log file once seems extremely wasteful.

If you really want to process your log file 48 times instead of once, I can come up with a way to do that, but I'll need some explanation as to why doing that would be better than the trivial change to my script that is shown above.


Thanks again!

I do not have any specific reason to go with that approach (awk -v rd="2013/03/07" -v st="00:00:00" -v et="00:29:59" '$1==rd && $2>=st && $2<=et' log | grep -c "successful"); I just wanted to let you know that I was trying it that way. Thanks for the feedback on that as well.

It's good for me to go with your solution.

But I need one more piece of functionality added to this: the sum of all the successful batches at the end, as shown in the sample output below.

 
Enter Date: 2013/03/07
 
Batches that were successful  2013/03/07 between 00:00:00 and 00:29:59: 0
Batches that were successful  2013/03/07 between 00:30:00 and 00:59:59: 0
Batches that were successful  2013/03/07 between 01:00:00 and 01:29:59: 0
Batches that were successful  2013/03/07 between 01:30:00 and 01:59:59: 0
Batches that were successful  2013/03/07 between 02:00:00 and 02:29:59: 0
Batches that were successful  2013/03/07 between 02:30:00 and 02:59:59: 0
Batches that were successful  2013/03/07 between 03:00:00 and 03:29:59: 0
Batches that were successful  2013/03/07 between 03:30:00 and 03:59:59: 0
Batches that were successful  2013/03/07 between 04:00:00 and 04:29:59: 0
Batches that were successful  2013/03/07 between 04:30:00 and 04:59:59: 0
Batches that were successful  2013/03/07 between 05:00:00 and 05:29:59: 0
Batches that were successful  2013/03/07 between 05:30:00 and 05:59:59: 0
Batches that were successful  2013/03/07 between 06:00:00 and 06:29:59: 0
Batches that were successful  2013/03/07 between 06:30:00 and 06:59:59: 0
Batches that were successful  2013/03/07 between 07:00:00 and 07:29:59: 0
Batches that were successful  2013/03/07 between 07:30:00 and 07:59:59: 0
Batches that were successful  2013/03/07 between 08:00:00 and 08:29:59: 0
Batches that were successful  2013/03/07 between 08:30:00 and 08:59:59: 0
Batches that were successful  2013/03/07 between 09:00:00 and 09:29:59: 0
Batches that were successful  2013/03/07 between 09:30:00 and 09:59:59: 0
Batches that were successful  2013/03/07 between 10:00:00 and 10:29:59: 0
Batches that were successful  2013/03/07 between 10:30:00 and 10:59:59: 0
Batches that were successful  2013/03/07 between 11:00:00 and 11:29:59: 0
Batches that were successful  2013/03/07 between 11:30:00 and 11:59:59: 0
Batches that were successful  2013/03/07 between 12:00:00 and 12:29:59: 0
Batches that were successful  2013/03/07 between 12:30:00 and 12:59:59: 0
Batches that were successful  2013/03/07 between 13:00:00 and 13:29:59: 0
Batches that were successful  2013/03/07 between 13:30:00 and 13:59:59: 0
Batches that were successful  2013/03/07 between 14:00:00 and 14:29:59: 0
Batches that were successful  2013/03/07 between 14:30:00 and 14:59:59: 0
Batches that were successful  2013/03/07 between 15:00:00 and 15:29:59: 0
Batches that were successful  2013/03/07 between 15:30:00 and 15:59:59: 0
Batches that were successful  2013/03/07 between 16:00:00 and 16:29:59: 0
Batches that were successful  2013/03/07 between 16:30:00 and 16:59:59: 0
Batches that were successful  2013/03/07 between 17:00:00 and 17:29:59: 0
Batches that were successful  2013/03/07 between 17:30:00 and 17:59:59: 0
Batches that were successful  2013/03/07 between 18:00:00 and 18:29:59: 0
Batches that were successful  2013/03/07 between 18:30:00 and 18:59:59: 0
Batches that were successful  2013/03/07 between 19:00:00 and 19:29:59: 0
Batches that were successful  2013/03/07 between 19:30:00 and 19:59:59: 0
Batches that were successful  2013/03/07 between 20:00:00 and 20:29:59: 0
Batches that were successful  2013/03/07 between 20:30:00 and 20:59:59: 0
Batches that were successful  2013/03/07 between 21:00:00 and 21:29:59: 0
Batches that were successful  2013/03/07 between 21:30:00 and 21:59:59: 37
Batches that were successful  2013/03/07 between 22:00:00 and 22:29:59: 55
Batches that were successful  2013/03/07 between 22:30:00 and 22:59:59: 118
Batches that were successful  2013/03/07 between 23:00:00 and 23:29:59: 65
Batches that were successful  2013/03/07 between 23:30:00 and 23:59:59: 49
----------------------------------------------------------------
Total Batches that were successful  on 2013/03/07: 324
----------------------------------------------------------------

With the scripts I have given you and the explanation of what one of the scripts does, can you tell us where changes need to be made to the script to add the additional three lines of output at the end?

How close can you come to getting the script to produce the output you want?


This is how far I could take it forward:

 
#!/bin/ksh
printf "Enter the date (YYYY/MM/DD): "
read dd
awk -F '[ :]' -v dd="$dd" '
BEGIN { fmt = "Batches that were successful on %s between %02d:%02d:00 " \ 
"and %02d:%02d:59 : %d\n"
}
$1 == dd && /successful/ { s[$2 + 0, $3 > 29]++ }
END {   for(h = 0; h < 24; h++)
for(m = 0; m < 2; m++)
printf(fmt, dd, h, m * 30, h, m * 30 + 29, s[h, m])
}' log >> report.txt
 
awk -F ":" -v dd="$dd" '{x += $NF} END {print "Total Batched That Were Successful on $dd :" x}' report.txt

Please get into the habit of using indentation to show the structure of your code. It makes it a lot easier for humans to see what you're trying to do even if the awk interpreter (and the Korn shell) don't much care about spacing in the scripts they process (as long as reserved words are separated as required by their lexical analyzers).

I guess I confused you by separating the format string used by printf() to format its output from the printf() statement itself. Please look at your system's man page for the awk utility and look closely at its description of the awk printf() function.

As had been mentioned before, $dd in awk is not a reference to an awk variable named dd, it is a reference to the contents of the field specified by the numeric value of the contents of the awk variable dd. References to variables in the awk programming language and references to variables in the shell programming language are different. So, $dd (when dd contains 2013/03/07) is not valid; but, even if it was, $dd inside quotes in the format string will print the literal string $dd not the contents of the dd'th field in the current input line (which has no defined meaning in the END clause).
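
A quick throwaway one-liner (not part of the solution) makes the difference easy to see: with the awk variable dd set to the number 2, dd prints the variable's value while $dd prints the contents of field number 2.

```shell
# dd is an awk variable holding the number 2.
# "dd" prints the variable's value; "$dd" prints field number 2.
echo "alpha beta gamma" | awk -v dd=2 '{ print dd, $dd }'
# prints: 2 beta
```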

Try the following:

#!/bin/ksh
printf "Enter the date (YYYY/MM/DD): "
read dd
awk -F '[ :]' -v dd="$dd" '
$1 == dd && /successful/ {
        s[$2 + 0, $3 > 29]++
        t++
}
END {   for(h = 0; h < 24; h++)
                for(m = 0; m <= 1; m++)
                        printf("Batches that were successful on %s between %02d:%02d:00 and %02d:%02d:59 : %d\n",
                                dd, h, m * 30, h, m * 30 + 29, s[h, m])
        dl =  "----------------------------------------------------------------"
        printf("%s\nTotal Batches that were successful on %s: %d\n%s\n",
                dl, dd, t, dl)
}' log

I would normally split that format string into two parts (so my script would be easily readable on an 80-column output device):

                        printf("Batches that were successful on %s between " \
                                "%02d:%02d:00 and %02d:%02d:59 : %d\n",
                                dd, h, m * 30, h, m * 30 + 29, s[h, m])

but I didn't want to confuse you by splitting the format string into two parts.


Great, thanks for your time, comments, and solution!