Suggestions for Inserting Missing Timestamps

I have Apache logs that I'm trying to create some metrics from. I've stripped the files down to the number of requests per second. Example:

03:04:01 3 (hour:minute:second, number of requests)
03:04:03 2
03:04:04 1
03:04:05 1
03:04:07 2

My problem is that not every second has a request, so there is no log entry for those seconds. I need to insert each missing second with a zero for the request count.

I'm really not sure how to go about doing that. I guess I could create a loop that generates a timestamp for each second of the 24-hour period (86400 seconds) and then grep the file for that second. If the second does not exist, append the line to the file, and eventually sort the file. That seems like it would take up a pretty big chunk of time.

Creating a loop to generate each timestamp is beyond my scripting skills, and anything I've tried has come up way short.

Does anyone have a solution using Bash, sh, Perl...?

Thanks for any suggestions.

Here's a Perl solution that generates such data for today, with a slight variation on your approach. Grepping the file once for every second of the day would mean 86400 passes over the file, which would not be efficient.

Instead, this program goes through the file first and populates a hash that has the timestamp as the key and number of requests as its value.

It then loops over every second of today and prints the number of requests if that timestamp exists in the hash, or 0 if not.

$ 
$ # display the contents of the log file
$ # I've added boundary values to ensure the loop works as expected
$ cat apache.log
00:00:00 997
00:00:01 998
00:00:02 999
00:00:03 1000
03:04:01 3
03:04:03 2
03:04:04 1
03:04:05 1
03:04:07 2
23:59:56 997
23:59:57 998
23:59:58 999
23:59:59 9999
$ 
$ # show the Perl program
$ cat apache.pl
#!/usr/bin/perl
use strict;
use warnings;
# Date::Calc is a CPAN module, not part of core Perl
use Date::Calc qw(Today Add_Delta_DHMS);

# first, go through the log file and fill up the hash %requests
my $logfile = "apache.log";
my %requests;
open(my $log, '<', $logfile) or die "Can't open $logfile: $!";
while (<$log>) {
  my ($hms, $count) = split;     # splitting on whitespace also drops the newline
  next unless defined $count;    # skip blank or malformed lines
  $requests{$hms} = $count;
}
close($log) or die "Can't close $logfile: $!";

# now generate the loop for today, and print the
# key value if present in the hash, or 0 otherwise
my ($year, $month, $day) = Today();
my ($hour, $min, $sec) = (0, 0, 0);
foreach (1 .. 24*60*60) {
  my $hms = sprintf("%02d:%02d:%02d", $hour, $min, $sec);
  my $reqnum = defined $requests{$hms} ? $requests{$hms} : 0;
  printf("%02d/%02d/%04d %s %s\n", $month, $day, $year, $hms, $reqnum);
  # advance the clock by one second
  ($year, $month, $day, $hour, $min, $sec) =
    Add_Delta_DHMS($year, $month, $day, $hour, $min, $sec, 0, 0, 0, 1);
}
$
$ # now run the program
$ # I've snipped most of the output to save storage space, and
$ # to keep forum members from thinking I've lost my mind. ;)
$ # You may want to redirect the output to a file.
$
$ perl apache.pl
02/14/2010 00:00:00 997
02/14/2010 00:00:01 998
02/14/2010 00:00:02 999
02/14/2010 00:00:03 1000
02/14/2010 00:00:04 0
02/14/2010 00:00:05 0
02/14/2010 00:00:06 0
...
...
02/14/2010 03:04:00 0
02/14/2010 03:04:01 3
02/14/2010 03:04:02 0
02/14/2010 03:04:03 2
02/14/2010 03:04:04 1
02/14/2010 03:04:05 1
02/14/2010 03:04:06 0
02/14/2010 03:04:07 2
02/14/2010 03:04:08 0
...
...
02/14/2010 23:59:54 0
02/14/2010 23:59:55 0
02/14/2010 23:59:56 997
02/14/2010 23:59:57 998
02/14/2010 23:59:58 999
02/14/2010 23:59:59 9999
$ 
$
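
If installing Date::Calc is not an option, the same idea (load the counts into a lookup table, then walk all 86400 seconds) can probably be done with plain awk. This is only a rough sketch, not a drop-in replacement. The function name fill_seconds is made up for illustration, and it assumes the input is "HH:MM:SS count" lines from a single day. Unlike the Perl program, it prints only the time, not the date.

```shell
#!/bin/sh
# fill_seconds (hypothetical helper): reads "HH:MM:SS count" lines on stdin
# and prints one line per second of the day, using 0 where nothing was logged.
fill_seconds() {
  awk '
    NF >= 2 { count[$1] = $2 }              # remember the count for each logged second
    END {
      for (s = 0; s < 24 * 60 * 60; s++) {  # 86400 seconds in a day
        hms = sprintf("%02d:%02d:%02d", int(s / 3600), int((s % 3600) / 60), s % 60)
        print hms, ((hms in count) ? count[hms] : 0)
      }
    }
  '
}

# small demo: two logged seconds with a gap in between
printf '03:04:01 3\n03:04:03 2\n' | fill_seconds | grep '^03:04:0[1-3] '
```

With the real file you would run something like "fill_seconds < apache.log > filled.log" (filled.log is just an example name). The file is read only once, so there is no per-second grep and no sorting afterwards.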

HTH,
tyler_durden