Searching for gaps in huge (2.2G) log file?

I've got a 2.2 GB syslog file from our Cisco firewall appliance. The problem is that we've been seeing gaps in the syslog lasting anywhere from 10 minutes to 2 hours. So far I've just been using 'less' and paging through the file to see if I can spot any noticeable gaps. Obviously this isn't the brightest way to do it (unless I want to finish paging through the file on 12/31/2006!). Are there any utilities that will find gaps using the timestamps as the criterion? I've thought of setting up some kind of loop in bash that would increment the fields in a timestamp variable and then grep for each value. Anything that doesn't show up gets noted, and then I can look in the file at that time reference or just before it. But there HAS to be a better way. Any thoughts?

I have an idea, but I'm not much for scripting so I'll let you do that part. :)

You could use awk to pick out just the field with the timestamp in it. Then, for each line, subtract its timestamp from the timestamp on the next line. If the difference is greater than some number of minutes that you choose, output both lines to another file. That way you could use the (hopefully much smaller) secondary file to pinpoint exactly when the gaps occur.
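For what it's worth, here is a rough sketch of that idea in Python rather than awk. It assumes every line starts with a classic syslog stamp like "Dec 31 09:15:02"; I don't know what the firewall actually writes, so STAMP_LEN, STAMP_FMT and the default year are guesses to adjust. The function names are made up for this example.

```python
import sys
from datetime import datetime, timedelta

# Assumption: lines begin with a stamp such as "Dec 31 09:15:02".
# Change these two values to match the real log format.
STAMP_LEN = 15
STAMP_FMT = "%b %d %H:%M:%S"


def parse_stamp(line, year=2006):
    """Return the line's timestamp, or None if the line does not parse."""
    try:
        # Classic syslog stamps carry no year, so one is supplied here.
        return datetime.strptime(f"{year} {line[:STAMP_LEN]}", "%Y " + STAMP_FMT)
    except ValueError:
        return None


def find_gaps(lines, threshold=timedelta(minutes=10)):
    """Yield (line_before, line_after, gap) for every gap above threshold."""
    prev_line, prev_time = None, None
    for line in lines:
        stamp = parse_stamp(line)
        if stamp is None:
            continue  # skip lines without a readable timestamp
        if prev_time is not None and stamp - prev_time > threshold:
            yield prev_line.rstrip("\n"), line.rstrip("\n"), stamp - prev_time
        prev_line, prev_time = line, stamp


if __name__ == "__main__" and len(sys.argv) > 1:
    # Iterating over the file object reads one line at a time,
    # so the whole 2.2 GB file is never held in memory.
    with open(sys.argv[1], errors="replace") as log:
        for before, after, gap in find_gaps(log):
            print(f"gap of {gap}:\n  {before}\n  {after}")
```

Run it with the log file as the only argument and redirect the output to a file if there are a lot of gaps. It makes no attempt to handle a log that rolls over into a new year.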

Here's what I wound up doing:

Run a grep for the HH:MM:AM/PM string for each specific hour of the day and redirect the output to a file for just that hour. That gives me a better shot at scrolling through the logs to find any missing timestamps. Interesting to see that our network usage starts going up at 9:00AM when our doors open, peaks in the afternoon when the public computers are all in use (200+ MB of logs per hour between 12:00PM and 5:00PM) and then slowly drops off until we close at 9:00PM.

Why not loop, grepping for each minute? Pipe the output into "wc -l". If the count is too low, print it out. That should be very easy. Don't be too surprised if it thins out during your busiest period though. Syslog is normally sent over UDP, so no retransmission attempt is made when a message is dropped.
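Grepping 1,440 times means reading a 2.2 GB file 1,440 times, so here is a sketch that gets the same per-minute counts in a single pass. It is Python rather than grep and wc, and it again assumes a "Dec 31 09:15:02" style stamp at the start of each line, which may not match the real log. The function names and the minimum value are placeholders.

```python
import sys
from collections import Counter
from datetime import datetime, timedelta

# Assumption: lines begin with a stamp such as "Dec 31 09:15:02".
STAMP_LEN = 15
STAMP_FMT = "%b %d %H:%M:%S"


def count_per_minute(lines, year=2006):
    """Count log lines per minute, skipping lines that do not parse."""
    counts = Counter()
    for line in lines:
        try:
            stamp = datetime.strptime(
                f"{year} {line[:STAMP_LEN]}", "%Y " + STAMP_FMT
            )
        except ValueError:
            continue
        counts[stamp.replace(second=0)] += 1  # truncate to the minute
    return counts


def quiet_minutes(counts, minimum=1):
    """List each minute from first to last stamp with fewer than minimum lines."""
    if not counts:
        return []
    minute, last = min(counts), max(counts)
    quiet = []
    while minute <= last:
        # Minutes that never appear in the log count as zero.
        if counts.get(minute, 0) < minimum:
            quiet.append(minute)
        minute += timedelta(minutes=1)
    return quiet


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as log:
        per_minute = count_per_minute(log)
    for minute in quiet_minutes(per_minute, minimum=1):
        print(minute.strftime("%b %d %H:%M"), per_minute.get(minute, 0))
```

With minimum=1 it only reports minutes with no lines at all. Raising it will also flag minutes that are merely thin, which is where dropped UDP messages during the busy hours would show up.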