Hey all. I am working on some bash scripts that perform a variety of functions. There are several steps involved, and they must happen in a specific sequence. What I need help with is a way to calculate differences between timestamps in a logfile.
One of the steps in the scripts I am writing involves issuing a command to an application that executes a 'deployment' process of sorts; the shell interface to this application basically receives the request to start this deployment process, and exits. The deployment process can take a wildly variable amount of time (a few minutes, up to a few hours); there are additional actions that my script needs to perform once that process is complete, but these actions cannot begin until it has.
The application that performs this deployment process writes to a logfile, and I know the entry in the logfile that indicates the deployment has finished; however, this logfile is written to constantly, and I cannot clear it. What I need to do, within the script that initiates the deployment, is note the time the command is executed, then search the application's logfile for the completion condition and check whether the most recent matching entry has a timestamp later than the time my script noted. I can do most of this already; where I'm getting stuck is in parsing the timestamps into a useful, computable format. I can control how my script sets its initial timestamp, but I cannot control the format in which the logfile writes its timestamps, which look like this (this entry is the completion condition that I am looking for):
Jul 10, 2012 7:47:45 PM] Application deployment complete.
The following date command will produce a timestamp formatted in exactly this fashion, but I don't know if that's actually useful or not:
date +%b\ %-d\,\ %Y\ %-l\:%M\:%S\ %p
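One way to get a computable value, assuming GNU date is available (the `%-d` and `%-l` flags above suggest it is), might be to let `date -d` parse the logfile's timestamp and print seconds since the epoch, which can then be compared as plain integers. This is only a sketch: `log_ts_to_epoch` is a made-up helper name, and it assumes the logfile's timestamps are in the same time zone as the script.

```shell
#!/bin/bash
# Sketch only. Assumes GNU date (coreutils); BSD/macOS date does not
# accept free-form timestamps through -d.

# Hypothetical helper: turn one logfile line into seconds since the epoch.
log_ts_to_epoch() {
    local line=$1
    line=${line#\[}       # drop a leading "[" if the log has one
    line=${line%%\]*}     # drop the first "]" and everything after it
    date -d "$line" +%s
}

# In the real script this would be taken just before issuing the command.
start_epoch=$(date +%s)
done_epoch=$(log_ts_to_epoch "Jul 10, 2012 7:47:45 PM] Application deployment complete.")

if [ "$done_epoch" -ge "$start_epoch" ]; then
    echo "completion entry is newer than the start time"
else
    echo "completion entry is older than the start time"
fi
```

With both values in epoch seconds, the script's own start time no longer needs to be formatted like the logfile at all; `date +%s` is enough.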
I can certainly run the timestamp through a set of sed steps to parse out the individual pieces of information in the logfile, but I'm afraid that what I have thus far, in addition to being obviously cumbersome and probably quite amateurish, is potentially unproductive and not really the right way to go about this:
#!/bin/bash
# Sets startDate variable with current timestamp in logfile's format, removes unnecessary characters, and converts to underscore delimited format
export startDate=`date +%b\ %-d\,\ %Y\ %-l\:%M\:%S\ %p | sed 's/ /_/g' | sed 's/,//g' | sed 's/:/_/g'`
# Sets each field in timestamp to individual variables using cut
export startMonth=`echo $startDate | cut -d\_ -f1`
export startDay=`echo $startDate | cut -d\_ -f2`
export startYear=`echo $startDate | cut -d\_ -f3`
export startAMPM=`echo $startDate | cut -d\_ -f7`
export startHour=`echo $startDate | cut -d\_ -f4`
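# Converts 12-hour time to 24-hour time (12 AM becomes 0; 12 PM stays 12)
if [ "$startAMPM" = "PM" ] && [ "$startHour" -ne 12 ]; then
    export startHour=$((startHour + 12))
elif [ "$startAMPM" = "AM" ] && [ "$startHour" -eq 12 ]; then
    export startHour=0
fi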
export startMin=`echo $startDate | cut -d\_ -f5`
export startSec=`echo $startDate | cut -d\_ -f6`
If I echo each of these variables individually at the end of the script, what I get when running it is this:
Jul 11 2012 19 53 22
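For comparison, the same splitting could probably be done in one pass with bash's `read` builtin instead of the sed and cut pipeline, and the fields glued into a single sortable number (YYYYMMDDHHMMSS). A minimal sketch, where `ts_to_key` is a hypothetical helper name and the month names are assumed to be the English three-letter abbreviations the logfile uses:

```shell
#!/bin/bash
# Sketch only: pure bash, no external commands.

# Hypothetical helper: build a sortable YYYYMMDDHHMMSS key from a timestamp
# in the logfile's format, e.g. "Jul 10, 2012 7:47:45 PM".
ts_to_key() {
    local mon day year hour min sec ampm months prefix month
    # Treat spaces, commas and colons all as field separators.
    IFS=' ,:' read -r mon day year hour min sec ampm <<< "$1"

    # Month name to number: count the characters in front of the name.
    months="JanFebMarAprMayJunJulAugSepOctNovDec"
    prefix=${months%%"$mon"*}
    month=$(( ${#prefix} / 3 + 1 ))

    # 12-hour to 24-hour: 12 AM -> 0, 12 PM -> 12, 7 PM -> 19.
    hour=$(( 10#$hour % 12 ))
    if [ "$ampm" = "PM" ]; then
        hour=$(( hour + 12 ))
    fi

    # 10# forces base 10 so values like "08" are not read as octal.
    printf '%04d%02d%02d%02d%02d%02d\n' \
        "$year" "$month" "$((10#$day))" "$hour" "$((10#$min))" "$((10#$sec))"
}

start_key=$(ts_to_key "Jul 10, 2012 7:30:00 PM")
done_key=$(ts_to_key "Jul 10, 2012 7:47:45 PM")

if [ "$done_key" -gt "$start_key" ]; then
    echo "completion entry $done_key is later than start $start_key"
fi
```

Because the key runs from the largest unit to the smallest, a plain integer comparison orders two timestamps correctly, though it does not give the elapsed time between them.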
I can apply all this same logic to the entry in the application's logfile as well, replacing the date command with, for instance:
grep "Application deployment complete" /path/to/logfile
But I guess my big question here is: then what?
How can I actually use that information to look for the right entry?
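To make the question concrete, here is roughly the shape I imagine the check taking: grab the last matching entry (the logfile is only ever appended to, so the last match should be the newest), convert its timestamp, and compare it with the noted start time. This is a sketch under assumptions rather than something known to be right: it relies on GNU `date -d`, `deployment_finished` is a made-up name, the 30-second poll interval is arbitrary, and the logfile's timestamps are assumed to be in the script's time zone.

```shell
#!/bin/bash
# Sketch only. Assumes GNU date for "date -d".

# Hypothetical helper: succeed if the newest completion entry in the logfile
# is stamped at or after the given start time (seconds since the epoch).
deployment_finished() {
    local logfile=$1 start_epoch=$2 line stamp done_epoch
    # The log is append-only, so the last match is the most recent one.
    line=$(grep "Application deployment complete" "$logfile" | tail -n 1)
    [ -n "$line" ] || return 1
    stamp=${line#\[}        # drop a leading "[" if the log has one
    stamp=${stamp%%\]*}     # keep only the text before the first "]"
    done_epoch=$(date -d "$stamp" +%s) || return 1
    [ "$done_epoch" -ge "$start_epoch" ]
}

# Usage sketch (commented out so the block has no side effects):
# start_epoch=$(date +%s)
# ... issue the deployment command here ...
# until deployment_finished /path/to/logfile "$start_epoch"; do
#     sleep 30
# done
# ... continue with the post-deployment steps ...
```

Since the logfile's timestamps only have one-second resolution, the comparison uses "at or after" rather than "strictly after" the start time.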
Also, is there a better way to parse out that information?
Thanks very much everyone, I appreciate the help.