How to compare the time in different format from a file?

manas_ranjan · December 21, 2012, 6:51am

15:09:50.350038  Reading A Data
15:09:50.371645  Reading B Data
15:10:55.655724  Initializing models
15:11:31.320920  Preparing Simulation
15:11:32.763217  Running Calculation
15:15:29.668882  Aggregating Results
15:15:29.950897  Persisting Results

How could I make a matrix in H.M.S.MS format as below, where each entry is the time difference between the start time of the next activity and the activity's own start time?

Reading A Data 	(15:09:50.371645 - 15:09:50.350038)
Reading B Data	(15:10:55.655724 - 15:09:50.371645)
Initializing models 	(15:11:31.320920 - 15:10:55.655724)
Preparing Simulation	(15:11:32.763217 - 15:11:31.320920)
Running Calculation	(15:15:29.668882 - 15:11:32.763217)
Aggregating Results	(15:15:29.950897 - 15:15:29.668882)
Persisting Results	(15:15:29.950897)

So here Reading A Data took A.X ms, which is the start time of the 2nd activity (15:09:50.371645) minus its own start time (15:09:50.350038).
Similarly, Reading B Data took B.Y ms, which is the start time of the next activity (15:10:55.655724, for Initializing models) minus its own start time (15:09:50.371645).
The process goes on until the last line, since the last activity has no later time to compare against.

pamu · December 21, 2012, 7:28am

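If you want the difference as a plain number, try something like this. Note that %s%N in GNU date gives seconds followed by nanoseconds, so the result is in nanoseconds, not milliseconds: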

awk -F "  +" 'a{print $1,x,y;x=$1;y=$2} !a{x=$1;y=$2;a++}END{print $1,x,y}' OFS="\t" file | while read a b c
do
echo -e "$c\t$(expr $(date -d "$a" +%s%N) - $(date -d "$b" +%s%N))"
done

Reading A Data  21607000
Reading B Data  65284079000
Initializing models     35665196000
Preparing Simulation    1442297000
Running Calculation     236905665000
Aggregating Results     282015000
Persisting Results      0

jim_mcnamara · December 21, 2012, 7:43am

Turn the first column into a floating point number of seconds since midnight. This assumes the run does NOT cross midnight. The awk command below writes tmpfile, which will contain:

54590.350038 15:09:50.350038  Reading A Data
54590.371645 15:09:50.371645  Reading B Data
54655.655724 15:10:55.655724  Initializing models
54691.320920 15:11:31.320920  Preparing Simulation
54692.763217 15:11:32.763217  Running Calculation
54929.668882 15:15:29.668882  Aggregating Results
54929.950897 15:15:29.950897  Persisting Results

awk -F '[: ]'  '{dbl=($1*3600) + ($2*60) + $3
                 printf("%.6f ",dbl)
                 print $0}' infile>tmpfile

Does that give you enough to go on?
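
One way to carry on from here, as a rough sketch rather than Jim's actual follow-up: remember the previous line's seconds value and label, subtract when the next line arrives, and round the difference to whole microseconds before formatting, since the seconds values are floating point. The variable names (prev, label, us) and the temporary input file are made up for illustration. Like the step above, it assumes the times do not cross midnight.

```shell
# Sample input in the same layout as the thread's data (time, two spaces, label).
infile=$(mktemp)
cat > "$infile" <<'EOF'
15:09:50.350038  Reading A Data
15:09:50.371645  Reading B Data
15:10:55.655724  Initializing models
EOF

durations=$(awk '{
    split($1, t, ":")                      # t[1]=hours, t[2]=minutes, t[3]=seconds.micro
    now = t[1] * 3600 + t[2] * 60 + t[3]   # seconds since midnight (floating point)
    if (NR > 1) {
        us = int((now - prev) * 1000000 + 0.5)   # round to whole microseconds
        s = int(us / 1000000); us = us % 1000000
        printf "%s\t%d:%02d:%02d.%06d\n", label, int(s / 3600), int(s % 3600 / 60), s % 60, us
    }
    prev = now
    label = $0; sub(/^[^ ]+ +/, "", label)   # everything after the time stamp
}' "$infile")
rm -f "$infile"
printf '%s\n' "$durations"
```

The last activity is not printed because there is no later time stamp to subtract from.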

manas_ranjan · December 21, 2012, 7:47am

Hi Pamu,

You are a real lifesaver, thanks. But it's throwing an error for the last line, in this case for Persisting Results:

date: invalid date `Persisting'
expr: syntax error
Results
        0

So is there a way to get the output in h:m:s.ms format?
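
On the formatting question, a nanosecond count like the ones in the output above can be turned into h:m:s.ms with integer division. This is only a sketch: the function name ns_to_hms is invented here, and it truncates (rather than rounds) to whole milliseconds.

```shell
# Hypothetical helper: turn a nanosecond count into H:MM:SS.mmm (milliseconds truncated).
ns_to_hms() {
    awk -v ns="$1" 'BEGIN {
        ms = int(ns / 1000000)            # whole milliseconds
        s  = int(ms / 1000); ms = ms % 1000
        printf "%d:%02d:%02d.%03d\n", int(s / 3600), int(s % 3600 / 60), s % 60, ms
    }'
}

ns_to_hms 21607000       # Reading A Data
ns_to_hms 65284079000    # Reading B Data
ns_to_hms 236905665000   # Running Calculation
```

It could be called inside the while loop in place of the bare expr result, for example $(ns_to_hms "$(expr ...)").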

Don_Cragun · December 21, 2012, 9:31pm

I tried using floating point (as Jim suggested), but with some additional testing I found the results were occasionally off by a microsecond. The following seems to work even when time stamps roll over to the next day. (It still assumes that there is always less than 24 hours between adjacent time stamps.) If you're running on a Solaris system, use nawk or /usr/xpg4/bin/awk instead of awk:

awk -F '  ' '{  
        if(split($1, nf, /[:.]/) != 4) {
                printf "Time stamp split for \"%s\" on line %d failed\n", $1, NR
                exit 1 
        }
        if(NR > 1) {
                usec = nf[4] - lf[4]
                sec = nf[3] - lf[3]
                min = nf[2] - lf[2]
                hr = nf[1] - lf[1]
                if(usec < 0) {usec += 1000000; sec--}
                if(sec < 0) {sec += 60; min--}
                if(min < 0) {min += 60; hr--}
                if(hr < 0) hr += 24
                printf "%2d:%02d:%02d.%06d\n", hr, min, sec, usec
        }
        for(i = 1; i <= 4; i++) lf[i] = nf[i]
        printf "%20s: ", $2
}
END {   printf "unknown (no end time stamp)\n"
}' in

with the file in containing:

15:09:50.350038  Reading A Data
15:09:50.371645  Reading B Data
15:10:55.655724  Initializing models
15:11:31.320920  Preparing Simulation
15:11:32.763217  Running Calculation
15:15:29.668882  Aggregating Results
15:15:29.950897  Persisting Results
23:59:59.000000  1 minute to midnight
00:00:00.000000  midnight
23:59:59.999999  eod
00:00:00.000001  early
00:00:00.000000  next midnight

the output produced is:

      Reading A Data:  0:00:00.021607
      Reading B Data:  0:01:05.284079
 Initializing models:  0:00:35.665196
Preparing Simulation:  0:00:01.442297
 Running Calculation:  0:03:56.905665
 Aggregating Results:  0:00:00.282015
  Persisting Results:  8:44:29.049103
1 minute to midnight:  0:00:01.000000
            midnight: 23:59:59.999999
                 eod:  0:00:00.000002
               early: 23:59:59.999999
       next midnight: unknown (no end time stamp)
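
The field-by-field borrowing above can also be written as a single subtraction on whole microseconds since midnight, adding one day when the result is negative. The sketch below is an alternative illustration, not the code from this post; the names elapsed and to_us are hypothetical, and it assumes six-digit fractions and less than 24 hours between the two stamps. The values stay well within the range awk represents exactly, so no rounding is involved.

```shell
# Hypothetical helper: elapsed time from stamp $1 to stamp $2 (HH:MM:SS.uuuuuu),
# assuming less than 24 hours between them.
elapsed() {
    awk -v a="$1" -v b="$2" '
    function to_us(ts,   f) {             # f is a local array
        split(ts, f, /[:.]/)
        return ((f[1] * 60 + f[2]) * 60 + f[3]) * 1000000 + f[4]
    }
    BEGIN {
        d = to_us(b) - to_us(a)
        if (d < 0) d += 86400 * 1000000   # next stamp is on the following day
        us = d % 1000000; s = int(d / 1000000)
        printf "%2d:%02d:%02d.%06d\n", int(s / 3600), int(s % 3600 / 60), s % 60, us
    }'
}

elapsed 15:09:50.350038 15:09:50.371645
elapsed 23:59:59.999999 00:00:00.000001
```

The two calls correspond to the "Reading A Data" and "eod" rows in the output above.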

manas_ranjan · December 26, 2012, 7:17am

Hey thanks, this is exactly what I was looking for. But I have an issue: if I put this same piece of awk in a script, it throws an input file read error, while running it from the command line works with no issue at all. I'm using the same input file in both cases, so I'm not sure why awk throws the error in the script but not on the command line.

Don_Cragun · December 26, 2012, 1:08pm

I'm not sure what you mean by

or by

.

If you save the code I provided in a file named compare_time, change /bin/ksh in #!/bin/ksh to be the absolute pathname of ksh on your system, run the command:

chmod +x compare_time

and then run the command:

./compare_time

it should work exactly like it works if you paste the script into an interactive Korn shell.

If you mean that you want to put the script into a file that can be used with awk's -f option, then create a file named compare_time.awk containing the following:

BEGIN { FS = "  "
}
{
        if(split($1, nf, /[:.]/) != 4) {
                printf "Time stamp split for \"%s\" on line %d failed\n", $1, NR
                exit 1
        }
        if(NR > 1) {
                usec = nf[4] - lf[4]
                sec = nf[3] - lf[3]
                min = nf[2] - lf[2]
                hr = nf[1] - lf[1]
                if(usec < 0) {usec += 1000000; sec--}
                if(sec < 0) {sec += 60; min--}
                if(min < 0) {min += 60; hr--}
                if(hr < 0) hr += 24
                printf "%2d:%02d:%02d.%06d\n", hr, min, sec, usec
        }
        for(i = 1; i <= 4; i++) lf[i] = nf[i]
        printf "%20s: ", $2
}
END {   printf "unknown (no end time stamp)\n"
}

and then use it as follows:

awk -f compare_time.awk input_filename

If these suggestions don't take care of the issue, please provide the exact message or messages being written by awk that tell you that awk is throwing an input file read error, and provide the exact command line that you're using to invoke awk. (As mentioned in an earlier message, if you're running this on a Solaris system, use /usr/xpg4/bin/awk or nawk instead of awk.)
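
Since the cause of the read error was not stated, one thing that may help narrow it down is a wrapper that checks the input file before awk ever sees it. The following is a sketch under assumptions: the wrapper name compare_time.sh is made up, and it expects compare_time.awk to sit in the same directory as the wrapper.

```shell
# Create a hypothetical wrapper script; the file name compare_time.sh is an assumption.
cat > compare_time.sh <<'EOF'
#!/bin/sh
# Usage: ./compare_time.sh input_filename
# Expects compare_time.awk to sit in the same directory as this script.
if [ $# -ne 1 ]; then
    echo "usage: $0 input_filename" >&2
    exit 2
fi
if [ ! -r "$1" ]; then
    echo "$0: cannot read input file: $1 (current directory: $(pwd))" >&2
    exit 1
fi
exec awk -f "$(dirname "$0")/compare_time.awk" "$1"
EOF
chmod +x compare_time.sh
```

Printing the current directory in the error message is a deliberate choice: a relative input file name that works interactively can fail in a script that is started from a different directory.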

manas_ranjan · December 27, 2012, 8:27am

Yep, I made a small mistake; that was the reason for the invalid file read error.
Thanks again.