The actual start/end times and actual start/end dates come from the "ProcessTime" column. I only want the data above and don't want any of the other text, including the "----" lines, anywhere in the file I output to. As mentioned above, I have a few hundred of these definitions in a single file.
This was something I was originally doing in Python, and I am now going to try to do it using awk.
I know that to read in the file it would be something like:
awk '<program>' /dir/filepath/input.txt
And for the output file, I need:
System Number Job Name Target Machine Status Actual Start Date Actual Start Time Actual End Date Actual End Time
9043 B9043CC_APP_DMLD_025_FR_xpabbdu1_D machine.enviorment.net SUCCESS 03/12/2014 17:30:53 03/12/2014 17:31:47
9043 B9043CC_APP_DMLD_025_FR_xpabbdu1_D machine.enviorment.net FAILURE 03/12/2014 18:16:07 03/12/2014 18:17:03
9043 B9043CC_APP_DMLD_025_FR_xpabbdu1_D machine.enviorment.net SUCCESS 03/12/2014 18:21:19 03/12/2014 18:22:08
> /dir/filepath/output.txt
However, I'm looking for help with regards to the parsing aspect.
Job Name Last Start Last End ST Run Pri/Xit
________________________________________________________________ ____________________ ____________________ __ _______ ___
B9043CC_APP_DMLD_025_FR_xpabbdu1_D 03/12/2014 18:21:32 03/12/2014 18:22:07 SU 49744331/3
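A minimal sketch of pulling the fields out of a job summary line like the one above, written in Python since that was the original approach. The function name `parse_job_line` is hypothetical, and the rule that the system number is the four digits after the leading letter of the job name is an assumption inferred from the sample (B9043CC... gives 9043), not something the report format is known to guarantee.

```python
import re

# Job summary line: name, last start (date time), last end (date time),
# two-letter status, then the run/ntry field.
JOB_LINE = re.compile(
    r"^\s*(?P<name>\S+)\s+"
    r"(?P<start_date>\d{2}/\d{2}/\d{4})\s+(?P<start_time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<end_date>\d{2}/\d{2}/\d{4})\s+(?P<end_time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<status>[A-Z]{2})\s+(?P<run>\S+)"
)


def parse_job_line(line):
    """Return a dict of fields for a job summary line, or None if it is not one."""
    m = JOB_LINE.match(line)
    if m is None:
        return None
    job = m.groupdict()
    # Assumption: the system number is the four digits after the first letter.
    num = re.match(r"[A-Za-z](\d{4})", job["name"])
    job["system_number"] = num.group(1) if num else ""
    return job


line = ("B9043CC_APP_DMLD_025_FR_xpabbdu1_D 03/12/2014 18:21:32 "
        "03/12/2014 18:22:07 SU 49744331/3")
job = parse_job_line(line)
print(job["system_number"], job["name"], job["status"])
# prints: 9043 B9043CC_APP_DMLD_025_FR_xpabbdu1_D SU
```

Header and event lines do not have two date/time pairs right after the first field, so `parse_job_line` returns None for them and they can be skipped.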
Can you tell us a bit more about the input file format?
I understand this is an extract. Does it correspond to a specific job log, or might we find the same job again later?
Anything that clarifies how the parsing should work would help.
e.g.
Will we always have to look at the 3rd line after we find "^Job Name " to find the string containing the System Number?
Read six lines (header)
Get system number and batch name
Until end of file:
    Read five lines
    Get machine name, status, start and end dates and times
    If status is FAILURE:
        Read two lines (clear error message)
No, duplicate job names will be present; however, jobs will contain the same system numbers.
Also, some jobs may not have run on that specific day, so there will be no data in them. In this case the fields in the output file would just be empty or null.
For example:
Job Name Last Start Last End ST Run Pri/Xit
____________________________ ____________________ ____________________ __ _______ ___
B9043CC_uwsprem_l_thd013sv_D 08/04/2010 22:03:55 03/05/2012 07:51:33 OI 22333537/0
Status/[Event] Time Ntry ES ProcessTime Machine
-------------- --------------------- -- -- --------------------- -------
B9043CC_uwsprem_l_thd024sv_D 03/06/2012 22:00:34 03/06/2012 22:00:42 OI 22333536/1
Status/[Event] Time Ntry ES ProcessTime Machine
-------------- --------------------- -- -- --------------------- -------
B9043BC_bond_ba_mf_loss_thd013sv_D 03/06/2012 08:54:11 03/06/2012 11:44:06 OI 22303721/1
Status/[Event] Time Ntry ES ProcessTime Machine
-------------- --------------------- -- -- --------------------- ----------------------------------------
[STARTJOB] 03/19/2014 17:45:00 0 PD 03/19/2014 17:45:00
<Event was Scheduled based on Job Definition.>
B9043CC_bcmsloss_l_thd013sv_D 03/21/2014 08:46:48 03/21/2014 10:38:31 SU 22303721/110
Status/[Event] Time Ntry ES ProcessTime Machine
-------------- --------------------- -- -- --------------------- ----------------------------------------
SUCCESS 03/19/2014 14:04:49 108 PD 03/19/2014 14:04:49
[FORCE_STARTJOB] 03/20/2014 13:39:15 0 PD 03/20/2014 13:39:15
< >
STARTING 03/20/2014 13:39:15 109 PD 03/20/2014 13:39:16 machine.enviorment.net
RUNNING 03/20/2014 13:39:17 109 PD 03/20/2014 13:39:17 machine.enviorment.net
SUCCESS 03/20/2014 14:24:56 109 PD 03/20/2014 14:24:56
[FORCE_STARTJOB] 03/21/2014 08:46:47 0 PD 03/21/2014 08:46:47
< >
STARTING 03/21/2014 08:46:47 110 PD 03/21/2014 08:46:48 tmachine.enviorment.net
RUNNING 03/21/2014 08:46:48 110 PD 03/21/2014 08:46:49 machine.enviorment.net
SUCCESS 03/21/2014 10:38:31 110 PD 03/21/2014 10:38:31
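Since the sample above mixes several kinds of lines, one way to drop the column headers, the dashed rules and the bracketed events is to classify each line by pattern before extracting anything. A rough Python sketch; the function name `classify` is hypothetical, and the list of status words is an assumption based only on the statuses visible in the sample.

```python
import re

DATE_TIME = r"\d{2}/\d{2}/\d{4}\s+\d{2}:\d{2}:\d{2}"
# A job line has two date/time pairs followed by a two-letter status.
JOB_RE = re.compile(r"^\S+\s+" + DATE_TIME + r"\s+" + DATE_TIME + r"\s+[A-Z]{2}\s+\S+")
# An event line has a known status word, one date/time pair, then the Ntry number.
EVENT_RE = re.compile(r"^(STARTING|RUNNING|SUCCESS|FAILURE)\s+" + DATE_TIME + r"\s+\d+\s")


def classify(line):
    """Label one report line as 'job', 'event' or 'skip'."""
    text = line.strip()
    if JOB_RE.match(text):
        return "job"
    if EVENT_RE.match(text):
        return "event"
    # Everything else is noise for this purpose: column headers, rules made of
    # '-' or '_', bracketed events such as [STARTJOB], and '<...>' messages.
    return "skip"


sample = """\
B9043CC_bcmsloss_l_thd013sv_D 03/21/2014 08:46:48 03/21/2014 10:38:31 SU 22303721/110
Status/[Event] Time Ntry ES ProcessTime Machine
-------------- --------------------- -- -- --------------------- -------
[FORCE_STARTJOB] 03/20/2014 13:39:15 0 PD 03/20/2014 13:39:15
< >
STARTING 03/20/2014 13:39:15 109 PD 03/20/2014 13:39:16 machine.enviorment.net
SUCCESS 03/20/2014 14:24:56 109 PD 03/20/2014 14:24:56
"""

kinds = [classify(l) for l in sample.splitlines()]
print(kinds)
# prints: ['job', 'skip', 'skip', 'skip', 'skip', 'event', 'event']
```

Matching on patterns rather than counting lines means a job with no events, or with an extra message line, does not throw the parser out of step.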
Thanks! I just ran the script against the data below. I have multiple of these jobs in one file, and every job has a different job name that I want to grab, even if the job did not run. It is my fault for not mentioning this in the original post. The script is only pulling the first job name it sees and repeating it for each entry, and I am trying to modify that.
B3709BC_GCFCT_MONTHLY_tpabbtu1_D 03/12/2014 09:13:23 03/13/2014 00:43:10 FA 54759595/1 1
Status/[Event] Time Ntry ES ProcessTime Machine
-------------- --------------------- -- -- --------------------- ----------------------------------------
RUNNING 03/12/2014 09:13:23 1 PD 03/12/2014 09:13:24
FAILURE 03/13/2014 00:43:10 1 PD 03/13/2014 00:43:11
[STARTJOB] 03/26/2014 18:45:00 0 UP
<Event was Scheduled based on Job Definition.>
B3709CC_GCFCT_MONTHLY_VALIDATION_tpabbtu1_D 03/12/2014 10:59:52 03/12/2014 11:01:11 SU 54759595/1
Status/[Event] Time Ntry ES ProcessTime Machine
-------------- --------------------- -- -- --------------------- ----------------------------------------
[FORCE_STARTJOB] 03/12/2014 10:59:46 0 PD 03/12/2014 10:59:46
< >
STARTING 03/12/2014 10:59:46 1 PD 03/12/2014 10:59:46 machine.enviorment.net
RUNNING 03/12/2014 10:59:52 1 PD 03/12/2014 10:59:52 machine.enviorment.net
SUCCESS 03/12/2014 11:01:11 1 PD 03/12/2014 11:01:11
B3709CC_GCFCT_Monthly_LKUP_Creation_tpabbtu1_D 03/12/2014 10:24:43 03/12/2014 10:27:57 SU 54759595/1
Status/[Event] Time Ntry ES ProcessTime Machine
-------------- --------------------- -- -- --------------------- ----------------------------------------
[FORCE_STARTJOB] 03/12/2014 10:24:37 0 PD 03/12/2014 10:24:37
< >
STARTING 03/12/2014 10:24:37 1 PD 03/12/2014 10:24:38 machine.enviorment.net
RUNNING 03/12/2014 10:24:43 1 PD 03/12/2014 10:24:44 machine.enviorment.net
SUCCESS 03/12/2014 10:27:57 1 PD 03/12/2014 10:27:58
B3709CC_GCFCT_IP_Target_Load_tpabbtu1_D 04/11/2013 15:42:10 04/11/2013 15:45:31 IN 39115173/0
Status/[Event] Time Ntry ES ProcessTime Machine
-------------- --------------------- -- -- --------------------- ----------------------------------------
B3709CC_GCFCT_ERROR_PROCESSING_tpabbtu1_D 04/11/2013 15:45:41 04/11/2013 16:45:42 IN 39115173/0
Status/[Event] Time Ntry ES ProcessTime Machine
-------------- --------------------- -- -- --------------------- ----------------------------------------
output:
System Number Job Name Target Machine Status Actual Start Date Actual Start Time Actual End Date Actual End Time
3709 B3709BC_GCFCT_MONTHLY_tpabbtu1_D FAILURE 03/13/2014 00:43:10
3709 B3709BC_GCFCT_MONTHLY_tpabbtu1_D machine.enviorment.net SUCCESS 03/12/2014 10:59:46 03/12/2014 11:01:11
3709 B3709BC_GCFCT_MONTHLY_tpabbtu1_D machine.enviorment.net SUCCESS 03/12/2014 10:24:37 03/12/2014 10:27:57
Targeted output:
System Number Job Name Target Machine Status Actual Start Date Actual Start Time Actual End Date Actual End Time
3709 B3709BC_GCFCT_MONTHLY_tpabbtu1_D FAILURE 03/13/2014 00:43:10
3709 B3709CC_GCFCT_MONTHLY_VALIDATION_tpabbtu1_D machine.enviorment.net SUCCESS 03/12/2014 10:59:46 03/12/2014 11:01:11
3709 B3709CC_GCFCT_Monthly_LKUP_Creation_tpabbtu1_D machine.enviorment.net SUCCESS 03/12/2014 10:24:37 03/12/2014 10:27:57
3709 B3709CC_GCFCT_IP_Target_Load_tpabbtu1_
3709 B3709CC_GCFCT_ERROR_PROCESSING_tpabbtu1_D
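One way to get the targeted output is to reset the current job name every time a new job line is seen, and to emit an empty row for any job that never finishes a run. The Python sketch below does that under several assumptions that are mine, not the poster's: the names `parse_report` and `system_number` are hypothetical, the system number is taken as the four digits after the first letter of the job name, the start comes from the STARTING event and the end from SUCCESS or FAILURE, and the timestamps are read from the Time column, because that is what matches the targeted output (10:24:37 rather than the ProcessTime of 10:24:38).

```python
import re

DATE_TIME = r"\d{2}/\d{2}/\d{4}\s+\d{2}:\d{2}:\d{2}"
JOB_RE = re.compile(r"^(\S+)\s+" + DATE_TIME + r"\s+" + DATE_TIME + r"\s+[A-Z]{2}\s+\S+")
EVENT_RE = re.compile(r"^(STARTING|RUNNING|SUCCESS|FAILURE)\s+" + DATE_TIME + r"\s+\d+\s")
HEADER = ("System Number", "Job Name", "Target Machine", "Status",
          "Actual Start Date", "Actual Start Time", "Actual End Date", "Actual End Time")


def system_number(job_name):
    # Assumption: four digits after the leading letter, e.g. B3709BC... -> 3709.
    m = re.match(r"[A-Za-z](\d{4})", job_name)
    return m.group(1) if m else ""


def parse_report(lines):
    """Return one 8-field row per finished run, plus an empty row per idle job."""
    jobs = []                          # list of (job_name, list_of_runs)
    machine, start = "", ("", "")
    for raw in lines:
        line = raw.strip()
        m = JOB_RE.match(line)
        if m:                          # a new job starts: reset all per-job state
            jobs.append((m.group(1), []))
            machine, start = "", ("", "")
            continue
        if not jobs or not EVENT_RE.match(line):
            continue                   # headers, rules, [EVENT] lines, <messages>
        tok = line.split()
        status, when = tok[0], (tok[1], tok[2])   # Time column; ProcessTime is tok[5], tok[6]
        if len(tok) > 7:
            machine = tok[7]           # the Machine column is only sometimes present
        if status == "STARTING":
            start = when
        elif status in ("SUCCESS", "FAILURE"):
            jobs[-1][1].append((machine, status) + start + when)
            machine, start = "", ("", "")
    rows = []
    for name, runs in jobs:
        # A job with no finished run still gets one row with empty fields.
        for run in runs or [("",) * 6]:
            rows.append((system_number(name), name) + run)
    return rows


SAMPLE = """\
B3709BC_GCFCT_MONTHLY_tpabbtu1_D 03/12/2014 09:13:23 03/13/2014 00:43:10 FA 54759595/1 1
Status/[Event] Time Ntry ES ProcessTime Machine
-------------- --------------------- -- -- --------------------- ----------
RUNNING 03/12/2014 09:13:23 1 PD 03/12/2014 09:13:24
FAILURE 03/13/2014 00:43:10 1 PD 03/13/2014 00:43:11
[STARTJOB] 03/26/2014 18:45:00 0 UP
<Event was Scheduled based on Job Definition.>
B3709CC_GCFCT_MONTHLY_VALIDATION_tpabbtu1_D 03/12/2014 10:59:52 03/12/2014 11:01:11 SU 54759595/1
Status/[Event] Time Ntry ES ProcessTime Machine
-------------- --------------------- -- -- --------------------- ----------
[FORCE_STARTJOB] 03/12/2014 10:59:46 0 PD 03/12/2014 10:59:46
< >
STARTING 03/12/2014 10:59:46 1 PD 03/12/2014 10:59:46 machine.enviorment.net
RUNNING 03/12/2014 10:59:52 1 PD 03/12/2014 10:59:52 machine.enviorment.net
SUCCESS 03/12/2014 11:01:11 1 PD 03/12/2014 11:01:11
B3709CC_GCFCT_IP_Target_Load_tpabbtu1_D 04/11/2013 15:42:10 04/11/2013 15:45:31 IN 39115173/0
Status/[Event] Time Ntry ES ProcessTime Machine
-------------- --------------------- -- -- --------------------- ----------
"""

rows = parse_report(SAMPLE.splitlines())
for row in [HEADER] + rows:
    print("\t".join(row))
```

The rows are printed tab-separated so that empty fields stay visible as empty columns. For a real file, `parse_report` can be given an open file object instead of `SAMPLE.splitlines()`, and the printed output redirected to the output path.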