Shell Script for formatted output

Sandeep_Behera · July 27, 2017, 6:54am

06/26/2017 23:40:40       CAUAJM_I_10082 [aspsun14 connected for IOALPPRXXBD_ALPGLGENFAALL 55443.15215291.1]
 06/26/2017 23:40:40       CAUAJM_I_40245 EVENT: CHANGE_STATUS    STATUS: STARTING        JOB: IOALPPRXXBD_ALPGLGENFAALL MACHINE: aspsun14
 06/26/2017 23:40:42       CAUAJM_I_40245 EVENT: CHANGE_STATUS    STATUS: RUNNING         JOB: IOALPPRXXBD_ALPGLGENFAALL MACHINE: aspsun14
 06/26/2017 23:49:19       CAUAJM_I_40245 EVENT: CHANGE_STATUS    STATUS: SUCCESS         JOB: IOALPPRXXBD_ALPGLGENFAALL MACHINE: aspsun14        EXITCODE:  0
 06/27/2017 23:40:23       CAUAJM_I_40245 EVENT: CHANGE_STATUS    STATUS: STARTING        JOB: IOALPPRXXBD_ALPGLGENFAALL MACHINE: aspsun14
 06/27/2017 23:40:24       CAUAJM_I_10082 [aspsun14 connected for IOALPPRXXBD_ALPGLGENFAALL 55443.15236942.1]
 06/27/2017 23:40:25       CAUAJM_I_40245 EVENT: CHANGE_STATUS    STATUS: RUNNING         JOB: IOALPPRXXBD_ALPGLGENFAALL MACHINE: aspsun14
 06/27/2017 23:48:19       CAUAJM_I_40245 EVENT: CHANGE_STATUS    STATUS: SUCCESS         JOB: IOALPPRXXBD_ALPGLGENFAALL MACHINE: aspsun14        EXITCODE:  0
 06/28/2017 23:41:36       CAUAJM_I_40245 EVENT: CHANGE_STATUS    STATUS: STARTING        JOB: IOALPPRXXBD_ALPGLGENFAALL MACHINE: aspsun14
 06/28/2017 23:41:37       CAUAJM_I_10082 [aspsun14 connected for IOALPPRXXBD_ALPGLGENFAALL 55443.15258301.1]
 06/28/2017 23:41:38       CAUAJM_I_40245 EVENT: CHANGE_STATUS    STATUS: RUNNING         JOB: IOALPPRXXBD_ALPGLGENFAALL MACHINE: aspsun14
 06/28/2017 23:48:47       CAUAJM_I_40245 EVENT: CHANGE_STATUS    STATUS: SUCCESS         JOB: IOALPPRXXBD_ALPGLGENFAALL MACHINE: aspsun14        EXITCODE:  0

I have a file with the content above. I want output like this: job name, then start time, then end time.

IOALPPRXXBD_ALPGLGENFAALL 06/26/2017 23:40:40 06/26/2017 23:49:19
IOALPPRXXBD_ALPGLGENFAALL 06/27/2017 23:40:23 06/27/2017 23:48:19
IOALPPRXXBD_ALPGLGENFAALL 06/28/2017 23:41:36 06/28/2017 23:48:47

Could somebody help me with a shell script for this requirement? I am using ksh in an AIX environment.

jim_mcnamara · July 27, 2017, 8:16am

Good, well formed question for a first effort.

What have you tried so far?
Can you show us where you are stuck? The point of the forums is to help you learn, not to write code for you.

Sandeep_Behera · July 28, 2017, 7:01am

#!bin/ksh

job_name=$1
echo $job_name

ls /Autosys/CA/UnicenterAutoSysJM/autouser.ACE/out/event_demon.ACE* | sort -r | awk -F "/" '{print $7}' > logfile.txt

path="/Autosys/CA/UnicenterAutoSysJM/autouser.ACE/out"


 while read line;

 do

                            zgrep $job_name $path/$line  | sed -e 's/\[/ /' -e 's/\]/ /'  >> feoj.txt



  done < logfile.txt

cat feoj.txt | sort -k1 | awk '/START|SUCCESS/ { print $1,$2}'

Thanks for the reply. I have tried this, but the output is in one column. Please give me some hint so that I can try.

rbatte1 · July 28, 2017, 7:48am

A few pointers to tidy up first:-

Please wrap CODE tags around code, files, input & output/errors.
The append to feoj.txt can be added to the end of the loop
There is no need to use cat to pipe the file, just name the file as part of the command
Putting the awk before the sort will mean sort has less processing to do.
If you don't need to store the file, you can avoid the IO altogether.
What are the full filenames for the log files? The logfiles should be in order if they have a correct timestamp in the name.
If they are just aged by moving them through the suffixes of -1 , -2 , -3 etc., then we can adjust for that.

Doing just these (but not removing the final sort) gives me this:-

#!bin/ksh

job_name=$1
echo $job_name

path="/Autosys/CA/UnicenterAutoSysJM/autouser.ACE/out"

cd $path                        # ..... or 'pushd $path' if you prefer so you can 'popd' later if needed

for filename in event_demon.ACE*
do
   zgrep $job_name $filename  | sed -e 's/\[/ /' -e 's/\]/ /'
done  | awk '/START|SUCCESS/ { print $1,$2}' | sort -k1

Now that it's a bit neater, where is it going wrong?

Kind regards,
Robin

bakunin · July 28, 2017, 8:01am

I am quite enthused you left something over for me, dear colleague:

#!bin/ksh

should of course be:

#! /bin/ksh

I hope this helps.

bakunin

Sandeep_Behera · July 28, 2017, 8:16am

Yes, it works, but my concern is with the output format.

How can I display the values of the variables side by side, like below:

Job Name                                  Start Time                 End Time
IOALPPRXXBD_ALPGLGENFAALL 06/26/2017 23:40:40 06/26/2017 23:49:19
IOALPPRXXBD_ALPGLGENFAALL 06/27/2017 23:40:23 06/27/2017 23:48:19
IOALPPRXXBD_ALPGLGENFAALL 06/28/2017 23:41:36 06/28/2017 23:48:47

rbatte1 · July 28, 2017, 8:19am

Can you show us (wrapped in CODE tags please) what you get so far. We can then think about how to progress. Otherwise we're a bit blind.

What do you want to happen if there is not a closing SUCCESS record in the log file?

Regards,
Robin

Sandeep_Behera · July 29, 2017, 8:33am

Below is the output, wrapped in CODE:

IOALPPRXXBD_ALPGLGENFAALL
06/26/2017 23:40:40
06/26/2017 23:49:19
06/27/2017 23:40:23
06/27/2017 23:48:19
06/28/2017 23:41:36
06/28/2017 23:48:47
06/29/2017 23:40:55
06/29/2017 23:50:27
07/01/2017 00:34:52
07/01/2017 00:45:50
07/04/2017 00:27:59
07/04/2017 00:36:18
07/05/2017 23:40:47
07/05/2017 23:48:56
07/06/2017 23:40:41
07/06/2017 23:50:15
07/07/2017 23:40:48
07/07/2017 23:51:10
07/10/2017 23:41:03
07/10/2017 23:54:41
07/11/2017 23:40:30
07/11/2017 23:52:19
07/12/2017 23:40:43
07/12/2017 23:48:29
07/13/2017 23:41:04
07/13/2017 23:49:59
07/14/2017 23:40:44
07/14/2017 23:50:41
07/17/2017 23:40:45
07/17/2017 23:49:50
07/18/2017 23:40:52
07/18/2017 23:48:43
07/19/2017 23:40:26
07/19/2017 23:49:10
07/20/2017 23:40:46
07/20/2017 23:50:11
07/21/2017 23:40:48
07/21/2017 23:51:21
07/24/2017 23:40:25
07/24/2017 23:49:04
07/25/2017 23:40:44
07/25/2017 23:48:21
07/26/2017 23:41:09
07/26/2017 23:48:30
07/27/2017 23:40:35
07/27/2017 23:50:31
07/28/2017 23:41:23
07/28/2017 23:51:17

But how can I display the values side by side? I am getting the data from a logfile that records when the job starts and when it ends, so the output should be:

Job Name                               Start Time                 End Time
IOALPPRXXBD_ALPGLGENFAALL 06/26/2017 23:40:40 06/26/2017 23:49:19
IOALPPRXXBD_ALPGLGENFAALL 06/27/2017 23:40:23 06/27/2017 23:48:19
IOALPPRXXBD_ALPGLGENFAALL 06/28/2017 23:41:36 06/28/2017 23:48:47

Any hint please ?

bakunin · July 29, 2017, 10:18pm

This is actually quite easy, you just need to think things through first. Let's go back to your original file:

06/26/2017 23:40:40       CAUAJM_I_10082 [aspsun14 connected for IOALPPRXXBD_ALPGLGENFAALL 55443.15215291.1]
 06/26/2017 23:40:40       CAUAJM_I_40245 EVENT: CHANGE_STATUS    STATUS: STARTING        JOB: IOALPPRXXBD_ALPGLGENFAALL MACHINE: aspsun14
 06/26/2017 23:40:42       CAUAJM_I_40245 EVENT: CHANGE_STATUS    STATUS: RUNNING         JOB: IOALPPRXXBD_ALPGLGENFAALL MACHINE: aspsun14
 06/26/2017 23:49:19       CAUAJM_I_40245 EVENT: CHANGE_STATUS    STATUS: SUCCESS         JOB: IOALPPRXXBD_ALPGLGENFAALL MACHINE: aspsun14        EXITCODE:  0
 06/27/2017 23:40:23       CAUAJM_I_40245 EVENT: CHANGE_STATUS    STATUS: STARTING        JOB: IOALPPRXXBD_ALPGLGENFAALL MACHINE: aspsun14
 06/27/2017 23:40:24       CAUAJM_I_10082 [aspsun14 connected for IOALPPRXXBD_ALPGLGENFAALL 55443.15236942.1]
 06/27/2017 23:40:25       CAUAJM_I_40245 EVENT: CHANGE_STATUS    STATUS: RUNNING         JOB: IOALPPRXXBD_ALPGLGENFAALL MACHINE: aspsun14
 06/27/2017 23:48:19       CAUAJM_I_40245 EVENT: CHANGE_STATUS    STATUS: SUCCESS         JOB: IOALPPRXXBD_ALPGLGENFAALL MACHINE: aspsun14        EXITCODE:  0
 06/28/2017 23:41:36       CAUAJM_I_40245 EVENT: CHANGE_STATUS    STATUS: STARTING        JOB: IOALPPRXXBD_ALPGLGENFAALL MACHINE: aspsun14
 06/28/2017 23:41:37       CAUAJM_I_10082 [aspsun14 connected for IOALPPRXXBD_ALPGLGENFAALL 55443.15258301.1]
 06/28/2017 23:41:38       CAUAJM_I_40245 EVENT: CHANGE_STATUS    STATUS: RUNNING         JOB: IOALPPRXXBD_ALPGLGENFAALL MACHINE: aspsun14
 06/28/2017 23:48:47       CAUAJM_I_40245 EVENT: CHANGE_STATUS    STATUS: SUCCESS         JOB: IOALPPRXXBD_ALPGLGENFAALL MACHINE: aspsun14        EXITCODE:  0

Of this file you only need the lines containing "STATUS: STARTING" and some ending notification. This could perhaps be "STATUS: SUCCESS", but somehow I don't believe that all jobs end that way. What other possible ending codes do you have? Your script will need to cover them too.

Let us go on, assuming for the moment that the only ending condition is "SUCCESS"; this can be corrected later.

Next we make up some "rules" for what to do with the respective lines, because at some point we need to read our input line by line:

When we encounter a "STARTING" line we need to remember two things, the job name and the timestamp.
When we encounter a "SUCCESS" line we also need to read two things: the job name and the timestamp. But we don't need to store ("remember") them: the job name should be searched for in our list of remembered (=started) jobs. If it is found, the job name, the stored starting time and the end time are printed.
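The two rules above can also be sketched in awk, which has associative arrays built in. This is only a sketch under assumptions: the function name pair_jobs is made up for the example, and the field positions ($1 date, $2 time, $7 status, $9 job name) are taken from the sample log above and may differ in other log layouts.

```shell
# pair_jobs [FILE...] : hypothetical helper, reads stdin when no file is given
pair_jobs() {
    awk '
        # rule 1: on a STARTING line, remember the timestamp under the job name
        $6 == "STATUS:" && $7 == "STARTING" { start[$9] = $1 " " $2 }

        # rule 2: on a SUCCESS line, look the job up and print name, start, end
        $6 == "STATUS:" && $7 == "SUCCESS" && ($9 in start) {
            print $9, start[$9], $1, $2
            delete start[$9]          # forget the job until it starts again
        }
    ' "$@"
}

# demo with three lines from the sample log
pair_jobs <<'EOF'
 06/26/2017 23:40:40       CAUAJM_I_40245 EVENT: CHANGE_STATUS    STATUS: STARTING        JOB: IOALPPRXXBD_ALPGLGENFAALL MACHINE: aspsun14
 06/26/2017 23:40:42       CAUAJM_I_40245 EVENT: CHANGE_STATUS    STATUS: RUNNING         JOB: IOALPPRXXBD_ALPGLGENFAALL MACHINE: aspsun14
 06/26/2017 23:49:19       CAUAJM_I_40245 EVENT: CHANGE_STATUS    STATUS: SUCCESS         JOB: IOALPPRXXBD_ALPGLGENFAALL MACHINE: aspsun14        EXITCODE:  0
EOF
# prints: IOALPPRXXBD_ALPGLGENFAALL 06/26/2017 23:40:40 06/26/2017 23:49:19
```

Deleting the remembered start after each match matters here, because the same job name shows up again every day.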

The last point begs two questions: what are we going to do if we encounter a job with a start but no end? And what are we going to do with jobs with an end but no start?
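One possible answer to both questions, as a sketch only: report the unmatched records with a placeholder instead of dropping them silently. The function name report_jobs and the placeholders NO_START and NO_END are made up for this example, and the field positions are again assumed from the sample log.

```shell
# report_jobs [FILE...] : hypothetical helper, reads stdin when no file is given
report_jobs() {
    awk '
        $7 == "STARTING" {
            # a second start without an end in between: flag the older one
            if ($9 in start) print $9, start[$9], "NO_END"
            start[$9] = $1 " " $2
        }
        $7 == "SUCCESS" {
            if ($9 in start) {
                print $9, start[$9], $1, $2
                delete start[$9]
            } else {
                print $9, "NO_START", $1, $2      # end without a start
            }
        }
        # whatever is still remembered at end of input never finished
        END { for (job in start) print job, start[job], "NO_END" }
    ' "$@"
}
```

Whether such records should be printed, counted or ignored depends on what the report is used for, so treat the placeholders as one option among several.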

Here is the skeleton of a shell script that implements what I said above. It is neither very clever nor very mature; its intention is to make it obvious how to turn common reasoning like the above into code. It also won't take the questions raised above into account and implicitly assumes that all jobs end with success and that every starting job also ends and vice versa.

We start by filtering and displaying only the lines we are interested in:

#! /bin/ksh

typeset infile="/path/to/some/file"      # file we read from
typeset cond=""                               # condition of the job, "STARTING" or "SUCCESS"
typeset name=""                              # jobs name

grep -e "STATUS: SUCCESS" -e "STATUS: STARTING" "$infile" |\
while read junk junk junk junk junk junk cond junk name junk ; do
     echo "name is: $name   condition is: $cond"
done

exit 0

Now let that run and watch whether a) the lines are filtered correctly and b) the names and conditions are displayed correctly. You can let the shell split the input lines and distribute the fields to different variables (all the ones I am not interested in I name "junk" out of habit), but it is easy to get the number of fields wrong in long lines. So try and verify it before going on: it is good practice to make sure everything is correct so far before continuing.
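To see the field splitting in isolation, here is a small sketch that feeds one line from the sample log to read. The function name split_line is made up for the example. read hands one whitespace-separated field to each variable in turn, and the last variable receives whatever is left of the line.

```shell
line=' 06/26/2017 23:40:40       CAUAJM_I_40245 EVENT: CHANGE_STATUS    STATUS: STARTING        JOB: IOALPPRXXBD_ALPGLGENFAALL MACHINE: aspsun14'

# split_line LINE : hypothetical helper that shows which field lands where
split_line() {
    printf '%s\n' "$1" | {
        # fields 3-6 and 8 are overwritten into "junk"; the final "junk" takes the rest
        read -r date time junk junk junk junk cond junk name junk
        printf '%s|%s|%s|%s\n' "$date" "$time" "$cond" "$name"
    }
}

split_line "$line"
# prints: 06/26/2017|23:40:40|STARTING|IOALPPRXXBD_ALPGLGENFAALL
```

If the name or the condition comes out wrong, the number of junk variables in front of it does not match the line layout.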

Next thing is to implement the "rules" we identified above. First the STARTING-lines:

We need to remember the date and time, so two of the fields we were not interested in before now get their own variables (junk->date, junk->time).

#! /bin/ksh

typeset infile="/path/to/some/file"      # file we read from
typeset date=""                               # jobs date
typeset time=""                               # jobs time
typeset cond=""                               # condition of the job, "STARTING" or "SUCCESS"
typeset name=""                              # jobs name
count=1                                          # counter for the storage arrays
# aname[]                                      # these arrays hold the remembered jobs
# atime[]

grep -e "STATUS: SUCCESS" -e "STATUS: STARTING" "$infile" |\
while read date time junk junk junk junk cond junk name junk ; do

     echo "name is: $name   condition is: $cond  time is: $date $time"       # we leave that in for now, to see what the script works on

     case $cond in
          STARTING)
               typeset aname[$count]="$name"
               typeset atime[$count]="$date $time"
               (( count += 1 ))
               ;;

     esac
done

exit 0

You see we just add to the arrays when we find a new job, all very easy. The next rule is a little trickier. Upon encountering a "SUCCESS" line we need to search our arrays for the respective job entry, then print both starting and ending times:

#! /bin/ksh

typeset    infile="/path/to/some/file"      # file we read from
typeset    date=""                               # jobs date
typeset    time=""                               # jobs time
typeset    cond=""                               # condition of the job, "STARTING" or "SUCCESS"
typeset    name=""                              # jobs name
typeset -i count=1                               # counter for the newest element of the storage arrays
typeset -i i=1                                     # counter for searching the storage arrays
# aname[]                                      # these arrays hold the remembered jobs
# atime[]

grep -e "STATUS: SUCCESS" -e "STATUS: STARTING" "$infile" |\
while read date time junk junk junk junk cond junk name junk ; do

     echo "name is: $name   condition is: $cond  time is: $date $time"       # we leave that in for now, to see what the script works on

     case $cond in
          STARTING)
               typeset aname[$count]="$name"
               typeset atime[$count]="$date $time"
               (( count += 1 ))
               ;;

          SUCCESS)
               (( i = count - 1 ))                              # start at the newest stored entry
               while (( i >= 1 )) ; do                          # walk back through the array
                    if [ "${aname[$i]}" = "$name" ] ; then     # we found the most recent start of this job
                         print "${aname[$i]} \t${atime[$i]} \t $date $time"
                         break                                  # stop here, otherwise the loop would never end
                    fi
                    (( i -= 1 ))
               done
               ;;

     esac
done

exit 0

Again, this is not meant to be put in production as it is. But analysing how we arrived at it and how it works should give you the idea how to create a proper script for your purpose.
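One possible refinement, since the desired output has a header row: printf with fixed field widths keeps the columns aligned. This is a sketch only; the function name print_report and the widths 30 and 20 are assumptions for the example, and the input is expected to be lines of the form "job start_date start_time end_date end_time".

```shell
# print_report : hypothetical helper, reads paired job lines from stdin
print_report() {
    fmt='%-30s %-20s %-20s\n'           # left-aligned columns, widths chosen arbitrarily
    printf "$fmt" "Job Name" "Start Time" "End Time"
    while read -r job sdate stime edate etime; do
        printf "$fmt" "$job" "$sdate $stime" "$edate $etime"
    done
}

print_report <<'EOF'
IOALPPRXXBD_ALPGLGENFAALL 06/26/2017 23:40:40 06/26/2017 23:49:19
IOALPPRXXBD_ALPGLGENFAALL 06/27/2017 23:40:23 06/27/2017 23:48:19
EOF
```

The same format string could replace the print statement inside the loop of the script above.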

I hope this helps.

bakunin

Sandeep_Behera · August 11, 2017, 7:31am

Sorry for the late response; I was away because of a personal emergency. Thanks for your help. I am reviewing this now and will update you soon. Thanks again.

---------- Post updated at 07:31 AM ---------- Previous update was at 07:03 AM ----------

Your above reply is very much correct and matches the exact requirement.

I tried the above script, but it searches for STARTING and SUCCESS for all jobs, whereas I need this for a specific job.
Also, could you please tell me what junk means?
Can I use the job name as $1 (first argument), so that it can change as required?

#! /bin/ksh

name=$1

typeset    infile="/home/sbehera/out/event_demon.ACE*"      # file we read from
typeset    cond="SUCCESS"                               # condition of the job, "STARTING" or "SUCCESS"
typeset -i count=1                               # counter for the newest element of the storage arrays
typeset -i i=1                                     # counter for searching the storage arrays

echo $infile $cond $name $count $i


zgrep ""STATUS: SUCCESS"  "STATUS: STARTING"" "$infile" |\
while read date time junk junk junk junk cond junk name junk ; do

     echo "name is: $name   condition is: $cond  time is: $date $time"       # we leave that in for now, to see what the script works on


case $cond in
          STARTING)
               typeset aname[$count]="$name"
               typeset atime[$count]="$date $time"
               (( count += 1 ))
               ;;

          SUCCESS)
               (( i = 1 ))
               while [ $i -lt ${#aname[@]} ] ; do               # walk though the array
                    if [ "${aname[$i]}" = "$name" ] ; then     # we found the corresponding entry
                         print "${aname[$i]} \t${atime[$i]} \t $date $time"
                    else
                         (( i += 1 ))
                    fi
               done
               ;;

     esac
done

exit 0




Output:

$ sh run.sh IOXXXXXMO_XXXILEO
/home/sbehera/out/event_demon.ACE 
/home/sbehera/out/event_demon.ACE.07092017.gz 
/home/sbehera/out/event_demon.ACE.07102017.gz 
/home/sbehera/out/event_demon.ACE.07112017.gz 
/home/sbehera/out/event_demon.ACE.07122017.gz 
/home/sbehera/out/event_demon.ACE.07132017.gz 
/home/sbehera/out/event_demon.ACE.07142017.gz 
/home/sbehera/out/event_demon.ACE.08102017.gz SUCCESS IOXXXXXMO_XXXILEO 1 1
SUCCESS  STATUS:.gz: No such file or directory
STARTING.gz: No such file or directory
/home/sbehera//out/event_demon.ACE*.gz: No such file or directory

RudiC · August 11, 2017, 8:07am

Some comments:

Yes, you can introduce $1 as the job name to be discriminated in the input stream. Use if - then - else constructs for the evaluation.
Don't double-use variables (here: name), as it leads at least to confusion, if not to malfunction of the script.
The inconsistent / faulty double quoting in the zgrep command leads to the "No such file or directory" errors: it causes "SUCCESS  STATUS:" to be the second parameter and thus interpreted as a file name.
junk is a dummy variable to read and discard unneeded fields in the input line.

EDIT: It is not the DOS line terminator problem I first suspected; corrected above.
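Put together, the comments above might look like the following sketch. The names job_report and wanted are made up for the example. It uses plain grep so that it runs on uncompressed test data; for the .gz logs the thread's zgrep would take its place, assuming it accepts several -e options the same way. Note also that a glob inside double quotes (as in "$infile") is not expanded by the shell, which explains the last error line in the output above.

```shell
# job_report JOB FILE... : hypothetical helper
job_report() {
    wanted=$1                 # job to report on, kept apart from the loop variable "name"
    shift
    for file in "$@"; do
        # each pattern gets its own -e and its own pair of quotes
        grep -e "STATUS: STARTING" -e "STATUS: SUCCESS" "$file"
    done |
    while read -r date time junk junk junk junk cond junk name junk; do
        if [ "$name" = "$wanted" ]; then          # skip every other job
            printf '%s %s %s %s\n' "$name" "$cond" "$date" "$time"
        fi
    done
}
```

A call such as job_report "$1" /home/sbehera/out/event_demon.ACE* leaves the glob unquoted so that the shell expands it into the individual log files.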