Need to extract JIL file details into an Excel sheet

I am very new to shell scripting.

I have an AutoSys JIL file that looks like this:

/* ------------- JOB1 ------------------ */

insert_job: JOB1    job_type:  b
owner:     cm@pelonmuck
permission: gx,ge,wx,we,mx,me
date_conditions: 1
days_of_week: mo,tu,we,th,fr,su
start_time: "18:30"
box_success: s(SOME_JOB1) and s(SOME_JOB2) and s(SOME_JOB3)
box_failure: (f(SOME_JOB1) or f(SOME_JOB2)) & f(SOME_JOB3)
description: "pull files"
max_run_alarm: 15
alarm_if_fail: 1
timezone: US/Eastern


/* ------------- JOB2 ------------------ */

insert_job: JOB2    job_type:  c
box_name: SOME_BOX_NAME
command: /usr/bin/run /usr/cache/START_JOB
machine: machine@conti.com
owner:     cm@pelonmuck        
permission: gx,ge,wx,we,mx,me
days_of_week: sa,su
description: "pull all files"
max_run_alarm: 15
alarm_if_fail: 1
timezone: GMT

I need to create a shell script that produces output in Excel/CSV format like below:

insert_job,machine,date_conditions,days_of_week,start_time,timezone,description,command,alarm_if_fail
JOB1,,1,mo,tu,we,th,fr,su,"18:30",US/Eastern,"pull files",,1
JOB2,machine@conti.com,,sa,su,,GMT,"pull all files",/usr/bin/run /usr/cache/START_JOB,1

Could you please help me build this?

Any attempts / ideas / thoughts from your side?

Hi RudiC, yes, I guess we need to parse the file first, look for the required column names, and then pull out the value next to each one.

What have you tried?

I have got quite close thanks to an existing thread:

awk -F ' *[[:alnum:]_]*: *' 'BEGIN         {h="insert_job;box_name;command;owner;permission;condition;description;std_out_file;std_err_file;alarm_if_fail"; print h; n=split(h,F,/;/)}
                             function pr() {if(F[1] in A) {for(i=1;i<=n;i++)printf "%s%s",A[F[i]],(i<n)?";":RS}}
                             /insert_job/  {pr(); delete A}
                                           {for(i in F){if($0~"^"F[i])A[F[i]]=$2}}
                             END           {pr()}' infile > outfile.csv

The issue with the above code is that it gives me the output below:

insert_job,machine,date_conditions,days_of_week,start_time,timezone,description,command,alarm_if_fail
JOB1;;1,mo,tu,we,th,fr,su;";US/Eastern;"pull files";;1
JOB2;machine@conti.com;;sa,su;;GMT;"pull all files";/usr/bin/run /usr/cache/START_JOB;1

It is not able to print the start_time in the first row, which should be "18:30".
Any idea how to print the start_time value, which is in a numeric format?
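
A likely cause, shown as a minimal sketch: the field separator regex matches any "word plus colon", so the `18:` inside the quoted time is treated as a second separator and `$2` is left holding only the opening quote. The helper names `split_demo` and `value_after_key` are hypothetical, and the separator is written with explicit ranges as an assumed equivalent of `[[:alnum:]_]`.

```shell
# split_demo is a hypothetical helper: it shows how awk splits one line
# when the field separator is the regex from the posted attempt
# (written with explicit ranges instead of [[:alnum:]_]).
split_demo() {
    printf '%s\n' "$1" |
    awk -F ' *[A-Za-z0-9_]*: *' '{ printf "NF=%d second=[%s]\n", NF, $2 }'
}

# value_after_key is also hypothetical: it cuts at the first colon only,
# so colons inside the value survive.
value_after_key() {
    printf '%s\n' "$1" |
    awk '{ v = substr($0, index($0, ":") + 1); sub(/^[ \t]+/, "", v); print v }'
}

split_demo 'start_time: "18:30"'        # NF=3 second=["]
value_after_key 'start_time: "18:30"'   # "18:30"
```

Cutting at the first colon, rather than letting a regex separator split the whole line, is one way to keep values such as times intact.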

Try

awk -F: '
NR==1           {HD = "insert_job,machine,date_conditions,days_of_week,start_time,timezone,description,command,alarm_if_fail"
                 for (HDCnt=i=split(HD, HDArr, OFS); i>0; i--) SRCH[HDArr[i]]
                 print HD
                }

function PRT()  {for (i=1; i<=HDCnt; i++)       {printf "%s%s", RES[HDArr[i]], i<HDCnt?OFS:ORS
                                                }
                 split ("", RES)
                }

/--- JOB/       {if (PR) PRT()
                 PR=1
                }

$1 in SRCH      {T = $1
                 sub ($1 FS " *", "")
                 sub (/  +.*$/, "")
                 RES[T] = $0
                }

END             {PRT()
                }
' OFS=","  file
insert_job,machine,date_conditions,days_of_week,start_time,timezone,description,command,alarm_if_fail
JOB1,,1,mo,tu,we,th,fr,su,"18:30",US/Eastern,"pull files",,1
JOB2,machine@conti.com,,sa,su,,GMT,"pull all files",/usr/bin/run /usr/cache/START_JOB,1

Thanks, I have one final hurdle.

The script you gave above works like magic, but only where the JIL details do not have leading spaces.
For example, for a JIL file containing the details below:

  /* ------------- JOB1 ------------------ */

  insert_job: JOB1	job_type:  b
  owner: 	cm@pelonmuck
  permission: gx,ge,wx,we,mx,me
  date_conditions: 1
  days_of_week: mo,tu,we,th,fr,su
  start_time: "18:30"
  box_success: s(SOME_JOB1) and s(SOME_JOB2) and s(SOME_JOB3)
  box_failure: (f(SOME_JOB1) or f(SOME_JOB2)) & f(SOME_JOB3)
  description: "pull files"
  max_run_alarm: 15
  alarm_if_fail: 1
  timezone: US/Eastern

The output will contain nothing but the header because of the leading spaces. My input file has leading spaces in multiple places (never more than 2 or 3), and those entries are simply skipped by the script.
Can you please help?

Try

awk -F: '
NR==1           {HD = "insert_job,machine,date_conditions,days_of_week,start_time,timezone,description,command,alarm_if_fail"
                 for (HDCnt=i=split(HD, HDArr, OFS); i>0; i--) SRCH[HDArr[i]]
                 print HD
                }

function PRT()  {for (i=1; i<=HDCnt; i++)       {printf "%s%s", RES[HDArr[i]], i<HDCnt?OFS:ORS
                                                }
                 split ("", RES)
                }

/--- JOB/       {if (PR) PRT()
                 PR=1
                }

                {sub ("^  *", _)
                }

$1 in SRCH      {T = $1
                 sub ($1 FS " *", "")
                 sub (/[	 ][	 ]+.*$/, "")
                 RES[T] = $0
                }

END             {PRT()
                }
' OFS=","  file

Please be aware that the structure of the line containing insert_job differs from the data in post #1 (<TAB>-separated, not spaces), so splitting off the job_type doesn't work any more.
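
One possible way around the tab issue, as a sketch only: match the literal `job_type:` attribute after any run of spaces or tabs instead of relying on two consecutive blanks. The helper name `job_name` is hypothetical, and the sketch assumes `job_type` always follows the job name on the same line, as in both samples.

```shell
# job_name is a hypothetical helper: it prints the job name from an
# insert_job line, whether job_type is separated by spaces or by a tab.
job_name() {
    printf '%s\n' "$1" | awk '
        { sub(/^[ \t]+/, "") }                # drop leading spaces or tabs
        /^insert_job:/ {
            sub(/^insert_job:[ \t]*/, "")     # remove the key and following blanks
            sub(/[ \t]+job_type:.*$/, "")     # remove the trailing job_type attribute
            print
        }'
}

job_name 'insert_job: JOB1    job_type:  b'
job_name "$(printf '  insert_job: JOB1\tjob_type:  b')"
```

Both calls should print the bare job name, since the separator before `job_type:` no longer matters.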

Hi Rudi, would it be possible for you to explain how this script works? I'm still learning awk and there's a lot I don't understand in this script.
Many thanks

That proposal is only a slight adaptation of the script you posted as your attempt. How about you try to explain it here, and we jump in on the gaps that you can't cover?
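
As a starting point for such a walk-through, here is an annotated variant of the script, offered as a sketch rather than the exact code posted above. The wrapper name `jil_to_csv` is hypothetical, the header is built in BEGIN instead of NR==1, and the two substitutions are rewritten under the assumption that each attribute sits on its own line and that every job is preceded by a `/* --- JOBn --- */` banner.

```shell
# jil_to_csv is a hypothetical wrapper; pass the JIL file as its first argument.
jil_to_csv() {
    awk -F: -v OFS=, '
    BEGIN {
        HD = "insert_job,machine,date_conditions,days_of_week,start_time,timezone,description,command,alarm_if_fail"
        HDCnt = split(HD, HDArr, OFS)                    # HDArr[1..HDCnt] keeps the column order
        for (i = 1; i <= HDCnt; i++) SRCH[HDArr[i]] = 1  # SRCH is a lookup set of wanted keys
        print HD
    }
    function PRT(   i) {                                 # print one CSV row, then clear RES
        for (i = 1; i <= HDCnt; i++)
            printf "%s%s", RES[HDArr[i]], (i < HDCnt ? OFS : ORS)
        split("", RES)
    }
    /--- JOB/ { if (PR) PRT(); PR = 1 }                  # a banner flushes the previous job
    { sub(/^[ \t]+/, "") }                               # strip leading blanks; awk re-splits the fields
    $1 in SRCH {                                         # the text before the first colon is a wanted key
        T = $1
        sub(/^[^:]*:[ \t]*/, "")                         # drop the key, the colon and blanks
        sub(/[ \t]+job_type:.*$/, "")                    # drop job_type on the insert_job line
        sub(/[ \t]+$/, "")                               # drop trailing blanks
        RES[T] = $0
    }
    END { PRT() }                                        # flush the last job
    ' "$1"
}
```

Called as `jil_to_csv infile > outfile.csv`. Note that days_of_week values contain commas themselves, so a spreadsheet may split them across several columns unless a different OFS such as `;` is chosen.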