Restarting shell script in case of a failure

I have a shell script which sqoops data from one place to hive and it does in 2 groups list. If one group is completed it waits for the second to get completed then goto to orc load, but it it fails it kills the other groups and sends a fail email. What I was looking for is if 1 group fails, it should restart that failed group instead of killing the other groups, thereby saving time in rerunning the whole job.

Following is the code:
What i was looking for is, instead of killing the group 2 process it should rerun the shell script(The sh ${HOME_DIR}/Sqoop.SH line) to avoid re-running the whole job. Request you to ask in case the question is unclear.

just an idea: instead of having a loop monitoring the jobs while they run, add a wait after the background jobs are submitted then add a loop to look for failed jobs in the log files and resubmit only those jobs. Keep looping until all jobs complete normally.

1 Like