Trap the EXIT_CODE from a script

Hi All,

I have a script which calls SQLPLUS and does some data cleanup. But sometimes the SQL hangs and the script keeps on running.
In that case we kill the script using the "kill" command, but as soon as we kill it, the script exits with a non-zero exit code, which makes the job fail (we check for a non-zero exit code).
We want to avoid this. Is there any way to trap the exit code and change it to 0 so that the job completes successfully?
I want to accomplish this without making changes to the script; I would like to do it from outside, from the same Unix console.

Can you please let me know if it's possible, e.g. using the trap command?

Could you share the script, or at least a cut-down version that illustrates your issue? (perhaps with a long sleep in it)

If not, then it's a bit difficult to help.

Kind regards,
Robin

I'm afraid you won't be able to do it without changing the script. Yes, you can trap e.g. the SIGTERM signal and exit gracefully, but you need to make sure you intercept all ramifications.
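
For illustration, here is a minimal sketch of the trap approach, assuming the script can be edited after all. The real script was not posted, so `sleep` stands in for the sqlplus call, and `run_cleanup` is a made-up name.

```shell
#!/bin/sh
# Sketch only: run_cleanup is a hypothetical stand-in for the unposted script.
run_cleanup() {
    # On SIGTERM: stop the child, then exit 0 so the caller sees success
    trap 'kill "$child" 2>/dev/null; exit 0' TERM
    sleep 600 &        # placeholder for the hanging sqlplus call
    child=$!
    wait "$child"      # wait is interrupted when a trapped signal arrives
}

# Demo: start it in the background, kill it, and look at the exit status
run_cleanup &
job=$!
sleep 1
kill "$job"
wait "$job"
status=$?
echo "exit status after kill: $status"
```

This only covers SIGTERM, the default signal sent by kill. A kill -9 (SIGKILL) cannot be trapped, so the script would still end with a non-zero status in that case.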

Good comment from Robin.

Instead of SIGTERM you can use another signal.
The following is only a sample:
C or C++ code using the SIGUSR1 signal

  1. When it receives SIGUSR1, it writes information to a log file.
  2. It executes the script using a system call in a child process.
  3. The parent process waits for the child process.
  4. When the parent process receives the SIGUSR1 signal, it writes the current status to the log file.
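
The same idea can be sketched in shell rather than C. This is not the poster's code, only a rough equivalent: `supervise` and the log messages are invented for the example, and `sleep` stands in for the real script.

```shell
#!/bin/sh
# Sketch only: supervise and the log file layout are assumptions.
logfile=$(mktemp)

supervise() {
    # On SIGUSR1, append a status line instead of terminating
    trap 'echo "child $child still running" >> "$logfile"' USR1
    sleep 2 &          # placeholder for the real script
    child=$!
    # wait returns early when a trapped signal arrives,
    # so keep waiting until the child has really gone
    while kill -0 "$child" 2>/dev/null; do
        wait "$child"
    done
    echo "child finished" >> "$logfile"
}

# Demo: start the supervisor, ask it for a status report, let it finish
supervise &
sup=$!
sleep 1
kill -USR1 "$sup"
wait "$sup"
cat "$logfile"
```

Because SIGUSR1 is sent to the supervising process and not to the script itself, the script keeps running and its exit status is unaffected.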

I beg to differ: the script in question has to be called from somewhere to be started. Write a "wrapper script" which calls your script and always exits with 0 regardless of what happens. Then replace the call of the original script with the wrapper script. The exit code of the original script would be reported to the wrapper script and nobody else. If the wrapper chooses to ignore it, then it is ignored.

To be honest, doing it by changing the original script as RudiC suggested would be by far the cleaner and better solution still. But in case you can't change the script for whatever reason this might be a work-around.
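
A minimal sketch of such a wrapper, written as a function so it can be tried directly. The name `run_ignoring_status` is made up, and the inner `sh -c` command only simulates a script that gets killed.

```shell
#!/bin/sh
# Sketch only: run_ignoring_status is a hypothetical name.
run_ignoring_status() {
    "$@"                       # run the original command with its arguments
    rc=$?                      # real status, kept for logging only
    echo "original exit status: $rc (ignored)" >&2
    return 0                   # the caller always sees success
}

# Demo: the inner shell kills itself with SIGTERM, as an operator's kill would
run_ignoring_status sh -c 'kill -TERM $$'
wrapper_status=$?
echo "wrapper status: $wrapper_status"
```

As a stand-alone wrapper file, the body would be the same with `exit 0` instead of `return 0`. One caveat: the kill has to be aimed at the original script's process. If the wrapper itself is killed, it ends with a non-zero status like any other process unless it also traps the signal.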

I hope this helps.

bakunin

Yes, that - if I read post#1 correctly - is a job, i.e. a script, program, or cron entry, calling the script in question, neither of which shalt be modified.

The only way to do so would be to rename the original script and put a wrapper script with its name in its place, and then call the new name - horrifying picture to be documented.

But - as long as we don't get additional context info all this is sheer guesswork.

Of course, renaming the original might cause problems. I had to intercept calls to usermod etc. on an old HPUX server once so we could log what was being called. The original (compiled) code was hard-linked for several functions (usermod, useradd, etc.) and reacted differently depending on what name was used to call it.

Perhaps we could suggest many ways of accomplishing this (e.g. with trap or other coding changes) but until we can see what is being called, we're all going to be stuck.

Assuming something checks the value of $? then you could simply add a : or true after the call that may get killed off. Of course, then you don't know when it may have genuinely failed. We need to see (enough of) your code to give you something useful.
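
As a small illustration of the effect on $? (the `sh -c 'exit 3'` command is only a stand-in for the call that may get killed):

```shell
#!/bin/sh
# 'sh -c "exit 3"' is a stand-in for the call that may get killed.
sh -c 'exit 3'
plain_status=$?          # status as the job check would see it today

sh -c 'exit 3'; true     # true (or :) runs last and always succeeds
masked_status=$?         # now $? reflects true, not the failed call

echo "plain=$plain_status masked=$masked_status"
```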

Kind regards,
Robin

The V$LOCK view shows which sessions hold or are waiting for locks. Most Oracle "hangs" are caused by another, usually interactive, process that holds a previous lock on a row. Your code performs DML (update, insert, etc.), which does implicit locking, so it is susceptible. Have your DBA clobber the offending process.

Example: We had lots of users who left for lunch in the middle of an update/insert Oracle form. Since we could not rewrite hundreds of forms to be better about transactions and locking, the DBA intervened and killed off the offending user process when a batch job got hung.

Consider DBA help.

Also, an Oracle code example of a deadlock workaround:
Programming applications to handle deadlocks