production issue - shell sqlplus processing sometime success sometime fail

Hi Expert,

Below is a real production environment issue:

we are using shell script to FTP to a remote server and fetch around 150 files every day, for each file we need to keep a entry inside ORACLE DB table, before insert into table, each file has a associated logid, which need to be initialed from DB side, so in summary, for each file we need two SQLplus communication, one is to initial log and another is to keep entry.

Now the problem is sometimes all the files successfully kept entries in DB, sometimes failed with none of them been kept in DB entries.

Our shell script structure is as below:

initial some configuration items ....
sqlplus db to get the configuration for remote server information
base on the configuration to get the remote file list
call another FTP script to pull files back to a well-known diretory
for each file under this diretory, loop and keep db entries

in order to reduce the DB connection, we are using below to only open one DB connection

SQLPLUS -S user/pass@service |&

then use print -p and read -p to communicate

Now we are wondering whether it is due to the DB PROCESSES parameter or we have any other potential issues not aware yet.

Attached is the main shell script.

Hope someone can help us figure out the root cause, we have been suffering this issue on production for a long time. Even worse thing is that we can not reproduce it on testbed and even production-like testbed.

Can't we (you) add the logging in the script?
There are couple of instances where conditional exit has been used. also for some cases there is no message before the exit.
Additionally, sqlplus output is also redirected to /dev/null.

It is really hard to guess what can go wrong, as it may be ftp part, may be sqlplus session etc.

AFAIK, the best way to check is to redirect the stdout and stderr to a file (before that it is good to put the message before all the exit statement and don't nullify any of the stdout/stderr inside the script ).

We can still wait for expert's replies.

Can you check if there are any env different between prod and non-prod env?

for example,run "env" to get all setting.

Second, do you run it by /bin/ksh -x your script? with that, you can know where it is stopped or where is the problem. Paste the error here

Third. "set -x" is useful in your script debug. Turn it on, and test by ksh -x again.