Command not found in shell script - stumped for 4 days

Hello,

I'd like to begin with :wall: ... literally. It has been 4 days and I have no idea how to fix this.

Environment - AIX 5.3

I wrote a script that calls ssh to log into another box via PKA (public key authentication) and run a command there.

If I run the script from the terminal, it works 100% of the time. If the customised SAP program calls this script to do the same thing, it works 50% of the time. The other 50% of the time, I get an error.

The problem is that /bin/ssh is sometimes "missing". Weird but true... and amazing...

I've tried specifying PATH, quoting /bin/ssh, and a lot of other stuff.....

#!/bin/sh
PATH=/bin:/sbin:/usr/bin:/usr/sbin:$PATH; export PATH
SSHCMD="/bin/ssh"

#SSH via PKA login is used to access FSI server
LOGFILE=/var/logs/Log.`date +%d%m%y.%H%M%S`
FILE=$1

printf "Received command to start process\nVariable passed is '$FILE'\nUser ID running this program is `whoami`\n"  > $LOGFILE

#Log into GROUP server via SSH and call second command
printf "\nLogging into GROUP server via SSH\n" >> $LOGFILE

ls -l $SSHCMD >> $LOGFILE 2>&1
$SSHCMD -l AdminX 1.1.1.99 "cmd /c D:/Scripts/DoSomething.bat '$FILE'" >> $LOGFILE 2>&1

The printf, echo and ls commands all work 100% of the time. The script stops (and only sometimes) because it seems it cannot find ssh...

The log file correctly shows the output of the printf and echo commands but ends with the line below:

/opt/script/encrypt.sh[15]: /bin/ssh:  not found.

I have been trawling the net for days hoping to understand why, and have run out of time.

What I most want to understand is why this script fails only some of the time and works the rest of the time.

Many thanks in advance.

J Phang

---------- Post updated at 11:44 PM ---------- Previous update was at 11:41 PM ----------

I have also run od -xc on the log file to be sure the error was not due to unreadable or invisible characters. It shouldn't be in any case, since such a problem would cause an error every time...

Thanks
J Phang

Is /bin/ssh a file, or a link? If it's a link, does it maybe point to a network mount that's not always available?
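
A minimal sketch of that check, assuming a POSIX-style sh. describe_path is a made-up helper name, not something from the original script. It looks at the directory as well as the file, because the link may be one level up (as far as I know, /bin on AIX is normally itself a link to /usr/bin).

```shell
#!/bin/sh
# Sketch: report whether a command path, or the directory holding it,
# is a symbolic link. describe_path is a hypothetical helper name.
describe_path() {
    target=$1
    parent=`dirname "$target"`
    for p in "$parent" "$target"; do
        if [ -h "$p" ]; then
            echo "$p: symbolic link"
            ls -ld "$p"      # long listing shows where the link points
        elif [ -e "$p" ]; then
            echo "$p: not a link"
            ls -ld "$p"
        else
            echo "$p: missing"
        fi
    done
}
```

Calling describe_path /bin/ssh from inside the failing script, with the output appended to the log file, would show what the SAP-started process sees rather than what an interactive login sees.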

Hello,

Many thanks for replying.

I've checked that..

Both /usr/bin/ssh and /bin/ssh are files with r-x permissions for owner, group and others.

/usr is a mount point but /bin is directly under root.

In my case, that wasn't the problem.

Thanks anyway
Cheers
J Phang

Can it be that you are using application servers and that SAP sometimes runs the script on one server and sometimes on another?

1) Maybe you have reached the maximum number of open files on your system and need a kernel rebuild.
You are particularly vulnerable if your job is running at the same time as something large like a backup.

2) Maybe backup software is locking the file. (long shot)

3) I prefer pludi's idea. Automounter involved?
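
For point 1, a rough sketch of how the script could record the limits it runs under. log_limits is a made-up name, and this assumes the shell's ulimit accepts -n and -a (most sh and ksh variants do, but it is not guaranteed). It only shows the per-process limit; the system-wide count of open files needs OS-specific tools that are not shown here.

```shell
#!/bin/sh
# Sketch: append the resource limits seen by this process to a log file.
# log_limits is a hypothetical helper; "ulimit -n" support is assumed.
log_limits() {
    logfile=$1
    {
        echo "open-file limit (ulimit -n): `ulimit -n`"
        echo "all limits (ulimit -a):"
        ulimit -a
    } >> "$logfile" 2>&1
}
```

Running it once interactively and once via SAP, then comparing the two logs, would show whether the SAP-started process gets a tighter limit.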

---------- Post updated at 16:26 ---------- Previous update was at 16:19 ----------

Also check the script file itself for funny characters.

sed -n l scriptfilename
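
To show what that command reveals, here is a small demonstration on a throwaway file rather than the real script. The file name and the demo_cr function are invented for the example. A line that ends in a carriage return shows up with a visible \r before the $ end-of-line marker.

```shell
#!/bin/sh
# Sketch: "sed -n l" makes a stray carriage return visible.
# demo_cr and the temporary file name are hypothetical.
demo_cr() {
    sample=${TMPDIR:-/tmp}/crdemo.$$
    # Write one line ending in CR LF, as a file edited on Windows might have.
    printf '/bin/ssh\r\n' > "$sample"
    # "l" prints the line unambiguously: CR becomes \r, end of line becomes $
    sed -n l "$sample"
    rm -f "$sample"
}

demo_cr
```

This prints /bin/ssh\r$ where a clean line would print /bin/ssh$.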

Hello Scrutinizer,

Thanks. We have 3 SAP Application servers. The script is on only 1 of them. In my environment, the script is kicked off every time. However, it errors out and terminates 50% of the time, so this should rule out the possibility that SAP called the script on other App servers.

I appreciate your help.
J Phang

---------- Post updated at 03:10 AM ---------- Previous update was at 02:41 AM ----------

Hi methyl, Thank you. I checked the points you brought up.
1) That's a possibility. I'll need to investigate the number of open files during the day. It sounds logical because the script worked early in the morning before lots of people started arriving for work, it worked occasionally during the day, and then almost 100% of the time after 5 pm when most people had left.
2) Our backup software runs against a cloned copy of the disks via symclone. Also, the error says that the ssh binary isn't found rather than reporting something else, so this is less likely.
3) I've checked again and can confirm that /bin is directly under root and not mounted. Autofs is inactive. Also, if I run this script over an SSH session, it works 100% of the time.
4) I previously ran an equivalent command, "cat -vet", to check for much the same thing. The script also ran when I kicked it off interactively, so it shouldn't be a character issue.
Thanks for the suggestions. At least I know what else to check for tomorrow. Being in Asia means it's 3:10 am now. My apologies if I can't reply more quickly.
J Phang

This error can happen if a script is executed with a #! line that references a non-existent shell. Like this:

$
$ cat fubar
#! /no/such/file

exit 0
$
$
$ ./fubar
ksh: ./fubar: not found [No such file or directory]
$

./fubar is there. It's /no/such/file that is missing. Very misleading error message though.
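
A quick way to rule this out for any script is to read its first line and test the interpreter it names. This is only a sketch: check_shebang is a made-up helper, and it assumes sed supports the POSIX [[:blank:]] class.

```shell
#!/bin/sh
# Sketch: verify that the interpreter named on a script's #! line exists.
# check_shebang is a hypothetical helper name.
check_shebang() {
    script=$1
    first=`sed -n '1p' "$script"`
    case $first in
        '#!'*) ;;
        *) echo "$script: no #! line"; return 0 ;;
    esac
    # Strip "#!", any blanks after it, and any interpreter argument.
    interp=`printf '%s\n' "$first" | sed -e 's/^#![[:blank:]]*//' -e 's/[[:blank:]].*$//'`
    if [ -x "$interp" ]; then
        echo "$script: interpreter $interp is present"
    else
        echo "$script: interpreter $interp is missing or not executable"
        return 1
    fi
}
```

For the example above, check_shebang ./fubar would report that /no/such/file is missing, which is the real cause hidden behind the "not found" message.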

Could it be a problem with LIBPATH? Perhaps the loading of zlib or libssl is failing, which prevents the ssh binary from executing.
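
One way to test that idea is to have the script log LIBPATH and the libraries the binary asks for. A sketch only: log_libs is a made-up name, and it assumes an ldd command may or may not be installed, so it checks first and skips the listing if it is absent.

```shell
#!/bin/sh
# Sketch: print the library search path and, if ldd is available,
# the shared libraries a binary depends on.
# log_libs is a hypothetical helper name.
log_libs() {
    binary=$1
    echo "LIBPATH=${LIBPATH:-<unset>}"
    if type ldd >/dev/null 2>&1; then
        ldd "$binary" 2>&1
    else
        echo "ldd not available; skipping dependency listing"
    fi
    return 0
}
```

Appending the output of log_libs /bin/ssh to the log file on both a good and a bad run would show whether the environment differs between them.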

Hello,

Thanks for replying.

Initially, I thought it was a problem with my shell script, which was originally written in csh. I then converted it to Bourne shell. Both /bin/sh and /usr/bin/sh are there as files and not links.

The second hint was that the error tells me exactly which line in the script caused the "not found" error: the line where /bin/ssh was called. I did further tests by running "file /bin/ssh" within the script, and that too gave a problem (and again, only sometimes).

The main difference is that when I run the script in an interactive shell, there are no problems at all. The problem occurs only if the SAP program calls this script, and even then it fails about 30% of the time.

Regards
J Phang
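
Since the interactive run and the SAP run behave differently, it may help to capture what each one sees and diff the results. A minimal sketch, assuming a POSIX-style sh; snapshot_env and the output file names are invented for illustration.

```shell
#!/bin/sh
# Sketch: write a snapshot of the process context to a file so that an
# interactive run and a SAP-started run can be compared with diff.
# snapshot_env is a hypothetical helper name.
snapshot_env() {
    out=$1
    {
        echo "date: `date`"
        echo "user: `id`"
        echo "cwd: `pwd`"
        echo "shell pid: $$, parent pid: $PPID"
        echo "--- environment ---"
        env | sort
    } > "$out" 2>&1
}
```

For example, call snapshot_env /tmp/snap.interactive from a terminal run and snapshot_env /tmp/snap.sap.$$ at the top of the script when SAP starts it, then diff a failing snapshot against a working one.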

---------- Post updated at 07:10 PM ---------- Previous update was at 07:06 PM ----------

Hello,

Thanks. In this case, it didn't get as far as executing the binary. It went splat when the script looked for ssh in /bin/.

Thanks anyway
J Phang

Is SAP running in a chroot jail? If so, you may be looking at the wrong /bin/ssh.

Hi,

I will need to find out how to determine whether it is running in a chroot or not. Thanks. That hadn't occurred to me. Now to do some homework.

Cheers
J Phang

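Check the inode of / to detect chroot: "ls -di /". (Plain "ls -li /" lists the entries inside /, not / itself.) Compare the number seen in an interactive shell with the number seen when SAP runs the script.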
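
A small sketch of that check, meant to be dropped into the script so the value lands in the log. root_inode is a made-up name. It uses "ls -di /" so that the inode of / itself is reported. On many file systems the real root directory has inode 2, but that is not guaranteed, so the safer test is whether the number differs between an interactive run and a SAP-started run.

```shell
#!/bin/sh
# Sketch: print the inode number of / as seen by the current process.
# root_inode is a hypothetical helper name.
root_inode() {
    # -d lists the directory itself, -i prefixes the inode number
    ls -di / | awk '{ print $1 }'
}

echo "inode of /: `root_inode`"
```

If the SAP-started run logs a different number from the interactive run, the two are not looking at the same root directory.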