problem with spaces and argument parsing

[Preliminaries: the discussion below will refer to a Java class named HelloWorld. It's totally irrelevant to this posting what that class does, since the problem that I describe has to do with the shell, not Java. It's just that the problem arose for me while doing some coding, and I could not think of a more universal illustration.

Nevertheless, if you need something concrete in order to duplicate my actions, then the source file HelloWorld.java that I used contained the text

public class HelloWorld {
	public static void main(String[] args) {
		System.out.println("Welcome, master");
	}
}

and I compiled using
javac HelloWorld.java
]

Suppose that I execute the following command directly from the shell:
java -XX:OnError="gdb - %p" HelloWorld
Then it works perfectly fine, as expected.

Now suppose that I create the following shell script:

#!/bin/sh
echo java  -XX:OnError=\"gdb - %p\"  HelloWorld
java  -XX:OnError=\"gdb - %p\"  HelloWorld

The output now is:
java -XX:OnError="gdb - %p" HelloWorld
Unrecognized option: -
Could not create the Java virtual machine.

OK, the echo line outputs exactly what was previously executed directly from the shell, but the script's attempt to execute it fails. The error message hints that the
"gdb - %p"
is actually being interpreted as 3 different words for some strange reason, namely
"gdb
-
%p"

In fact, I think that I can confirm this by modifying the script to be

#!/bin/sh
echo java  -XX:OnError=\"gdb - %p\"  HelloWorld
set -x
java  -XX:OnError=\"gdb - %p\"  HelloWorld
set +x

which outputs
java -XX:OnError="gdb - %p" HelloWorld
+ java '-XX:OnError="gdb' - '%p"' HelloWorld
Unrecognized option: -
Could not create the Java virtual machine.
+ set +x
Notice how set -x puts the first pair of single quotes around just
-XX:OnError="gdb
and then the next pair around
%p"

So what is the shell script doing in its interpretation of the text that the shell command line does not do that is causing this to fail?

I guess that shell scripts are not exactly like saved command line sessions after all?

Could I work around this by using some sort of octal escape or something for the spaces inside "gdb - %p"?

The shell is interpreting

java  -XX:OnError=\"gdb - %p\"  HelloWorld

as the following list of arguments....

0: java
1: -XX:OnError="gdb
2: -
3: %p"
4: HelloWorld
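
To see that argument list without involving Java, here is a minimal sketch. show_args is a hypothetical helper, not something from the posts above; it prints each argument it receives in brackets. The first call uses the escaped quotes from the script, the second uses the plain quotes that were typed at the prompt.

```shell
#!/bin/sh
# show_args (hypothetical helper): print every argument in brackets, one per line.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# As in the script: \" is a literal quote character, so the spaces are
# unquoted and the shell splits there.
show_args java -XX:OnError=\"gdb - %p\" HelloWorld
# [java]
# [-XX:OnError="gdb]
# [-]
# [%p"]
# [HelloWorld]

# As typed at the prompt: the quotes group the words and are then removed.
show_args java -XX:OnError="gdb - %p" HelloWorld
# [java]
# [-XX:OnError=gdb - %p]
# [HelloWorld]
```

So the script and the prompt were not given the same text: the backslashes in the script turn the quotes into ordinary characters, and they no longer protect the spaces.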

The question is what do you actually want -XX:OnError to be?

Sorry if my initial post was unclear. I want
-XX:OnError="gdb - %p"
to be interpreted as a single argument, not as 3 separate arguments as is currently happening.

I should have mentioned that I tried guessing and putting single and double quote marks around it, all to no avail. This is a very frustrating aspect of sh files compared to conventional languages like Java; there are so many gotchas that you feel like you are standing on quicksand sometimes.

Try escaping the spaces that you want the shell to ignore....

-XX:OnError=\"gdb\ -\ %p\"

or quoting...

-XX:OnError="\"gdb - %p\""
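
A quick way to check what each form produces, as a sketch only: first_arg is a hypothetical helper that prints the argument count followed by the first argument. Both suggestions should arrive as one argument that still contains the quote characters; the plain quoted form is included for comparison.

```shell
#!/bin/sh
# first_arg (hypothetical helper): print "<argument count>: <first argument>".
first_arg() {
    printf '%d: %s\n' "$#" "$1"
}

first_arg -XX:OnError=\"gdb\ -\ %p\"    # escaped spaces, literal quotes kept
first_arg -XX:OnError="\"gdb - %p\""    # outer quotes group, inner \" stay literal
first_arg -XX:OnError="gdb - %p"        # plain quotes: grouped, then removed
# 1: -XX:OnError="gdb - %p"
# 1: -XX:OnError="gdb - %p"
# 1: -XX:OnError=gdb - %p
```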

Does that really escape the spaces? I thought that the backslash only worked for certain subsequent chars (e.g. another \, or an n, etc).

Regardless, I tried it and it does not work.

I also tried using the octal code \040 instead of the space, and the command now executes, but it uses the literal chars "\040", so if a Java error ever occurred it would fail at that point.

I had previously tried quotes like what you suggest, in both single and double quote versions, and both fail for me.

The \ escaping works for me on Solaris 9

C program compiled as a.out

#include <stdio.h>

int main(int argc,char **argv)
{
        int i=0;
        while (i < argc) { printf("argv[%d]=%s\n",i,argv[i]); i++; }

        return 0;
}

script...

#!/bin/sh

./a.out java "-XX:OnError=\"gdb - %p\"" HelloWorld

output

argv[0]=./a.out
argv[1]=java
argv[2]=-XX:OnError="gdb - %p"
argv[3]=HelloWorld

similarly...

#!/bin/sh -x

./a.out java -XX:OnError=\"gdb\ -\ %p\" HelloWorld

gives

argv[0]=./a.out
argv[1]=java
argv[2]=-XX:OnError="gdb - %p"
argv[3]=HelloWorld

Maybe this is a unix variant problem.

On cygwin, when I use

\"gdb\ -\ %p\"

I see a single space (" ") in echo's output. But the java command fails because the script is still interpreting the above line as 3 separate arguments instead of a single one, so the java command thinks that it is receiving a bogus option named "-".

On linux (2.6.9-022stab070.9-enterprise), the echo output is again correct, but the overall command fails due to a different error. Hmm...

I posted the last reply too soon.

If I go back to my original posting, and use this as an sh file:

#!/bin/sh
echo java  -XX:OnError=\"gdb\ -\ %p\"  HelloWorld
java  -XX:OnError=\"gdb\ -\ %p\"  HelloWorld

then everything works: the echo prints correctly, and the java command fully executes.

Great!

But what I really want in my script is for options like the gdb thing to be env vars that can be reused with multiple java invocations. The simplest version of what I really want my build script to look like is

#!/bin/sh
option=-XX:OnError=\"gdb - %p\"
echo java  $option  HelloWorld
java  $option  HelloWorld

This gives the output
t.sh: line 2: -: command not found
java HelloWorld
Welcome, master
It's clearly not fully correct because it is not using option.

If I change option's line to use escaped spaces

option=-XX:OnError=\"gdb\ -\ %p\"

the output is
java -XX:OnError="gdb - %p" HelloWorld
Unrecognized option: -
Could not create the Java virtual machine.
which is again problematic.

This, fortunately, is readily corrected by using quotes around option:

#!/bin/sh
option=-XX:OnError=\"gdb\ -\ %p\"
echo java  "$option"  HelloWorld
java  "$option"  HelloWorld

which actually fully works.
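
The difference between $option and "$option" can be counted directly. This is only a sketch: count_args is a hypothetical helper that prints how many arguments it was given. An unquoted expansion is split again on whitespace, and backslashes or quotes stored in the value are not interpreted a second time, while a quoted expansion always stays one word.

```shell
#!/bin/sh
# count_args (hypothetical helper): print the number of arguments received.
count_args() {
    printf '%d\n' "$#"
}

option=-XX:OnError=\"gdb\ -\ %p\"

count_args java $option HelloWorld      # unquoted: the value is split at its spaces
count_args java "$option" HelloWorld    # quoted: the value stays a single argument
# 5
# 3
```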

Unfortunately, what I really want is a more complicated script that would look like this

#!/bin/sh
errorHandling=-XX:OnError=\"gdb\ -\ %p\"
gcType="-XX:+UseParallelGC  -XX:+UseParallelOldGC"
standardJavaOptions="$errorHandling  $gcType"
echo java  $standardJavaOptions  HelloWorld
java  $standardJavaOptions  HelloWorld

Here, I want to build up top level env vars from smaller env vars. Unfortunately, the above script fails again with the usual error ("Unrecognized option: -"). It seems as if every $ substitution causes the quoting and space nightmares to reappear. Does this make it difficult or impossible to sanely build up top level env vars from smaller env vars?

The above results were obtained on both cygwin and linux.
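
One possible way around the re-splitting, which nobody in this thread has tried, is to keep the options as separate positional parameters instead of packing them into one string. The sketch below assumes a POSIX sh; launch and show_args are hypothetical names, and show_args stands in for the real java command so the result can be inspected.

```shell
#!/bin/sh
# show_args (hypothetical helper): print every argument in brackets, one per line.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

errorHandling=-XX:OnError=\"gdb\ -\ %p\"

# launch (hypothetical wrapper): put the standard options in front of the
# caller's arguments. Each option is its own positional parameter, and "$@"
# expands to exactly those parameters without any further splitting.
launch() {
    set -- "$errorHandling" -XX:+UseParallelGC -XX:+UseParallelOldGC "$@"
    show_args java "$@"    # drop show_args here to run java for real
}

launch HelloWorld
# [java]
# [-XX:OnError="gdb - %p"]
# [-XX:+UseParallelGC]
# [-XX:+UseParallelOldGC]
# [HelloWorld]
```

The set -- inside the function only changes that function's own parameters, so launch can be called as often as needed with different class names.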

Alas yes, an option is to write a small utility shell script which will re-escape its input...
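
A rough sketch of what such a re-escaping helper might look like, written as a shell function rather than a separate script. The names quote and show_args are hypothetical. The idea is to wrap the value in single quotes so that it survives being pasted into a longer string, and then to have eval parse that string once more as a command line.

```shell
#!/bin/sh
# show_args (hypothetical helper): print every argument in brackets, one per line.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# quote (hypothetical helper): wrap $1 in single quotes, rewriting every
# embedded ' as '\'' so that eval later reads the result back as one word.
# Trailing newlines in $1 are lost to the command substitution.
quote() {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

errorHandling=$(quote '-XX:OnError="gdb - %p"')
gcType="-XX:+UseParallelGC -XX:+UseParallelOldGC"
standardJavaOptions="$errorHandling $gcType"

# eval re-parses the string, so the single quotes added by quote take effect.
eval "show_args java $standardJavaOptions HelloWorld"
# [java]
# [-XX:OnError="gdb - %p"]
# [-XX:+UseParallelGC]
# [-XX:+UseParallelOldGC]
# [HelloWorld]
```

Be careful with eval: anything in the string that has not been passed through quote is interpreted by the shell again, so this is only reasonable for values the script itself controls.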

Also you may want to try using "make" rather than shell to do this and see if its substitution rules work better.