Application behaving in 3 different ways on 3 different machines

Hello.
During the holidays I've been developing an application on my desktop computer at home.
I set up a repository on GitHub, so when I got back to work I cloned the repo to my laptop.

It wouldn't work.
The app consists of a client and a server. Strangely enough, the server would segfault at a strcpy right at the beginning, while the client would complain about a missing command-line parameter (it's supposed to work without one).

So I SSHed into an office machine we use for testing, cloned the repo, and the problems are reversed!
Now it's the client that segfaults, while the server demands a parameter!

My machines are:
desktop - i7 2600K with Ubuntu 12.04 x64 (English)
laptop - Core 2 Duo with Ubuntu 12.04 x64 (English)
test pc - Core 2 Duo with Ubuntu 10.04 (Italian; not sure whether x86 or x64)

Now, I could accept that an app developed on a single PC might need some tinkering on other PCs, but the same app showing exactly opposite behaviour on two different PCs is something I can't understand.

Anyway, the specific code that seems to be the problem is the following:

struct arguments
{
  int *Z_DEBUG, *M_DEBUG;
  char * interf;
  char * outfile;            /* Argument for -o */
};

int main(int argc, char** argv) {

    struct arguments arguments;
    outstream = stdout;
    arguments.M_DEBUG=&MAIN_DEBUG;
    arguments.Z_DEBUG=&ZMQ_DEBUG;
    strcpy( arguments.interf, "eth0" );   /* <-- this is the strcpy that segfaults on two of the machines */
    arguments.outfile = NULL;
    s_catch_signals();

    argp_parse(&argp, argc, argv, 0, 0, &arguments);

Either I get a segfault at the strcpy, or somehow argp_parse exits the program.

I'm not expert enough to understand why declaring the following is correct:

char *mystring="useless phrase";

while this is wrong:

char *mystring;
strcpy(mystring, "useless phrase");

And even more so I can't understand why, if it's wrong, it would work on my desktop computer!

Any help is really appreciated.

The fact that your code ever worked is an accident. You have what is called undefined behavior:

  char * interf;
  char * outfile;

These are pointers to character strings, but they have NO memory allocated for the strings they point to.

Change them like this (pick a size; I chose 256):

  char interf[256];
  char outfile[256];
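
If you go the array route, the original strcpy becomes legal, but note one knock-on change: an array can't be assigned NULL, so the outfile line has to change too. A minimal sketch (struct layout taken from your post; 256 is just the example size from above):

struct arguments
{
  int *Z_DEBUG, *M_DEBUG;
  char interf[256];          /* writable storage now lives inside the struct */
  char outfile[256];         /* Argument for -o */
};

/* inside main() */
strcpy(arguments.interf, "eth0");   /* safe: copies into the 256-byte array */
arguments.outfile[0] = '\0';        /* was: arguments.outfile = NULL; */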

Or call

malloc()

to allocate memory for each one in main():

  arguments.interf=malloc(256);
  arguments.outfile=malloc(256);
  if(arguments.interf==NULL || arguments.outfile==NULL)
  {
      perror("Cannot allocate memory");
      exit(1);
  }
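
With this version the original strcpy(arguments.interf, "eth0") is also fine, since interf now points at 256 writable bytes. Just remember to release the buffers once you're done with them (assuming nothing else still holds the pointers):

  free(arguments.interf);
  free(arguments.outfile);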

Your code did not work; it just happened not to crash. Because those pointers are never initialized, they could be pointing to any bit of memory left over from a previous program. If you rebooted one of the PCs you might get different behavior. It is effectively random, so it may not be something you can duplicate.

Thank you for your time.

I had begun to suspect as much, but it is still unclear to me how it kept working for more than 10 days without so much as one crash or problem...

You are using undefined behavior.

Undefined does not mean 'crash instantly', it means 'what this does depends on the machine'.

So, it's to be expected that it will be unpredictable.

This code ought to work without changing your structures, by the way:

arguments.interf="eth0";
arguments.outfile=NULL;

...because it's not trying to copy into nonexistent memory. "eth0" is itself already a pointer to valid memory, so no strcpy is needed.
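
One caveat, though: the literal's memory is valid but typically read-only, so nothing may ever write through interf afterwards. If you go this route, declaring the member const documents that (a sketch; it assumes nothing in your code modifies interf in place):

  const char * interf;              /* member now points at literals, never written through */

  /* ... later, in main() ... */
  arguments.interf = "eth0";        /* fine: just stores the address of the literal */
  /* arguments.interf[0] = 'E'; */  /* writing to a literal is undefined behavior;
                                       with const the compiler rejects it outright */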

There is a set of standards for C. They dictate what will or will not happen in the language.

Doing what you did created something that has undefined behavior. I'll make one pass at explaining it.
When you run a C program:

1 - The OS creates a stack frame for main.
2 - The OS simply overlays that stack on top of existing garbage in memory.
3 - It does this for efficiency reasons, and because that memory is no longer part of any process.
4 - When your program ran, those pointers were parked on top of memory that had some existing values in it.
5 - What was in the memory depends on the program that lived in that exact memory before.
6 - It could be all zeroes; it could literally be anything.
7 - Since it could be anything, let's pretend the memory pointed to 0xfaaa0000.
8 - 0xfaaa0000 just HAPPENED, by pure chance, to be an OS-allocated location in your existing stack frame.
9 - Now we can use that memory for our program - no crash.
10 - Why? Because the memory is part of the process, so you can do what you want to it.
11 - What if 0xfaaa0000 was NOT part of allocated memory? Boom, program crash.
12 - Therefore there is no known way to predict the behavior of the code; it is undefined (a small sketch follows below).
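
A tiny sketch of that memory roulette (the peek itself is undefined behavior, so compilers will rightly warn about it; the point is only that the leftover value depends on whatever used the memory first):

#include <stdio.h>

static void leave_something_behind(void)
{
    char buf[32] = "stale frame contents";          /* dies with this stack frame */
    printf("first call wrote: %s\n", buf);
}

static void peek_at_stack(void)
{
    char *p;                                        /* never initialized */
    printf("p happens to hold: %p\n", (void *)p);   /* garbage: maybe a valid address, maybe not */
}

int main(void)
{
    leave_something_behind();
    peek_at_stack();    /* p may echo the earlier frame's bytes - or anything else */
    return 0;
}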

I will disagree VERY slightly with Jim concerning #3 in the above list. On any UNIX system, data allocated to one process will never be given to another process for use as the stack of a new process. Doing so would create a security hole (covert channel).

Code may be shared between processes (using shared libraries); data may be shared using shared libraries, shared memory segments, mmap()ed files, etc. But address space that will be used as data (including the stack) that is not explicitly shared, will be cleared by the OS before handing it to any user-level process.

There is a lot of code run by a process when a C program starts executing before you get to the first line of code in main(). Shared libraries have to be linked in; the locale has to be initialized; the STDIO stdin, stdout, and stderr streams have to be initialized; etc. Any of these can leave random data in what will eventually become the stack frame allocated to main(), and some of them may leave different things on the stack depending on the time/date when the program was run, the version of the OS or shared libraries being used, etc.

So, the end result is the same. Uninitialized data on the stack of main() can vary from run to run. And if you have other uninitialized pointers being used by argp_parse() or s_catch_signals(), they may also be overwriting anything in your address space and may get segmentation faults on some future execution of your code.

That is not what #3 is meant to say: that memory WAS part of a now-extinct process, which is why you cannot know the contents of memory beforehand.

Memory that came from an extinct process gets blanked before anything else gets it.

What may differ is the arrangement of the stack frame, and the like. With the compiler options you have these days (stack protectors, etc.) it can vary quite a bit.


I knew about undefined memory areas.
What confuses me is what happens with an implicit definition like

char * mystring="some text";

So, as Corona said, somehow my pointers should already have been initialized to some available memory area.
And this has consistently been the case on one machine,
while it consistently wasn't on the other two.

Um, no. I was telling you that what you were doing is wrong, not that what you were doing ought to work.

Let me put it this way.

"some text" becomes valid memory when compiled. That text has to get into the program somehow, right? (It's unwritable memory, fyi, so if you try and edit it you'll get a surprise.) It's not an implicit strcpy. The memory already exists and will exist for the entire duration of the program. You do not need to copy the string to have it.

The pointer itself, though? Until you make it point to something valid, it can point to valid or invalid memory, depending wholly on chance and undefined things.

So let's compare your two methods of assignment:

char *mysterymemory;
// mysterymemory points to god-knows-what when the program starts.
// Let's play memory roulette and write to god-knows-what!
// This might work, or might write "mystring" right into the middle of my return frame,
// or even crash immediately.
strcpy(mysterymemory, "mystring");

// This pointer does not point to god-knows-what.
// This pointer points to the valid memory that "mystring" exists inside.
// The bytes "mystring" don't have to be written anywhere.
// This is nothing like strcpy.
// We are just pointing to memory that ALREADY EXISTS.
char *validmemory = "mystring";
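
One follow-up on the design choice: pointing at a literal is only enough when the string never changes. If the value may be overwritten later (say, an interface name taken from argv), it needs writable storage of its own; a sketch using POSIX strdup(), which allocates and copies in one step:

#include <string.h>   /* strdup (POSIX) */
#include <stdlib.h>   /* free */

char *validcopy = strdup("mystring");   /* writable, heap-allocated copy of the literal */
if (validcopy != NULL) {
    validcopy[0] = 'M';                 /* fine: this copy is ours to modify */
    free(validcopy);                    /* and ours to release */
}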