dlclose crashing in 64bit

Hi

I have a 64-bit C++ dynamic component built with the Sun Forte compiler (CC) on one server.

I open this shared component using dlopen and check whether a particular function is defined. After that, when I close the component using dlclose, the program crashes.
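For reference, the open / look up / close sequence described above might look roughly like the sketch below. The function name has_symbol is hypothetical and not from the original program; only dlopen, dlsym, dlerror and dlclose are the real API. It prints the handle right after dlopen and again just before dlclose, so the two values can be compared. On some systems the program has to be linked with -ldl.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>

/* Hypothetical helper, not from the original program.
 * Returns 1 if `symbol` is defined in the library at `path`, 0 if it is not,
 * and -1 if the library could not be opened or closed.
 * Passing path == NULL asks dlopen for the main program itself. */
int has_symbol(const char *path, const char *symbol)
{
    assert(symbol != NULL);

    void *handle = dlopen(path, RTLD_LAZY);
    if (handle == NULL) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return -1;
    }
    fprintf(stderr, "after dlopen, handle is: %p\n", handle);

    dlerror();                      /* clear any stale error state */
    (void)dlsym(handle, symbol);    /* a NULL result alone is ambiguous... */
    int found = (dlerror() == NULL);/* ...so rely on dlerror() instead */

    fprintf(stderr, "calling dlclose, handle is: %p\n", handle);
    if (dlclose(handle) != 0) {
        fprintf(stderr, "dlclose failed: %s\n", dlerror());
        return -1;
    }
    return found;
}
```

If the two printed values differ, something changed the handle between the calls. If they match and dlclose still crashes, the damage is more likely somewhere else in memory.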

The crash occurs only on some systems, specifically when I run the program on a system other than the one where the shared object was built.

When I do the dlclose on the system on which the shared object was built, the application does not crash.

I get the following error message:

Fault signal: SIGBUS (10), invalid address alignment.
At instruction address 0xffffffff7e64f4a8, faulting access address is 0x4e5352532eb264cc
Symbolic location: "realloc + 0x474 [/usr/lib/sparcv9/libc.so.1]"

------ Concluding frames leaf call (n=1) ------
# 0 0x7e64f4a8 realloc + 0x474 [/usr/lib/sparcv9/libc.so.1]
[0x1,0x1f3620,0x1,0x1f3620,0x1,....]
# 1 0x7e64f444 realloc + 0x410 [/usr/lib/sparcv9/libc.so.1]

Thanks in advance

As a guess, I would say the void *handle in dlclose(handle) is corrupted - not pointing to a valid address.

Try setting a breakpoint, or placing a print statement, just before the dlclose:

printf ("void *handle= %p\n",handle);

If you add a similar print at the other place(s) where your code touches the handle (for example, right after dlopen), you can see whether its value has changed.
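One way to automate that comparison is to save a copy of the handle at open time and check it just before dlclose. This is only a sketch: remember_handle and handle_unchanged are hypothetical names, and it assumes a single handle is being tracked.

```c
#include <assert.h>
#include <stdio.h>

/* Copy of the handle taken right after dlopen (hypothetical helper state). */
static void *saved_handle;

/* Call immediately after dlopen succeeds. */
void remember_handle(void *handle)
{
    assert(handle != NULL);
    saved_handle = handle;
    fprintf(stderr, "saved handle: %p\n", handle);
}

/* Call just before dlclose. Returns 1 if the handle still has the
 * value recorded by remember_handle, 0 if it has changed. */
int handle_unchanged(void *handle)
{
    fprintf(stderr, "handle now: %p, saved: %p\n", handle, saved_handle);
    return handle == saved_handle;
}
```

A matching value does not prove the memory behind the handle is intact. It only rules out the pointer variable itself having been overwritten.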

Hi,

I printed the value of the handle. The result is given below:

Calling dlclose , handle is : 0x7d4014c8

Fault signal: SIGBUS (10), invalid address alignment.
At instruction address 0xffffffff7e64f4a8, faulting access address is 0x4e5352532eb267cc
Symbolic location: "realloc + 0x474 [/usr/lib/sparcv9/libc.so.1]"

------ Concluding frames leaf call (n=1) ------
# 0 0x7e64f4a8 realloc + 0x474 [/usr/lib/sparcv9/libc.so.1]
[0x1,0x1f3920,0x1,0x1f3920,0x1,....]
# 1 0x7e64f444 realloc + 0x410 [/usr/lib/sparcv9/libc.so.1]

How do I find out whether the handle is corrupted or not? Also, the program runs fine on some systems; it crashes on only a few.

If the systems are different hardware platforms, then that's possible. SIGBUS usually means an alignment problem: instead of a pointer being aligned on a longword (or page) boundary, it's off by a few bytes. Or, more likely, the pointer itself holds a bogus address.
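As a quick check of the numbers in the trace, the sketch below tests an address for alignment and shows its bytes as text. Both helper names are hypothetical, and the 8-byte requirement is an assumption based on a 64-bit load on SPARC V9. The faulting address 0x4e5352532eb264cc is 4-byte aligned but not 8-byte aligned, while the handle 0x7d4014c8 is 8-byte aligned. The top four bytes of the faulting address also happen to be printable ASCII ("NSRS"), which may hint that string data was written over a pointer, though that is only a guess.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Returns 1 if addr is a multiple of alignment (which must be a power of two). */
int is_aligned(uint64_t addr, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr & (alignment - 1)) == 0;
}

/* Writes the 8 bytes of addr, most significant first, into out as text.
 * Bytes that are not printable ASCII are shown as '.'. */
void address_as_text(uint64_t addr, char out[9])
{
    assert(out != NULL);
    for (int i = 0; i < 8; i++) {
        unsigned char byte = (unsigned char)(addr >> (56 - 8 * i));
        out[i] = (byte >= 0x20 && byte < 0x7f) ? (char)byte : '.';
    }
    out[8] = '\0';
}
```

Calling address_as_text on each faulting address from a crash report is a cheap way to spot pointers that look like overwritten text.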

The reason I suggested corruption (or something changing the value) is that the handle obviously worked earlier in the code: dlclose is only called after you're done with the library, so the pointer "worked" up to that point.

I'm guessing you are probably overwriting the stack somewhere earlier in the code.

Now, if you suspect a compilation issue, then compile locally on the box that crashes and do a test run. If the problems go away, then you have runtime library differences from box to box that are trashing the pointer.

See the post by driver about tracking down mysterious corruption: