Memory fault(Coredump)

Hi,

In my application we have a job that processes files. The job is failing with a memory fault, either while processing a file or while shutting down. Sometimes it generates a core dump and sometimes not. When I analysed the core dump, I found that it failed in the code snippet below:

if ( addlUsage.pParamVal != NULL )
      {
                free (addlUsage.pParamVal);
                addlUsage.pParamVal = NULL;
      }

It failed while trying to free memory 'free (addlUsage.pParamVal);'.

When I inspected the value of addlUsage.pParamVal, I found one junk character at the end. So I added a \0 at the end after allocating the memory for it (o_pADUbuffer in the code below is addlUsage) and then copied the required value to addlUsage.pParamVal. Please see below:

lenOfVarADU = sizeOfAddlUsage - sizeOfFixedAddlUsage;
o_pADUbuffer->pParamVal = (char *)malloc((sizeOfAddlUsage - sizeOfFixedAddlUsage) + 1);
o_pADUbuffer->pParamVal[lenOfVarADU] = '\0';
 
memcpy(o_pADUbuffer->pParamVal, temp_buf + inputBuffPosIndx,
                   (sizeOfAddlUsage - sizeOfFixedAddlUsage));

But I am still getting the same issue.

Please help....Thanks in advance.

No.

Your code has done pointer arithmetic on the pointer you are trying to free.
malloc() typically keeps its bookkeeping data right next to the block, just before the address it returns. If the pointer handed to free() is not exactly the one malloc() returned, free() reads that bookkeeping from the wrong place and ends up working on memory the process does not own. Boom.
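A minimal sketch of how to stay out of that trap, assuming you need to walk through the buffer at all: keep the pointer malloc() returned in one field that never changes, and do the arithmetic on a separate cursor. The ParamBuf type and the parambuf_* helpers are made-up names for illustration, not anything from the code posted above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: 'base' is exactly what malloc() returned and is the
   only value ever passed to free(); 'cursor' is the one allowed to move. */
typedef struct {
    char  *base;
    char  *cursor;
    size_t len;     /* payload length, not counting the terminator */
} ParamBuf;

/* Copy len bytes from src and terminate them. Returns 0 on success,
   -1 if the allocation failed. */
static int parambuf_init(ParamBuf *pb, const char *src, size_t len)
{
    pb->base = malloc(len + 1);
    if (pb->base == NULL)
        return -1;
    memcpy(pb->base, src, len);
    pb->base[len] = '\0';
    pb->cursor = pb->base;
    pb->len = len;
    return 0;
}

/* Advance the cursor by up to n bytes, clamped to the end of the payload.
   'base' is never modified. */
static void parambuf_skip(ParamBuf *pb, size_t n)
{
    size_t used;
    assert(pb->base != NULL);
    used = (size_t)(pb->cursor - pb->base);
    if (n > pb->len - used)
        n = pb->len - used;
    pb->cursor += n;
}

/* Release the block through the original pointer and clear the record. */
static void parambuf_free(ParamBuf *pb)
{
    free(pb->base);
    pb->base = NULL;
    pb->cursor = NULL;
    pb->len = 0;
}
```

However far parambuf_skip() moves the cursor, parambuf_free() still hands free() the address malloc() returned.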

Use Electric Fence or valgrind to find where. These are available for a lot of platforms.

Failing that look at driver's code here:

Actually, this problem occurs in the production environment and is not reproducible in the test environment. I cannot debug the code in production, and since the problem cannot be reproduced in the test environment, there is no use debugging it there. What can be done in this case?

---------- Post updated 07-06-12 at 12:21 AM ---------- Previous update was 07-05-12 at 11:26 PM ----------

One more thing: we upgraded our system a few months ago, and this issue has only been occurring since then. Before that everything was fine. I compared the code before and after the upgrade and there are only minor changes. Also, this issue occurs on the HP-UX machine and not on the Solaris machine.

Yes. Abusing memory like this means your code will crash inexplicably in some places but not others, in debug builds but not release ones, and so on, because whether the memory you're accidentally trashing is
a) valid
b) occupied by anything that will cause the program to crash later

...is system-dependent, compiler-dependent, and environment-dependent.

That doesn't mean they don't all have the bug. They're all trashing memory they're not supposed to.
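If the usual tools can't be run where the crash happens, one option is to make the trashing visible yourself. What follows is only a rough sketch under stated assumptions, not a replacement for Electric Fence or valgrind: the guarded_* functions are hypothetical wrappers, meant for byte buffers only, that put known guard bytes on both sides of each allocation and verify them at free time. A write one byte past the requested size then shows up as a failed check instead of a crash somewhere else later.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define HEAD_LEN   ((size_t)16)  /* stored size, then guard bytes */
#define TAIL_LEN   ((size_t)8)   /* guard bytes after the user area */
#define GUARD_BYTE 0xA5

/* Block layout: [size | head guard][n user bytes][tail guard].
   Returns a pointer to the user bytes, or NULL on failure. */
static void *guarded_malloc(size_t n)
{
    unsigned char *raw;
    assert(sizeof n <= HEAD_LEN);
    if (n > (size_t)-1 - HEAD_LEN - TAIL_LEN)
        return NULL;                       /* total size would overflow */
    raw = malloc(HEAD_LEN + n + TAIL_LEN);
    if (raw == NULL)
        return NULL;
    memset(raw, GUARD_BYTE, HEAD_LEN);
    memcpy(raw, &n, sizeof n);             /* remember the requested size */
    memset(raw + HEAD_LEN + n, GUARD_BYTE, TAIL_LEN);
    return raw + HEAD_LEN;
}

/* Returns 1 if both guards are intact, 0 if something wrote outside
   the user area. p must have come from guarded_malloc(). */
static int guarded_check(const void *p)
{
    const unsigned char *user = p;
    const unsigned char *raw = user - HEAD_LEN;
    size_t n, i;
    memcpy(&n, raw, sizeof n);
    for (i = sizeof n; i < HEAD_LEN; i++)  /* head guard is checked first */
        if (raw[i] != GUARD_BYTE)
            return 0;
    for (i = 0; i < TAIL_LEN; i++)
        if (user[n + i] != GUARD_BYTE)
            return 0;
    return 1;
}

/* Checks the guards, then frees through the address malloc() returned.
   Returns the result of the check (1 for a NULL pointer). */
static int guarded_free(void *p)
{
    int ok;
    if (p == NULL)
        return 1;
    ok = guarded_check(p);
    free((unsigned char *)p - HEAD_LEN);
    return ok;
}

/* Simulates an off-by-one terminator: 4 bytes requested, 5 written.
   Returns 1 if the overrun was detected, -1 if allocation failed. */
static int demo_overrun_detected(void)
{
    char *p = guarded_malloc(4);
    if (p == NULL)
        return -1;
    memcpy(p, "abcd", 4);
    p[4] = '\0';              /* lands in the tail guard */
    return guarded_free(p) == 0;
}
```

Logging the failed check with the allocation site would say which buffer is being overrun. The sketch only catches writes that land in the guard bytes, and it trusts the stored size, so it narrows the search rather than proving the code clean.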