SIGSEGV when allocating a certain size

The problem: I need to work with large arrays, and after one of my structures grew in size my program started getting a segmentation fault.

My code where I allocate the arrays:

static  R1         *tarr;
static  R2         *rarr;

int proc_init_mem(void)
{
  const int     t_sz = sizeof(R1) * MAX_R1;
  const int     r_sz = sizeof(R2) * MAX_R2;

        tarr = malloc(t_sz);
        rarr = malloc(r_sz);
        if(tarr == NULL || rarr == NULL)
                return(-1);
        printf("tarr sz: %i\n", t_sz);
        printf("rarr sz: %i\n", r_sz);
        return(0);
}

When I run the program, I am getting the printouts:

tarr sz: 11280000
rarr sz: 20200000

and then the program dies, according to the debugger, somewhere in fgetc (libc) as it reads in the config params. If I decrease MAX_R1 or MAX_R2, everything is fine.

I am not exactly clear on which resource limitation I am hitting.
This is Ubuntu 8 with gcc 4.2.4; the program is mostly in C, with the addition of C++ libraries.

The rlimit parameters are as follows:

 0 -               -1               -1 per proc. CPU limit
 1 -         16777216         16777216 largest file created
 2 -               -1               -1 max sz of data segment
 3 -         33546240         33546240 max size of stack seg
 4 -               -1               -1 largest core sz
 5 -               -1               -1 largest resident set sz (swapping related)
 6 -             8191             8191 number of processes
 7 -             1024             1024 number of open files
 8 -            32768            32768 locked-in mem addr space
 9 -               -1               -1 addr space limit
10 -               -1               -1 max file locks
11 -             8191             8191 max number of pending signals
12 -           819200           819200 max bytes per msg queue
13 -                0                0 nice priority
14 -                0                0 max realtime priority

(I just ran getrlimit in a loop from 0 through 14 to produce this list and added the annotations by hand.)
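Roughly, the loop looks like this (a minimal sketch; the resource numbering is Linux/glibc-specific and the annotations above are manual):

#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
        struct rlimit   rl;
        int             i;

        /* query each resource number in turn and print its
         * soft and hard limits; RLIM_INFINITY shows up as -1 */
        for(i = 0; i <= 14; i++) {
                if(getrlimit(i, &rl) != 0)
                        continue;
                printf("%2d - %16ld %16ld\n", i,
                        (long)rl.rlim_cur, (long)rl.rlim_max);
        }
        return(0);
}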

I tried running ulimit -s 32760 and then executing my program, but it did not help. BTW, 32760 was the biggest value ulimit would accept. I also tried running as root, hoping that root would not be subject to the limitation, but the same SIGSEGV happened.

Does anyone know how to deal with this type of problem? Any insight will be appreciated.

Is your program 32-bit or 64-bit? You are approaching 32-bit address space limitations.

It is 32-bit - I base this on the fact that sizeof(int) is 4.

When you say I am approaching the 32-bit address space limit, how so? The printed sizes of those arrays total 31,480,000 bytes, which is roughly 32 MB. Am I wrong?

sizeof(int) is often 4 on a 64-bit system, too. sizeof(long) may be a better indicator, but not always.
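The pointer size tracks the address-space width more directly; a quick check (just a sketch) is:

#include <stdio.h>

int main(void)
{
        /* sizeof(void *) reflects the pointer/address width directly;
         * sizeof(int) is 4 on most 32-bit and 64-bit Linux targets alike */
        printf("int:   %u\n", (unsigned)sizeof(int));
        printf("long:  %u\n", (unsigned)sizeof(long));
        printf("void*: %u\n", (unsigned)sizeof(void *));
        return(0);
}

Running file on the compiled binary also reports whether it is a 32-bit or 64-bit ELF.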

You are correct, I misread.

There doesn't appear to be anything wrong with your memory allocation, then. There must be a bug somewhere else in your code.

For now I have declared one of the arrays statically:

static R1 tarr[MAX_R1];

and kept the second array dynamically allocated with malloc, and the program works.

Any suggestions on how to find the root cause would be much appreciated.

I can't see your computer from here. Please post your code.

Almost certainly it's overrunning the end of the array or otherwise corrupting memory in ways that don't always crash the program.

malloc implementations often keep memory descriptors alongside each allocation (for example, the first long word of a malloc-ed space in the heap holding the length of the space allocated, just before the returned starting address). When you write past the end of the preceding object in memory, you obliterate those memory descriptors, and the result is undefined - usually a crash or badly trashed data.
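As an illustration (deliberately broken code; the exact symptom depends on the allocator):

#include <stdlib.h>
#include <string.h>

int main(void)
{
        char    *a = malloc(16);
        char    *b = malloc(16);

        if(a == NULL || b == NULL)
                return(1);

        /* write 32 bytes into a 16-byte block: the overrun tramples
         * whatever bookkeeping the allocator keeps around the blocks
         * (and possibly the start of b); nothing crashes here...    */
        memset(a, 'X', 32);

        /* ...but a later free/malloc that walks the damaged metadata
         * can blow up far away from the line that did the damage     */
        free(b);
        free(a);
        return(0);
}

If valgrind is available on that box, running the program under it usually points straight at the offending write; glibc's MALLOC_CHECK_ environment variable can also make this kind of corruption fail earlier and louder.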


This is older, but your problem may be amenable to library interposition (LD_PRELOAD, for example):
Debugging and Performance Tuning with Library Interposers
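A minimal sketch of the idea on Linux (the file name and logging format are my own; a real interposer also has to worry about re-entrancy, since dlsym itself may allocate on some glibc versions):

/* malloc_spy.c - illustrative malloc interposer
 * build: gcc -shared -fPIC -o malloc_spy.so malloc_spy.c -ldl
 * run:   LD_PRELOAD=./malloc_spy.so ./your_program
 */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

void *malloc(size_t sz)
{
        static void *(*real_malloc)(size_t) = NULL;
        void    *p;
        char    msg[64];
        int     n;

        if(real_malloc == NULL)
                real_malloc = (void *(*)(size_t))dlsym(RTLD_NEXT, "malloc");
        if(real_malloc == NULL)
                return(NULL);

        p = real_malloc(sz);

        /* log with write(2), not printf, to avoid re-entering malloc */
        n = snprintf(msg, sizeof(msg), "malloc(%lu) = %p\n",
                        (unsigned long)sz, p);
        if(n > 0)
                write(2, msg, (size_t)n);
        return(p);
}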

Thank you for the explanation. I think I pinpointed the culprit code, which happens to be part of the config file parsing.

void trunc_rem_nl(char *buf)    /* remove # remarks and EOLs */
{
  char  *p = NULL;

        if(!buf)
                return;
        if(!*buf)
                return;
        if((p = strchr(buf, '#')) != NULL)
                *p = 0;
        if(!*buf)               /* this check was the missing piece */
                return;
        if(buf[strlen(buf) - 1] == '\n')
                buf[strlen(buf) - 1] = 0;
}

The second if(!*buf) return; check (marked above) was not there originally, so for a line that starts with a # comment the code tried to access buf[-1]. How peculiar that the error lingered there and never caused trouble until my stack shifted as a result of the arrays growing in size.

BTW, the segfault happens on the subsequent fgetc after the line starting with the # hash mark.
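Stripped down, the old code path did essentially this (a sketch of my own; the fixed-size buffer just stands in for the real line buffer):

#include <string.h>

int main(void)
{
        char    line[64];
        char    *p;

        strcpy(line, "# just a remark\n");      /* a comment-only config line */

        /* truncate at '#' as before */
        if((p = strchr(line, '#')) != NULL)
                *p = 0;                         /* strlen(line) is now 0 */

        /* trim the newline without checking for an empty string:
         * strlen(line) - 1 underflows, so this reads (and possibly
         * writes) the byte just in front of the buffer, i.e. line[-1] */
        if(line[strlen(line) - 1] == '\n')
                line[strlen(line) - 1] = 0;

        return(0);
}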

Interesting.

Negative array indexes aren't wrong per se, since you're allowed to do things like:

int array[5] = {1, 2, 3, 4, 5};
int *x = array + 3;
x[-1] = 2;                  /* x points into the middle of array, so x[-1] is array[2] */
printf("%d\n", array[2]);   /* prints 2 */

It's certainly not encouraged though.

Nothing to do with your stack though -- malloc() doesn't come from there.

Making the malloc'd block larger may have ended up putting it in a different part of the heap, so the crashing code was more likely to hit invalid memory than to harmlessly nudge the end of your big array when loading the config file. Or it could have been something else entirely. Difficult to tell.