Segmentation fault when I pass a char pointer to a function in C.

jose_spain · April 21, 2018, 9:21am

I am passing a char* to the function "reverse" and when I execute it with gdb I get:

Program received signal SIGSEGV, Segmentation fault.

0x000000000040083b in reverse (s=0x400b2b "hello") at pointersExample.c:72
72        *q = *p;

Attached is the source code.

I do not understand why this error occurs.

Why do "modifyStruct" and "modifyString" work correctly, but "reverse" does not?

SOURCE CODE:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>


struct book
{
  char* title;
  int   pages;
  float price;
};


void modifyInt (int *li)
{
  *li *= 2;
}


void modifyStruct (struct book* lb)
{
  char* newTitle = "The best book";

  free(lb->title);
  lb->title = malloc((strlen(newTitle) + 1) * sizeof(char));
  strcpy(lb->title, newTitle);

  lb->pages=350;
  lb->price=20.25;
}


void modifyString (char* lw)
{
  int i;
  int strLen = strlen(lw);

  for (i=0; i<strLen; i++)
    if( (*(lw+i) >= 97) && (*(lw+i) <= 122) )
      *(lw+i) -= 32; // Converts letters to upper case.
}


void reverse (char* s)
{
  char* aux;
  char* p;
  char* q;

  p = s;
  int len = strlen(s);
  aux = (char*) malloc ((len + 1) * sizeof(char));

  printf("s=%s\n", s);

  printf("strlen(s)=%d\n", len); 
  *(aux+len) = 0; // End string mark.
  len--; 
  while (*p != 0)
  {
    *(aux+len) = *p;
    len--;
    p++;
  }
  printf("aux=%s\n", aux);

  // Copy aux to s.
  p = aux;
  q = s;
  while (*p != 0)
  {
    *q = *p;
    p++;
    q++;
  }

  printf("aux=%s\n", aux);
  printf("s=%s\n", s);
}


int main()
{
  int iv = 10;
  struct book *mynewbook;
  int len;
  char* words1;
  char* initial_words = "hello_pleased_to_met_you" ;
  char* tmp = "hello";

  printf ("\niv = %d\n", iv);
  modifyInt(&iv);
  printf ("After calling modifyInt, iv = %d\n\n", iv);

  mynewbook = (struct book*)malloc (sizeof (struct book));
  if (mynewbook == NULL)
    return -1;

  len = strlen("Unknown"); 
  mynewbook->title = (char*) malloc ((len + 1) * sizeof(char));
  strcpy(mynewbook->title, "Unknown");
  mynewbook->pages=0;
  mynewbook->price=0.0;
  printf ("mynewbook: title = %s, pages=%d, price=%f\n",
           mynewbook->title, mynewbook->pages, mynewbook->price); 
  modifyStruct(mynewbook);
  printf ("After calling modifyStruct: mynewbook: title = %s, pages=%d, price=%f\n\n",
            mynewbook->title, mynewbook->pages, mynewbook->price); 

  len = strlen(initial_words); 
  words1 = (char*) malloc ((len + 1) * sizeof(char));
  strcpy(words1, initial_words);
  printf ("words1 = %s\n", words1);
  modifyString(words1);
  printf ("After calling modifyString: words1 = %s\n\n", words1);

  printf("tmp=%s\n" , tmp);
  reverse(tmp);
  printf("After calling reverse: tmp=%s\n\n" , tmp);

  return 0;
}

jim_mcnamara · April 21, 2018, 9:37am

while (*p != 0)
  {
    *(aux+len) = *p;
    len--;
    p++;
  }

You never set p back to the start of the string. I don't see where you call free() which you should learn to do. I just gave this code a quick look.

And therefore:

May I suggest something that will make your efforts easier?
There are several string functions that live in the

<string.h>

header file: strdup is one.
You should be using those functions, not rolling your own, given the way you have written your code.
Try:

more /usr/include/string.h

to locate some interesting library functions,
then read the man page for

strdup

and some other very helpful C library functions

Learn about strcpy, strstr, strchr, strdup - there are several other good ones to know, too. You decide.

jose_spain · April 21, 2018, 10:16am

Thanks, but I want to understand pointers use and this is the reason that I do it with a pointer.

I have done new tests with this program and I have found that:

1) If tmp is declared and initialized in main like this, it works correctly without any changes to the "reverse" function:

char* tmp;

  tmp = (char*) malloc ((strlen("hello") + 1) * sizeof(char));
  strcpy(tmp, "hello");
  printf("tmp=%s\n" , tmp);
  reverse(tmp);
  printf("After calling reverse: tmp=%s\n\n" , tmp);

The program prints:

tmp=hello
After calling reverse: tmp=olleh

2) If instead I do this:

char* tmp = "hello";

  printf("tmp=%s\n" , tmp);
  reverse(tmp);
  printf("After calling reverse: tmp=%s\n\n" , tmp);

Then I get the segmentation fault.

What is the difference?

In this second case, If I check the value of s in reverse, it is:
"hello" + 0 (end of string mark)

Don_Cragun · April 21, 2018, 2:11pm

The initializer in char* tmp = "hello"; is a string constant. The C compiler is allowed to store string constants in read-only memory. If you want to overwrite a string, that string cannot be a string constant.

dodona · April 21, 2018, 5:20pm

You can't do that, because gcc places the literal "hello" of char* tmp = "hello" in a read-only data segment, whereas char tmp[] = "hello" creates a writable array that is initialized with a copy of the literal.

dryden · April 22, 2018, 6:06am

I am scared of Don Cragun, because he knows everything ;-).

Yes, interesting. So because that array stores actual memory on the stack, you cannot change what it points to, hmmmm. I thought arrays were pointers, until I tried to assign an array (pointer) to something else ;-). [I mean the reverse, assign something else to that array].

Don_Cragun · April 22, 2018, 10:43am

Don't be scared of me! I don't know everything, as I have unfortunately proven in earlier posts in this forum (but I do try to admit when I make mistakes). :o

Let me expand a little on what dodona and I have said in earlier posts...
Inside a function definition (such as in main() shown in post #1 in this thread), the declarations in main() :

int main()
{
  int iv = 10;
  struct book *mynewbook;
  int len;
  char* words1;
  char* initial_words = "hello_pleased_to_met_you";
  char* tmp = "hello";
  ...
}

create:

1) an integer named iv on the stack and initializes it to the value 10,
2) a pointer named mynewbook that can be used to access a structure of type book but does not allocate any space for a structure of that type and the value assigned to that pointer will be any random value found on the stack where that pointer is located,
3) an integer named len containing whatever random value is located on the stack at the address assigned to that integer,
4) a pointer named words1 that can be used to access an object of type char that points to a random address depending on whatever value is located on the stack at the address assigned to that pointer,
5) a pointer named initial_words that can be used to access an object of type char that points to the first character of the string "hello_pleased_to_met_you" which might be located in read-write memory on the stack, in read-only memory that is not located on the stack, or in read-write memory that is not located on the stack, and
6) a pointer named tmp that can be used to access an object of type char that points to the first character of the string "hello" which might be located in read-write memory on the stack, in read-only memory that is not located on the stack, or in read-write memory that is not located on the stack.

With most modern compilers the strings mentioned in points 5 and 6 above will be located in read-only memory and will, therefore, generate a segmentation fault if you try to change the data in those strings.

Early C compilers (in the 1970's and 1980's) frequently put these arrays in read-write memory. And when you had code that tried to overwrite these strings, they succeeded. This had the side-effect of turning string constants into variables whose constant string values were not constants while the process was running.

To create an array of characters on the stack that can be read and written instead of a pointer on the stack that points to an array of characters that might be read-only, you need to use a declaration more like:

  char initial_words_array[25] = "hello_pleased_to_met_you";
  char tmp_array[6] = "hello";

Both of these create arrays of characters on the stack. The constant string initializers will be copied into these arrays (on the stack) every time the function is invoked.

Arrays of characters and pointers to characters are two very different things. An array of characters has a size that is the number of characters that can be stored in it. A pointer to a character (or a pointer to an array of characters) has a constant size (usually 4 bytes per pointer on a system with a 32-bit address space or 8 bytes per pointer on a system with a 64-bit address space). You can increment a pointer to point to the next element in the array to which it points. You can't increment an array (although you can increment elements of an array). Although an array name is not a pointer, C allows an array name used without following square brackets to be used as a synonym for the address of the first element of that array. So, if I had the declarations:

  char *tmp;
  char tmp_array[6] = "hello";

then both of the following lines of code set the pointer tmp to point to the h in the string hello :

  tmp = &tmp_array[0];
  tmp = tmp_array;

To then update the pointer to point to the next character in the array, you can use any of the following lines of code:

  tmp++;
  ++tmp;
  tmp = tmp + 1;
  tmp = &tmp_array[1];
  tmp = tmp_array + 1;

but you can't use either of:

  tmp = tmp_array++;
  tmp = ++tmp_array;

because tmp_array is an array; and an array is not a pointer type.

dryden · April 22, 2018, 11:34am

Yes, I knew most of that. After all, I have Java experience (just kidding).

Well I knew most of that (except the history) -- I didn't know that once it was different.

It's difficult to assess because "segmentation fault" is really about the only error message you ever get :-/.

I'm lucky to know it means "page fault" -- the first time it happened to me I had to think about what it could mean and then I figured it must be read only,

lately I found out that the "const" keyword does about the same thing, but only on the compiler level this time -- but I'm changing the subject a bit I guess :p.

Yes, but I often think "An array is a pointer" to remember that I shouldn't use &array to get its address.

But I'm happy to learn this workaround, although I prefer to use strdup("string") instead..... although.... and strdupa also has its own caveats.

Funnily, in Java Strings are also immutable, and you have to use StringBuffer (or StringBuilder) instead to change them. Because of that reason, porting from C to Java is sometimes less strange than you would expect.

I'm actually quite interested to start using that garbage collector though (libgc1c2), although the number of applications (on Linux) that actually uses it is rather limited, only Inkscape depends on it on my system.

Anyway, enough.

Corona688 · April 23, 2018, 6:09pm

It's implicitly a pointer, implemented as one internally, but the same way you're not allowed to change a string's contents, you're not allowed to change its value. Hardcoded.

dryden · April 24, 2018, 2:25am

Those... aren't really similar things. The array's address (the name) is only known at compile time, or debug info, but in the compiled code the 'name' is effaced and you're simply dealing with addresses hardcoded into the code.

Thus, there isn't really anything you can modify unless you were to write the code segment.

The string, I guess it could also be located in a code segment, but more likely is that they are actual data values in a read-only data segment.

As such, the array is not really a pointer, there is no variable anywhere holding its address. So, it's no different from not being able to change the address of some int value that you have defined.

When you do

char arr[20]; char *p = arr;

That's technically no different from

int i; int *q = &i;

But I find the biggest annoyance(?) of arrays and structs to be that you can only initialize them at declaration?