Newline in ANSI-C standard functions

Can someone outline the "best practice" (if any!) for handling newlines with the ANSI C standard library functions?
I recently got confused by these functions when using them with char arrays and char pointers:
puts(), printf(), strcpy(), strncpy(), memset().
I think I understand their basic use, but some situations left me quite confused.

#include <stdio.h>
#include <string.h>

//Some said don't use strcpy() but use strncpy()!!!
int main () {
   char str1[128];
   char str2[256];
   int len1, len2;
//   strcpy(str1, "this is string1");
//   strcpy(str2, "That is test string2");
   len1=strlen("this is string1");
   len2=strlen("That is test string2");

   //memset(str1, '\0', sizeof(str1));   //why is it needed?
   strncpy(str1, "this is string1", len1);
   strncpy(str2, "That is test string2", len2);
   puts("___Line 1___");
   puts(str1);
   puts("___Line 2___");
   puts(str2);
   return (0);
}
 

1) For some unknown reason, I sometimes got this output:

this is string10That is test string2
That is test string2.

With the memset() call inserted, I always got the correct output. But I do not often see memset() combined with strncpy()/strcpy().
2) The four puts() calls are in a row, but sometimes they gave an extra blank line that seems to come from an extra newline (especially in the next example), although I read that puts() only appends a newline at the end.
Here is another example, where I tested with printf() and puts():

#include<stdio.h>
#include<stdlib.h>
#include<string.h>

//Tried from scratch combining const/strlen/malloc/'\0' concepts

void swap(char* s1, char* s2) {
         *s1 = *s1 ^ *s2;
         *s2 = *s2 ^ *s1;
         *s1 = *s2 ^ *s1;
}

char* str_reverse_in_place_by_swap(char* str){
    int len = strlen(str);
    char* start = str;
    char* end = start + len - 1; //-1 for '\0'; Made a mistake: char* end = start + len;
    while(start < end )
    {
        swap(start, end);
        ++start;
        --end;
    }
return str;
}


int main(int argc, char *argv[])
{
  char* string1 = malloc(256*sizeof(char));    //first allocate 256 bytes long

  printf("Type a String to reverse it[max. 255 chars]:\n");

  if (fgets(string1, 256, stdin) == NULL)
      printf("Error! Non-empty string is needed!\n");

  char* str3 = malloc( (strlen(string1)+1) * sizeof(char) );
  printf("Before reverse string1(which should be empty!): %s\n", str3);

  printf("By reverse_in_place_by_swap():");
//  str3 = str_reverse_in_place_by_swap(string1);
//  puts(str_reverse_in_place_by_swap(string1));
  printf("%s\n", str_reverse_in_place_by_swap(string1));
  free(str3);

  return EXIT_SUCCESS;
}

I must have missed some important rules for using these functions,
so I'm asking about "best practice" to narrow down my struggle, if that is reasonable. Thanks.

None of these do anything weird with newlines at all.

Blindly using anything without knowing what it's for or what it does will cause problems, as it has here. strcpy would have worked.

strncpy doesn't want the size of the origin string: It wants the size of the destination buffer.

And len1 is one byte short of what the origin string needs, anyway. strlen gives you the length without the NUL terminator, so strncpy stops before it copies the terminator and leaves the destination unterminated, showing whatever garbage might have been left behind it. If you'd used strcpy, you would have been safe.
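
To make the missing terminator visible without relying on garbage, here is a minimal sketch. The helper name byte_after_strncpy and the 32-byte buffer are made up for illustration. It pre-fills the buffer with 'X', copies with strncpy, and reports the byte sitting right after the copied text.

```c
#include <assert.h>
#include <string.h>

#define BUF_SIZE 32  /* arbitrary size chosen for this illustration */

/* Hypothetical helper: fill a buffer with 'X', copy src into it with
   strncpy using the given count, and return the byte that follows the
   copied text. Assumes src is shorter than BUF_SIZE and count fits. */
char byte_after_strncpy(const char *src, size_t count)
{
    char buf[BUF_SIZE];

    assert(strlen(src) < BUF_SIZE);
    assert(count <= BUF_SIZE);

    memset(buf, 'X', sizeof buf);   /* stand-in for "whatever was there before" */
    strncpy(buf, src, count);       /* copies at most count bytes */
    return buf[strlen(src)];        /* '\0' only if strncpy wrote a terminator */
}
```

Called with count equal to strlen(src), it returns 'X': the terminator was never written. Called with count equal to BUF_SIZE, it returns '\0', because strncpy pads the rest of the destination with zero bytes when the source is shorter than the count.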

Again, random garbage left afterwards from a string that didn't get terminated.
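
One more likely source of a stray newline in the second example, added here as a general fact about fgets: fgets stores the newline that ends the line in the buffer when it fits. Reversing "abc\n" in place gives "\nabc" reversed as "\ncba", so the newline is printed first. A minimal sketch of stripping it before reversing follows. The helper name strip_newline is made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: cut the string at the first '\n', if there is one.
   strcspn returns the index of the first '\n', or the string length when
   there is none, so the assignment is harmless in both cases. */
char *strip_newline(char *line)
{
    assert(line != NULL);
    line[strcspn(line, "\n")] = '\0';
    return line;
}
```

In the second example it would be called on string1 right after a successful fgets and before the reverse.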

In your second example, you just confused yourself with pointer logic. "=" does not copy what a pointer points to. It changes the pointer itself!

With the commented-out str3 = str_reverse_in_place_by_swap(string1), str3 ends up pointing to the exact same memory as string1.
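
A minimal sketch of the difference between pointing at the same memory and owning a copy. The helper name duplicate_string is made up for illustration. Note also that reassigning a pointer returned by malloc loses the only reference to that block, so it can no longer be freed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a freshly allocated copy of src,
   or NULL if malloc fails. The caller must free() the result. */
char *duplicate_string(const char *src)
{
    size_t size;
    char *copy;

    assert(src != NULL);
    size = strlen(src) + 1;         /* +1 for the terminating '\0' */
    copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, src, size);    /* copies the bytes, terminator included */
    return copy;
}
```

With char *alias = original; a later change to original[0] shows up through alias as well, because both name the same bytes. A pointer obtained from duplicate_string(original) keeps the old contents.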


Thank you so much for your patience with my basics; I feel I am getting better at understanding this.
"strncpy doesn't want the size of the origin string: It wants the size of the destination buffer. ...... If you'd used strcpy, you would have been safe."
To see how things work in memory and understand the garbage left behind, I tried the following code and got an error in Block Two: *** buffer overflow detected ***: ./memset01 terminated

#include <stdio.h>
#include <string.h>

//More about that some said don't use strcpy() but use strncpy()!!!
int main () {
   char str1[128];
   char str2[256];
   int len1, len2;
   len1=strlen("this is string1");
   len2=strlen("That is test string2");
   printf("Length of str1[128]: %d\n", len1);
   printf("Length of str2[256]: %d\n\n", len2);

   strcpy(str1, "this is string1");
   strcpy(str2, "That is test string2");

   printf("Length of strlen(str1): %lu\n", strlen(str1));
   printf("Length of strlen(str2): %lu\n\n", strlen(str2));

/*Start of block One***********************/
   memset(str1, 'X', strlen(str1)+113); // fills the first 128 bytes of the memory area pointed to by str1 with the constant byte 'X'.
   memset(str2, 'X', strlen(str2)+108); // fills the first 128 bytes of the memory area pointed to by str2 with the constant byte 'X'.

   strncpy(str1, "this is string1", len1);
   strncpy(str2, "That is test string2", len2);
   puts("___Line 1___:");
   puts(str1);
   puts("___Line 2___:");
   puts(str2);
   printf("------------------------------------------------------------------\n");
/*End of block One***********************/

/*Start of block Two***********************/
   memset(str1, 'X', strlen(str1)+113); //fills the first 128 bytes of the memory area pointed to by str1 with the constant byte 'X'.
   memset(str2, 'X', strlen(str2)+108); //fills the first 128 bytes of the memory area pointed to by str2 with the constant byte 'X'.
   strcpy(str1, "this is string1");
   strcpy(str2, "That is test string2");

   puts("___Line 1___:");
   puts(str1);
   puts("___Line 2___:");
   puts(str2);
   printf("------------------------------------------------------------------\n");
/*End of block Two***********************/
   return(0);
}
 

If I switch the two blocks, there is no buffer overflow for the memset().
What is going on in the memory? Thanks!

You are depending on the existence of a terminating NUL. You write 128 bytes in total, which leaves no room for one in the shorter string str1, so there is no terminating NUL. strcpy, printf and lots of other functions depend on correctly terminated strings. Yours is not.

memset does not care where the end of a string is. Make it str1[129], then write fewer than 129 bytes in total to str1. Or keep the size and write fewer characters to str1.
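
A minimal sketch of filling a buffer while leaving room for the terminator. The helper name fill_as_string is made up for illustration. It takes the size from the buffer itself rather than from strlen, which only works on a string that is already terminated.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: fill buf with the given character and terminate it.
   size is the full size of the buffer, e.g. sizeof str1. */
void fill_as_string(char *buf, size_t size, char fill)
{
    assert(buf != NULL);
    if (size == 0)
        return;                     /* no room even for the terminator */
    memset(buf, fill, size - 1);    /* leave the last byte alone */
    buf[size - 1] = '\0';           /* so the result is a valid string */
}
```

For char str1[128], fill_as_string(str1, sizeof str1, 'X') writes 127 'X' characters followed by '\0', so strlen(str1) is 127 and nothing is written past the array.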

You are getting undefined behavior. That means I cannot know what your environment is doing or has done. You cannot ever know either. It could crash the program, compose a poem, or order a pizza.

I am guessing. If str1 precedes str2 in memory and each one is word-aligned, then the last character you write to str1[128] effectively causes the end of str1 to be the actual end of str2, as far as strcpy, printf and so on are concerned.

So, when you exchange the order of the two variables you change the observed behavior.


Stop that! That's not how you use strncpy! And also not why people use it.

Blindly using strncpy because people call strcpy "bad" is worse than just using strcpy in the first place. You are not correcting the risk strcpy allows (unlimited input lengths despite limited buffer size), and you are causing problems strcpy wouldn't have had in the first place.

Grasp the basics first and you'll understand what they're talking about.


Thanks Jim and Corona:
I was not sure the NUL terminator was handled correctly.
"If str1 precedes str2 in memory and each one is word-aligned, then the last character you write to str1[128] effectively causes the end of str1 to be the actual end of str2."
This is quite twisted to me!! Now my understanding comes to this:

1) Doing strncpy() first left no NUL terminator, which gave me the buffer overflow, and strcpy() could not run at all.
2) Doing strcpy() first instead ensures the correct NUL terminator; the strncpy() that follows then seems to work, although it still does not provide a NUL terminator. Because the program exits, the problem just did not show up.

Is this correct?
"Blindly using strncpy because people call strcpy "bad" is worse than just using strcpy in the first place."

I do not know the risk of strcpy() or the correct use of strncpy(). They just happened to come up in my exercise. I thought figuring out the details might help me understand what's going on in memory, which is why I tried memset(). Thanks again.

The risk of strcpy is this:

char buf[8]; // Only room for 8 characters
char very_important_variable_touch_and_the_world_explodes=42;

// More than 8 characters, where does the rest go?
strcpy(buf, "HEY GUYS ALJ AF MY FACE IS A ROTTORN BANANA");

So instead, you can do this:

char buf[8]; 
char very_important_variable_touch_and_the_world_explodes=42;

strncpy(buf, "HEY GUYS ALJ AF MY FACE IS A ROTTORN BANANA", 8);

Which sets the contents of buf to { 'H','E','Y',' ','G','U','Y','S' } and doesn't destroy the world. But because it ran out of room, it couldn't store the null terminator. Meaning buf[] is not guaranteed to contain a proper string after you do strncpy, so trying to use it as a string could cause undefined results and possibly blow up the world. So to be really safe, you do:

char buf[8]; 
char very_important_variable_touch_and_the_world_explodes=42;

strncpy(buf, "HEY GUYS ALJ AF MY FACE IS A ROTTORN BANANA", 8);

buf[8-1]='\0';

Which sets buf to { 'H','E','Y',' ','G','U','Y','\0' }, which guarantees the buffer is null-terminated whether strncpy ran out of room or not, and does it all without destroying the world.
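
The same pattern can be wrapped once so the terminator is never forgotten. This is only a sketch: the helper name copy_bounded and its return convention are made up for illustration, not a standard function.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copy src into dest without writing more than
   dest_size bytes, and always terminate dest.
   Returns 1 if the whole string fit, 0 if it had to be cut short. */
int copy_bounded(char *dest, size_t dest_size, const char *src)
{
    assert(dest != NULL && src != NULL);
    if (dest_size == 0)
        return 0;                           /* nothing can be stored */
    strncpy(dest, src, dest_size - 1);      /* leave room for the terminator */
    dest[dest_size - 1] = '\0';             /* terminate whether or not it fit */
    return strlen(src) < dest_size;
}
```

With char buf[8], copy_bounded(buf, sizeof buf, "HEY GUYS ALJ") stores "HEY GUY" and returns 0, so the caller can tell that the text was truncated.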
