Maximum buffer size for read()

Hi friends,
Hope everybody is fine. First have a look at my code, then we will talk about it.

 
$ cat copy.c
#include <stdio.h>
#include <stdlib.h>   /* exit */
#include <fcntl.h>    /* open, creat, O_RDONLY */
#include <unistd.h>   /* read, write */
#define PERMS 0644 /*  RW for owner, R for group, others */
#define BUFSIZE 1
char *progname;
int main(int argc,char * argv[])
{
        int f1, f2, n;
        char buf[BUFSIZE];
        progname = argv[0];
        if(argc != 3)
        {
                printf("Usage %s from to\n", progname);
                exit(1);
        }
        f1 = open(argv[1], O_RDONLY);
        if(f1 == -1)
        {
                printf("Can't open %s\n", argv[1]);
                exit(1);
        }
        f2 = creat(argv[2], PERMS);
        if(f2 == -1)
        {
                printf("Can't create %s\n", argv[2]);
                exit(1);
        }
        while((n = read(f1, buf, BUFSIZE)) > 0)
        {
                printf("%c\n", buf[0]);
                if(write(f2, buf, n) != n)
                {
                        printf("Write error\n");
                        exit(1);
                }
        }
        exit(0);
}

As you can see, I've defined the macro BUFSIZE as 1, and that value is the size of my buffer. My question is: what is the unit of that 1? Bytes, kilobytes, megabytes? The file I am reading is very large, containing thousands of lines, so how does the program manage to copy it into another file? Initially I had set BUFSIZE to 512, then I set it to 1, and the program still works fine, copying the large file into another file.
Could you please help me here?

Looking forward to your wonderful replies.

Thanks in advance!

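read() and write() both take (int file descriptor, memory buffer, size_t count in bytes), and each returns an ssize_t: the number of bytes actually read or written. So your BUFSIZE of 1 means one byte, and the copy still works because your loop keeps calling read() until it returns 0 at end of file. You DO NOT WANT a 1 byte buffer for read except in very special cases. Modern disks have large fetch buffers, and modern filesystems read and write really big chunks, all because that is a more efficient use of resources.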

Example: Compellent SAN storage uses a preferred internal block size of 2 MB. The UFS filesystem has a default buffer size of 1 MB.

So if you periodically read 1 byte at a time, after a while the rest of the data that came in with your original read request is flushed from the cache and has to be read again.

Look at it this way: you are asking the system to pitch 99.99999% of every read when you request 1 byte, then twiddle your virtual thumbs for a while, then come back for byte #2.
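To make the buffer-size point concrete, here is a minimal sketch of a copy loop where the buffer size is a parameter, counted in bytes. copy_fd is a made-up helper name for illustration, not a library function, and it is only one reasonable way to write such a loop. It also shows why even a 1-byte buffer copies a whole file: the loop keeps going until read() reports end of file.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy everything from descriptor `in` to descriptor
 * `out` through a buffer of `bufsize` BYTES.
 * Returns the number of bytes copied, or -1 on error. */
ssize_t copy_fd(int in, int out, size_t bufsize)
{
    char *buf;
    ssize_t total = 0;

    if (bufsize == 0)
        return -1;
    buf = malloc(bufsize);
    if (buf == NULL)
        return -1;

    for (;;) {
        ssize_t off = 0;
        ssize_t n = read(in, buf, bufsize);   /* at most bufsize bytes */

        if (n == -1) {
            if (errno == EINTR)
                continue;                     /* interrupted, try again */
            free(buf);
            return -1;
        }
        if (n == 0)
            break;                            /* end of file */

        /* write() may also transfer less than requested, so loop. */
        while (off < n) {
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w == -1) {
                if (errno == EINTR)
                    continue;
                free(buf);
                return -1;
            }
            off += w;
        }
        total += n;
    }

    free(buf);
    return total;
}
```

With the descriptors from the original program, copy_fd(f1, f2, 1) and copy_fd(f1, f2, 65536) should produce the same output file; the second one just makes far fewer system calls. The 65536 is only an illustrative size, not a tuned value.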

Thanks a lot :)

I recall a user complaining that a change to the Linux kernel reduced the maximum read size to 4 gigabytes. He thought this a large inconvenience for some reason. So whatever the maximum read size is, it's going to be far larger than whatever you're doing...

Of course, it was never safe for him to assume that read() did it all in one go in the first place. Lots of things can happen. You did the right thing by checking read's return value here.
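A quick way to see a short read for yourself is a pipe that holds fewer bytes than you ask for. The sketch below is only an illustration under that assumption; read_from_pipe is a made-up name, and the 256 and 1024 limits are arbitrary.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: put `have` bytes into a pipe, then ask read() for
 * `want` bytes and return what it actually delivered (-1 on error).
 * `have` must be 1..256 and `want` at most 1024 in this sketch. */
ssize_t read_from_pipe(size_t have, size_t want)
{
    char data[256] = {0};
    char buf[1024];
    int fds[2];
    ssize_t n;

    /* have == 0 is rejected because read() would block on an empty pipe. */
    if (have == 0 || have > sizeof data || want > sizeof buf)
        return -1;
    if (pipe(fds) == -1)
        return -1;

    if (write(fds[1], data, have) != (ssize_t)have) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    n = read(fds[0], buf, want);   /* returns what is available, up to want */

    close(fds[0]);
    close(fds[1]);
    return n;
}
```

Calling read_from_pipe(5, 100) hands back 5, not 100: read() returns what is available rather than waiting to fill the buffer. That is why the return value has to be checked and, if you need an exact count, the call has to be repeated.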

The readall function below is what corona is talking about. Example code:

#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/stat.h>
#include <errno.h>

/* Abort with a message if a pointer is NULL. */
void check_null(void *x)
{
    if (x == NULL)
    {
        perror("Fatal error");
        exit(1);
    }
}
/* Keep calling read() until `bytes` bytes have arrived, EOF, or an error. */
ssize_t readall(int fd, char *buf, size_t bytes)
{
    ssize_t bytes_read = 0;
    ssize_t n = 0;

    do {
        if ((n = read(fd,
                      &buf[bytes_read],
                      bytes - bytes_read)) == -1)
        {
            if (errno == EINTR)  // resume on INTR
                continue;
            else
                return -1;
        }
        if (n == 0)              // EOF before `bytes` were read
            return bytes_read;
        bytes_read += n;
    } while ((size_t)bytes_read < bytes);
    return bytes_read;
}

 // example use:

size_t filesize(const char *fname)
{
    struct stat st;
    if (stat(fname, &st) == -1)
    {
        fprintf(stderr, "cannot stat %s ", fname);
        perror("");
        exit(1);
    }
    return (size_t)st.st_size;
}

int main(int argc, char **argv)
{
    if (argc > 1)
    {
        int i;
        for (i = 1; argv[i] != NULL; i++)
        {
            size_t len = filesize(argv[i]);
            char *buf = malloc(len ? len : 1);  /* malloc(0) may return NULL */
            FILE *in = fopen(argv[i], "r");
            check_null(buf);
            check_null(in);
            if (readall(fileno(in), buf, len) == -1)
            {
                perror("read");
                exit(1);
            }
            printf("Read whole file %s: %zu bytes\n", argv[i], len);
            free(buf);
            fclose(in);
        }
        return 0;
    }
    return 1;
}

example run:

> time readall bp.dat plpl 7084231.gpx.big before.txt after.txt
Read whole file bp.dat: 6962739 bytes
Read whole file plpl: 6800010 bytes
Read whole file 7084231.gpx.big: 1728412 bytes
Read whole file before.txt: 1584865 bytes
Read whole file after.txt: 1584892 bytes

real    0m0.362s
user    0m0.001s
sys     0m0.065s