Linux Page Sharing

kumaran_5555 · September 17, 2011, 12:06pm

Hi,

I have the following questions about page sharing in Linux (please also correct me if any of my assumptions are wrong):

In recent kernels, KSM does page sharing for user processes' anonymous pages.
What about the pages where the program text is stored? Are they shared between two unrelated processes (by unrelated I mean they have no parent/child/sibling relationship)?
A parent and child share these pages after a fork, as long as the child has not executed a new program.

What I really want to know is this: if two processes have no relationship between them but have some pages with the same content, can the kernel identify those pages and keep a single copy?
(The pages could be from the text area or the user's anonymous pages.)

Please help with your inputs.

Thanks

Corona688 · September 17, 2011, 2:47pm

No merging is necessary to share program text -- program text is 100% shared already, because it's loaded with memory mapping.

It's easy to share file-backed memory maps because they're not anonymous. No contents need to be checked, just locations. Map the same location, get the same pages.
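
A minimal sketch of "map the same location, get the same pages", assuming a Linux system and a scratch file the function is allowed to overwrite. The function name same_file_shares_pages is hypothetical. It maps the first page of one file twice and checks that a write through one mapping shows up through the other; two unrelated processes mapping the same file should behave the same way.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper, for illustration only.
 * Maps the first page of `path` twice with MAP_SHARED and reports whether
 * a write through the first mapping is visible through the second.
 * Returns 1 if it is, 0 if it is not, -1 on error.
 * `path` is a scratch file and will be created or truncated. */
int same_file_shares_pages(const char *path)
{
    assert(path != NULL);
    long page = sysconf(_SC_PAGESIZE);

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, page) != 0) {   /* file must cover the mapped page */
        close(fd);
        return -1;
    }

    char *a = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    char *b = mmap(NULL, page, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                        /* mappings stay valid after close() */
    if (a == MAP_FAILED || b == MAP_FAILED) {
        if (a != MAP_FAILED) munmap(a, page);
        if (b != MAP_FAILED) munmap(b, page);
        return -1;
    }

    strcpy(a, "hello");               /* write through the first mapping */
    /* Different virtual addresses, same underlying page of the file. */
    int shared = (a != b) && strcmp(b, "hello") == 0;

    munmap(a, page);
    munmap(b, page);
    return shared;
}
```

Calling same_file_shares_pages("/tmp/share_demo.tmp") from main() should return 1. No content comparison is involved anywhere; both mappings simply refer to the same file offset.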

fpmurphy · September 18, 2011, 1:29pm

From .../kernel/Documentation/vm/ksm.txt
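
For context, that document describes KSM as opt-in: it only scans areas a process has registered with madvise(2) and MADV_MERGEABLE, and the ksmd thread only runs while /sys/kernel/mm/ksm/run is set to 1. Below is a rough sketch of the registration step, assuming a Linux kernel that may or may not be built with CONFIG_KSM. The function name mark_mergeable_demo is hypothetical.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS and MADV_MERGEABLE */
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper, for illustration only.
 * Allocates `npages` anonymous pages, fills them with identical content,
 * and registers the range as a candidate for KSM merging.
 * Returns 0 if madvise() accepted the request, the errno value if it did
 * not (EINVAL is expected on kernels without KSM), or -1 if mmap() failed. */
int mark_mergeable_demo(size_t npages)
{
    assert(npages > 0);
    size_t len = npages * (size_t)sysconf(_SC_PAGESIZE);

    char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    memset(buf, 0x5a, len);   /* every page now holds the same bytes */

    int rc = (madvise(buf, len, MADV_MERGEABLE) == 0) ? 0 : errno;

    /* A real program would keep the region mapped so ksmd has time to
     * scan it; this sketch only shows the registration call. */
    munmap(buf, len);
    return rc;
}
```

A return value of 0 only means the range was registered. Whether anything actually gets merged depends on ksmd running, and can be observed through the counters under /sys/kernel/mm/ksm/ such as pages_shared and pages_sharing.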

kumaran_5555 · September 19, 2011, 2:47am

Thanks for all your input. Can anyone explain a little about how program text sharing happens in Linux when a program is loaded?

42416000    1568     572       0 r-x--  libc-2.13.90.so
4259e000       8       8       4 r----  libc-2.13.90.so
425a0000       4       4       4 rw---  libc-2.13.90.so

This is what I found in the pmap output of a bash process. If another bash process is started, will these areas be shared, and how does the sharing happen when the new bash is created? (Does the kernel keep any information recording that these files are loaded at these addresses?)

Please shed some light on this area.

Corona688 · September 22, 2011, 1:28pm

It's done with memory mapping.

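$ cat owls.c
#include <fcntl.h>      // open()
#include <stdio.h>      // printf(), perror()
#include <string.h>     // strcpy()
#include <sys/mman.h>   // mmap(), munmap()
#include <unistd.h>     // ftruncate(), getpagesize(), close()

int main(void)
{
        // O_CREAT so that the first run works after 'rm -f filename'.
        int fd=open("filename", O_RDWR|O_CREAT, 0660);
        if(fd < 0)
        {
                perror("Couldn't open");
                return(1);
        }
        // Make sure the file is long enough before mapping it.
        // The empty space in the file will be filled with NUL bytes.
        if(ftruncate(fd, getpagesize()) < 0)
        {
                perror("Couldn't resize");
                close(fd);
                return(1);
        }
        // Map the first page of bytes into 'mem'.
        // getpagesize() is 4096 or 8192 bytes on most systems.
        char *mem=mmap(NULL, getpagesize(), PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
        if(mem == MAP_FAILED)
        {
                perror("Couldn't map");
                close(fd);
                return(1);
        }

        printf("Old string was: '%s'\n", mem);

        // This write goes to the shared page, and from there to the file.
        strcpy(mem, "THE OWLS ARE NOT WHAT THEY SEEM\n");
        munmap(mem, getpagesize());
        close(fd);
        return(0);
}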

$ rm -f filename
$ gcc owls.c -o owls
$ ./owls
Old string was: ''
$ ./owls
Old string was: 'THE OWLS ARE NOT WHAT THEY SEEM
'
$ cat filename
THE OWLS ARE NOT WHAT THEY SEEM
$

Any dynamically-linked code is loaded in this fashion, though it'd be mapped read-only and executable, not read-write. The r-x-- line for libc in the pmap output above is one such mapping.

I suppose the kernel would just need to track the inode number and the device (partition) ID of each file-backed mapping. If someone maps the same inode on the same device and the mapped ranges overlap, the overlapping pages can be shared.
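
The device and inode behind each file-backed mapping can be seen from user space in /proc/<pid>/maps, which is where pmap gets its data. Here is a small sketch, assuming Linux with /proc mounted, that prints those fields for the calling process. The function name print_file_mappings and its `needle` parameter are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, for illustration only.
 * Prints device, inode, permissions and address range for every
 * file-backed mapping of the calling process whose path contains `needle`.
 * Returns the number of lines printed, or -1 if /proc/self/maps
 * could not be opened. */
int print_file_mappings(const char *needle)
{
    assert(needle != NULL);
    FILE *f = fopen("/proc/self/maps", "r");
    if (f == NULL)
        return -1;

    char line[4096];
    int matches = 0;
    while (fgets(line, sizeof line, f) != NULL) {
        unsigned long start, end, offset, inode;
        char perms[8], dev[16], path[4096];
        path[0] = '\0';

        /* Line format: start-end perms offset dev inode [pathname] */
        int n = sscanf(line, "%lx-%lx %7s %lx %15s %lu %4095[^\n]",
                       &start, &end, perms, &offset, dev, &inode, path);
        if (n < 6 || inode == 0)
            continue;             /* anonymous or special mapping */
        if (strstr(path, needle) == NULL)
            continue;

        printf("dev=%s inode=%lu perms=%s %lx-%lx %s\n",
               dev, inode, perms, start, end, path);
        matches++;
    }
    fclose(f);
    return matches;
}
```

Running print_file_mappings("libc") in two separate dynamically linked programs should show the same dev and inode pair for the libc text mapping in both, which is the identity the page cache is keyed on.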

The memory savings go deeper than just not reloading the same library 23 times. The kernel uses hardware features of the CPU itself to be notified when a process tries to access a mapped page that isn't in memory yet. This is a page fault: it is raised much like a segmentation fault, except that instead of killing the process, the kernel suspends it, loads the page, and then lets it continue. This allows the kernel to load only the memory pages you're actually using, rather than blindly loading the entire file.
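
Loading on first access can be observed with mincore(2), which reports which pages of a mapping are currently resident. The sketch below uses an anonymous mapping rather than a file, so that the result doesn't depend on what already happens to be in the page cache; the fault-on-first-touch mechanism is the same. The names resident_pages and demand_paging_demo are hypothetical.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS and mincore() */
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: counts how many of the `npages` pages starting at
 * `addr` are resident in memory. Returns -1 on error. */
static int resident_pages(void *addr, size_t npages, size_t page)
{
    unsigned char *vec = malloc(npages);
    if (vec == NULL)
        return -1;
    if (mincore(addr, npages * page, vec) != 0) {
        free(vec);
        return -1;
    }
    int count = 0;
    for (size_t i = 0; i < npages; i++)
        count += vec[i] & 1;   /* low bit set means the page is resident */
    free(vec);
    return count;
}

/* Hypothetical helper, for illustration only.
 * Maps `npages` pages, then records the resident page count before and
 * after touching the first page. Returns 0 on success, -1 on error. */
int demand_paging_demo(size_t npages, int *before, int *after)
{
    assert(npages > 0 && before != NULL && after != NULL);
    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    char *mem = mmap(NULL, npages * page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    *before = resident_pages(mem, npages, page);
    mem[0] = 1;                /* first access triggers a page fault */
    *after = resident_pages(mem, npages, page);

    munmap(mem, npages * page);
    return (*before < 0 || *after < 0) ? -1 : 0;
}
```

On an ordinary Linux kernel the count before the write is typically 0 and at least 1 afterwards, even though the whole range was mapped up front. Exact numbers can differ with huge pages or in sandboxed environments.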

Memory may eventually be paged back out if it falls into disuse, as well. In this manner, mapped segments can operate on things larger than the entire available memory on your system. Many large things like databases use memory mapping to operate on their files.