Shared memory in shared library

I need to create a shared library to access an in-memory DB. The DB is not huge, but big enough to make it cumbersome to carry around in every single process using the shared library. Luckily, it is pretty static information, so I don't need to worry much about synchronizing the data between instances of the shared library. However, what I do need to worry about is initializing the memory once and never again.

I would love for all the shared libraries to have a library-scoped variable, but I don't think that's possible. If it is, let me know...but as far as I know, each process normally gets its own copy of the shared library's data segment. I don't think there is a way to flag a portion of that data segment as shared across all instances of the library.

So, I guess I'll need to have the library access the DB in shared memory. However, the first instance of the library to start up needs to create the shared memory, attach to it, and load it all before any other possible instances can use it. How do I guarantee that this create/initialize happens once before the shared memory is used?

At first, I thought the non-existence of the shared memory segment would do it...but there exists a race condition between create and initialize wherein another instance of the library would see the shared memory but not be able to access it because it has not yet been initialized. I know I can store POSIX mutexes in shared memory...so I guess I can use one of those. However, the other problem exists when the shared memory segment is "left over" from previous runs.

What happens then? So...all the libraries "detach" from the segment, but none of them destroy it. Now what...the next time the library starts fresh it should re-initialize the shared memory, but it won't.

Hummm...any ideas? Some direction? What is the common method of providing shared memory synchronization? Should I use a system semaphore? Wouldn't it suffer from the same "warm" start problem (processes have detached from the shared library but the memory/semaphores are still resident)?
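A sketch of one possible approach: serialize create-plus-initialize behind an fcntl() record lock on a small lock file, and stamp a magic/version field into the segment header so a stale leftover segment can be detected and rebuilt. The kernel drops an fcntl() lock automatically when its holder exits, so it can't go stale the way a leftover semaphore or segment can. (DB_LOCK_FILE and init_if_needed() are just made-up names.)

    #include <fcntl.h>
    #include <unistd.h>

    #define DB_LOCK_FILE "/tmp/mydb.lock"   /* hypothetical path */

    static int with_init_lock(int (*init_if_needed)(void))
    {
        int fd = open(DB_LOCK_FILE, O_CREAT | O_RDWR, 0600);
        if (fd == -1)
            return -1;

        struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET };
        if (fcntl(fd, F_SETLKW, &fl) == -1) {   /* blocks until we hold the lock */
            close(fd);
            return -1;
        }

        /* While holding the lock: attach the segment (creating it if needed),
         * compare a magic/version field in its header against what this build
         * expects, and (re)load the DB if it does not match.  A stale segment
         * left over from an old run fails the check and gets rebuilt. */
        int rc = init_if_needed();

        fl.l_type = F_UNLCK;
        fcntl(fd, F_SETLK, &fl);
        close(fd);
        return rc;
    }

Every instance of the library would call this during its own startup; whoever takes the lock first does the real work, and the rest find the magic value already in place.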

This will be written for AIX 5, BTW, for those interested.

You could have an additional process whose job it is to create and shut down the shared memory. This process could have some form of IPC so the other processes can attach and detach; when the last one goes, the manager process can clean up and die.

You could have the manager process fork/exec'd by the client libraries when they start, so it does not have to be prestarted, and have its setuid bit set so it runs with the appropriate rights rather than those of whichever user happened to start it first.

If the clients connect to the manager process over a UNIX domain socket, then the manager could use poll() on all connections to monitor that the processes are alive; even if a client exits uncleanly, its socket connection will still die.
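A rough sketch of what that monitoring loop could look like (illustrative only; clients[], nclients, and the accept/registration code are assumed to live elsewhere). Each client keeps one connection open for its lifetime, and poll() reports POLLHUP or end-of-file when a client dies, even if it crashed without detaching cleanly:

    #include <poll.h>
    #include <unistd.h>

    #define MAX_CLIENTS 64

    extern int clients[MAX_CLIENTS];   /* fds of connected clients */
    extern int nclients;

    static void watch_clients(void)
    {
        while (nclients > 0) {
            struct pollfd pfds[MAX_CLIENTS];
            for (int i = 0; i < nclients; i++) {
                pfds[i].fd = clients[i];
                pfds[i].events = POLLIN;
            }

            if (poll(pfds, nclients, -1) <= 0)
                continue;

            for (int i = nclients - 1; i >= 0; i--) {
                char buf[64];
                if ((pfds[i].revents & (POLLHUP | POLLERR)) ||
                    ((pfds[i].revents & POLLIN) &&
                     read(pfds[i].fd, buf, sizeof buf) == 0)) {
                    close(clients[i]);                 /* that client is gone */
                    clients[i] = clients[--nclients];  /* drop it from the set */
                }
                /* a real manager would parse any request data read here */
            }
        }
        /* last client gone: destroy the segment and exit */
    }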

I was considering this approach, and it may work...but I thought to myself, why bother using shared memory if I'm going to have a socket connection anyway? In that case, I may as well just relay the entire request to the manager process and let it make the decision, and not make the memory shared at all.

Hummm....

Another shared memory question:

Storing pointers in shared memory (pointing, of course, to other areas of shared memory): can this be done? I think the answer lies in how you attach the shared memory to the process, no? Looking at the shmat call, the second parameter specifies the memory address. I assume this is the "base address" given to the process for the shared memory segment. I would guess that pointers will be valid across all applications accessing the shared memory if and only if all applications specify this parameter identically when attaching the segment. Is this correct?

If the above is correct, then I would guess that I may compete with other applications for the address I want to attach to (being a library and all) and that I may not easily be able to guarantee that I can get the address I want. To get around this, the OS apparently allows one to pass NULL for this parameter, and the OS will choose an available address to map to. In that case, I could not really store pointers in shared memory; rather, I must store offsets and let the application compute the actual pointer value by adding its individual "base address." This is obviously performance-draining....

So...how is that "obstacle" usually overcome?

  1. Understand the process memory map on a particular OS.

  2. Attach the shared memory as early as possible in process startup, to prevent that memory's use by other, later activities.

  3. If you fork, you will have two processes using the shared memory at the same address. I can't confirm whether the shared memory actually gets detached when you exec(); that would be worth finding out.
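A rough sketch of the two attach styles in question (DB_FIXED_ADDR is a made-up value; what is actually free depends entirely on the platform's process memory map):

    #include <stddef.h>
    #include <sys/shm.h>

    #define DB_FIXED_ADDR ((void *)0x70000000)   /* hypothetical, OS-specific */

    void *attach_fixed(int shmid)
    {
        /* Raw pointers stored in the segment stay valid across processes,
         * but the call fails (EINVAL) if that range is already in use;
         * SHM_RND rounds the address down to a legal boundary (SHMLBA). */
        return shmat(shmid, DB_FIXED_ADDR, SHM_RND);
    }

    void *attach_anywhere(int shmid)
    {
        /* The kernel picks the address, which may differ in every process,
         * so anything stored inside must be an offset, not a raw pointer. */
        return shmat(shmid, NULL, 0);
    }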

It is part 2 that worries me. Writing a shared library that "insists" on getting a certain region of memory to attach to seems prone to problems. However, writing it to attach at varying regions seems prone to problems too, because all applications must attach at the same region if pointers are to be valid across them.

Humm...maybe it should be configurable...it doesn't seem fair, however, to make the user set a value. Grrrrrrrr... Always a trade-off...make it fast, or make it reliable.... The reliable route would be to store all "pointers" as offsets...or as a page/offset pair. But that means the application must always translate my pointer type into the actual pointer.... Slow...tedious, error-prone. Grrrrrrrrrrrrr!!!!!!!!!

I do not share your trepidation regarding the performance hit. This is virtually how an array reference is performed, and I use arrays quite a bit. Switching your app entirely to arrays and never using pointers at all might actually improve performance, provided that you use the optimizer. In any event, many implementations do not allow you to choose the address of a shared memory segment, and portable code should not rely on having that option. Shared libraries are compiled using PIC (position-independent code) despite the fact that there is often a minor performance hit with PIC. Shared data segments should also be position-independent. It's the cost of doing business.
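For what it's worth, the fix-up in question is just an add; it's essentially the same arithmetic an array subscript compiles to (a rough illustration, with struct record as a made-up type):

    #include <stddef.h>

    struct record { int key; /* ... */ };

    /* Array indexing: compiles to base + i * sizeof(struct record). */
    struct record *by_index(struct record *base, size_t i)
    {
        return &base[i];
    }

    /* Offset "pointer": base + off -- essentially the same fix-up. */
    struct record *by_offset(void *base, size_t off)
    {
        return (struct record *)((char *)base + off);
    }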

That works beautifully if I want to partition the shared memory into several buckets and reference each bucket by its index. However, given that the DB is made up of differently sized pieces of information, I must either pick a bucket size big enough to store anything (and waste space on smaller things) or allocate dynamically sized buckets and pass around pointers (as indexes no longer work).

Essentially, I was thinking I could create a version of malloc that operated within a shared memory region and then use it to allocate items in the DB dynamically to be stored in a chained hash table.

The "performance hit" on pointers is that I need to store the "pointer" to the bucket that I allocated (via my malloc routine) in shared memory somehow. Either that pointer is a native pointer into shared memory, or it is an offset into shared memory that every time application code goes to access a pointer it'll need to perform a conversion routine against it to acquire its position independant address. This would be required for either the array or non-native pointer methods. I guess I could tell the application (in the non-native pointer method) that the shared memory is a huge array of characters and access pointers through an "index" into the character array cast to the appropriate data type...but this seems just as ugly.

Maybe creating an intermediate "malloc" library on top of a shared memory segment is silly...but I don't know of a better way to store variously sized, dynamically allocated data in any memory segment without wasting space on statically sized buckets.
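Such an allocator can actually stay quite small if the free() side is ignored: a bump allocator living at the front of the segment, handing out offsets rather than pointers (names and layout below are just assumptions, not a worked implementation):

    #include <stddef.h>

    #define SHM_NULL ((size_t)0)     /* offset 0 is reserved for the heap header */

    struct shm_heap {
        size_t size;                 /* total bytes in the segment */
        size_t next;                 /* next free offset; the creator initializes
                                        it to sizeof(struct shm_heap) */
        /* a real allocator also needs a free list and a process-shared lock */
    };

    /* Hand out an offset rather than a pointer; offsets stay valid no matter
     * where each process happened to attach the segment. */
    size_t shm_alloc(struct shm_heap *heap, size_t n)
    {
        size_t off = (heap->next + 7) & ~(size_t)7;   /* 8-byte alignment */
        if (n > heap->size || off > heap->size - n)
            return SHM_NULL;                          /* segment is full */
        heap->next = off + n;
        return off;
    }

    /* Convert an offset back to a pointer in this process's address space;
     * 'heap' must be the address this process attached the segment at. */
    void *shm_deref(struct shm_heap *heap, size_t off)
    {
        return off == SHM_NULL ? NULL : (void *)((char *)heap + off);
    }

A free list and a process-shared lock would be the obvious next additions if items ever need to be released or allocated concurrently.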

That is more or less what I had in mind. Remember, I claim this is not terribly inefficient. I did not mean to claim that it would be beautiful. You might be able to use some macros to help make it less ugly.
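Something along these lines, perhaps (purely illustrative; shm_off_t, SHM_PTR, and struct bucket are made-up names):

    #include <stddef.h>

    /* Store offsets in the shared structures; convert at the point of use. */
    typedef size_t shm_off_t;
    #define SHM_PTR(base, off, type)  ((type *)((char *)(base) + (off)))

    /* e.g. a chained hash bucket kept entirely in terms of offsets: */
    struct bucket {
        shm_off_t key_off;    /* offset of the key string within the segment */
        shm_off_t next_off;   /* offset of the next bucket; 0 = end of chain */
    };

    /* const char *key = SHM_PTR(shm_base, b->key_off, const char); */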

Doable...certainly...I wonder what the compiler does for PIC code. Lucky for it, though, that it gets to wrap every memory assignment and use in the code. I have to enforce such conversions procedurally, and I'm sure someone will break my "homebrew PIC" code, lol.

But, you're probably right, it is likely not too inefficient. However, add on the "easy to break" factor and...well.... I'm basically trading a slight inefficiency and some code maintainability for portability. The question is, how worth it is it? The one nice thing I see about this method is that I can (at a later time) extend it to multiple shared memory segments with ease. Obviously the native pointer method does that by default...but asking for more than a few segments all attached at specific points in memory could start to get hairy. This way I'm assured I can extend my memory requirements into multi-segment territory without fearing I can no longer attach a segment at a position-dependent location.

Hummm.... I wonder if this is even a reasonable method of accessing the shared memory.... I mean...I wonder whether any other application has created its own shared memory allocator, or whether they just carve out individual segments of shared memory for each of their data types (which seems like a bad idea given you can only attach a limited number of segments per process).

Any good books? I mean...ones that advise you how to use shared memory, not just explain the interface to it.

Try the W. Richard Stevens books.

Also, the same rules that apply to multithreading apply to shared memory usage, e.g., guard it well.

Yeah...I have the "Advanced Programming in the UNIX Environment" book; it tells you about the interface and explains a bit...but I'd like to see more practical examples/case studies of its usage.

Thanks, though...I doubt there's much like that...I'll just have to be creative, lol.

Interprocess Communications in Linux: The Nooks & Crannies by S. Gray

Assuming you are using Linux or a Unix with a /proc filesystem. It gives examples of most situations.

...edit: my bad, you're on AIX. All bets are off.

Yeah...a pity, I know...lol.