named pipe with persistent buffer

Hey folks,
I need to communicate between 2 processes in a reliable manner; the information delivery has to be guaranteed. My idea is that proc2 sends a signal to proc1 once the information has been written to disc and the write() has been verified (sync/flush). The IPC method for the data is a named pipe.

If proc1 simply writes the data to the named pipe and the system crashes before proc2 can read it and persistently save it to disc, the information is lost.

So my solution to this is:

proc1:
has a physical file (tmpfile) and a named pipe open

  1. writes the data to tmpfile on disc, verifies the write()
  2. writes the data to the pipe
  3. waits for a signal from proc2

proc2:
uses select() or poll() on the pipe

  1. writes the data to disc when the pipe becomes readable, verifies the write()
  2. sends a signal to proc1, meaning the data is persistently saved.

proc1:

  4. empties tmpfile

What do you think about this idea?
Is the pipe necessary?
Could I just as well read the physical tmpfile from proc2, instead of proc1 double-writing to both the physical file AND the pipe? What about performance?

If you have suggestions introducing completely different concepts, please consider that proc1 is an already written program which outputs its data to a physical file. I can't really change proc1; adding a signal handler and a sigsuspend(), along with the tmpfile creation/emptying and the double write, are small changes, but already nearly too big for this process (testing). Regulatory reasons (risk).

My question is also: isn't there anything like a named pipe with a persistent buffer on disc, plus a fast buffer in memory for the usual work? The persistent buffer would be copied to the memory buffer on open(), so data which was written to the pipe but not read before a crash is not lost (provided the crash happened after the pipe-internal write to disc succeeded). The write() call to this kind of pipe should return only when the data is assuredly written to disc, so once write() has returned, the user program can assume the data is now delivered, guaranteed.
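One way to approximate the "write() returns only once the data is on disc" semantics with standard POSIX calls is an append-only journal file opened with O_SYNC, so every successful write() implies the data has reached the device (modulo the drive's own write cache). A minimal sketch, with a made-up function name and path:

```c
#include <fcntl.h>
#include <unistd.h>

/* Append one record to a journal file.  Because the file is opened with
 * O_SYNC, write() does not return until the kernel reports the data (and
 * metadata) as written to the device.  Returns 0 only on a full, synced
 * write. */
int journal_append(const char *path, const void *rec, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND | O_SYNC, 0644);
    if (fd < 0)
        return -1;
    ssize_t n = write(fd, rec, len);   /* blocks until the data is on disc */
    close(fd);
    return (n == (ssize_t)len) ? 0 : -1;
}
```

A reader process would consume this file and remember its offset; persistent message queues such as MQSeries do essentially this bookkeeping (plus crash recovery of the offset) for you.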

Sorry for the long story, but I hope you can help me along with this conceptual problem!

Thanks in advance!
Heck

Yours is an unusual problem. It's not often you need to guarantee your code works when the system can't. No matter how far down you push it, the data is still going to be waiting in memory sometime -- what if the system dies before you finish writing to your temp file? Even a journalling filesystem can't avoid losing data that hasn't been written yet.

Corona -

You are absolutely correct.

The OP & I had the same discussion earlier in another thread. I assume the OP did not like the first answer, and so tried again.

Namara, I did like your suggestions in the other thread. Really, I also wanted to reply there but couldn't order my thoughts yet. I see this thread as a "subproblem".

And ahem... I really do understand the problem, folks. Data which never had the chance to be written to disc can never be recovered (after a crash).
YES, THAT IS HOW IT IS! I know.
I didn't want to dispute this; we don't need to.

The conceptual discussion is aiming at a design of the system which behaves as well as possible in all thinkable situations: _reducing_ the POSSIBLE data loss.
So I realized, as you can see in my post, that I have to save the data to disc as soon as possible.

So far my aims and my concept are NOT nonsense. This is NOT a "yes, you need your computer powered up to run your program" kind of problem.

The focus was on two questions.

  1. I have to save to disc as soon as possible, but I also need to further process the data. What is more clever?
    a.) I duplicate the data, writing it to disc AND to the pipe. The disc copy will never be read if the pipe reader (proc2) signals that it has written the pipe data to disc itself; it exists only to be transaction-safe.
    b.) I store it in a physical file which can be read by proc2 anyway, so I can forget the pipe.

  2. Is there anything like a named pipe with a persistent buffer (disc)? (My answer: yes, MQSeries from IBM - just slightly an overkill.) Nothing else, really?

And I conclude: yes, it is a bit of an uncommon problem. But we are simply at the point where we can't say, "ah, if the machine screws up sometime, everything is screwed up anyway."
We have to be able to say: if the machine screws up, we are prepared as best we can be. It is not some bogeyman that will never happen. It is a case which can, and over the years WILL, happen in a PRODUCTION environment, and we have to be prepared to handle it as best we can.

But they are. No matter how far down you push it, the data is still going to be waiting in memory at some point. You didn't just ask for "improved", you demanded "guaranteed". Your requirements are simply impossible.

If you want transaction safety, use a database of some sort. Of course, even a database isn't magical -- it can't store data that hasn't been written yet.

Unfortunately, it's true. If the machine screws up, nothing can be guaranteed, no matter how Rube Goldbergian you make your data path.

Right. That is what things like backups, failovers, RAID arrays, redundant power supplies, uninterruptible power supplies, generators, software sandboxing, extensive testing and detailed testing procedures are for. These will be much more useful than demanding magic software that can successfully save irreplaceable data to a hard drive that, perhaps, has no power.