Signal-safe data structures

Hello,

I have a signal handler which manipulates a data structure. The data structure's operations aren't atomic. So if two threads/processes are in a critical section at the same time the data structure will be broken.

With threads you can avoid this with semaphores and the like.
However, signal handlers work a bit differently from threads. If a signal handler is manipulating the data structure and the handler is invoked again in the meantime, the following happens:

The second invocation cannot access the data structure because the first one holds the lock, and the first one cannot continue until the second returns. The result is a deadlock.

How can I avoid it but still manipulate my data structure? I cannot lose signals either.

The general way to deal with this is: don't. In your signal handler, do as little work as possible - i.e. set/increment a flag to let your program know the signal has been caught, and do the work from the main body of the code.
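To make that concrete, here is a minimal sketch of the flag approach. It assumes a single-threaded program and SIGUSR1 as the signal of interest; the names install_handler and check_and_clear are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* volatile sig_atomic_t is the object type the C standard says a
   handler may safely write. */
static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
        (void)signo;
        got_signal = 1;         /* record the event, nothing else */
}

/* Hypothetical helper: route SIGUSR1 to the handler above. */
int install_handler(void)
{
        struct sigaction sa;

        memset(&sa, 0, sizeof sa);
        sa.sa_handler = on_signal;
        sigemptyset(&sa.sa_mask);
        return sigaction(SIGUSR1, &sa, NULL);
}

/* Hypothetical helper, called from the main loop.  Returns 1 if a
   signal was seen since the last call, 0 otherwise. */
int check_and_clear(void)
{
        if (!got_signal)
                return 0;
        got_signal = 0;
        /* ... do the real work on the data structure here ... */
        return 1;
}
```

Note that a flag only records that at least one signal arrived since the last check. Several signals in a row collapse into one, so on its own it is not enough if every signal has to be accounted for.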

In the case of incrementing a flag, you might want to have a look at __sync_fetch_and_add() and such in the gcc documentation - do a search for "gcc atomic builtins".
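A rough sketch of the counting variant, assuming gcc (or a compatible compiler) with the __sync builtins and a single-threaded program. The helper names install_counter and take_pending are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile unsigned int sig_count = 0;  /* bumped by the handler */
static unsigned int handled = 0;             /* touched by main only */

static void on_signal(int signo)
{
        (void)signo;
        __sync_fetch_and_add(&sig_count, 1);  /* atomic increment */
}

/* Hypothetical helper: install the counting handler for one signal. */
int install_counter(int signo)
{
        struct sigaction sa;

        memset(&sa, 0, sizeof sa);
        sa.sa_handler = on_signal;
        sigemptyset(&sa.sa_mask);
        return sigaction(signo, &sa, NULL);
}

/* Hypothetical helper, main program only: number of signals counted
   since the previous call. */
unsigned int take_pending(void)
{
        /* Adding 0 is just an atomic read of the counter. */
        unsigned int seen = __sync_fetch_and_add(&sig_count, 0);
        unsigned int n = seen - handled;  /* unsigned, so wrap-safe */

        handled = seen;
        return n;
}
```

The main loop would call take_pending() and do the work that many times. Keep in mind that the kernel does not queue ordinary (non-realtime) signals: if the same signal is raised several times while it is blocked, the handler may run only once, so the count can come out lower than the number of signals sent.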

Ok, but I need to save an integer value retrieved by the signal handler. Can I do that?
(If multiple signals come in I have to save multiple integer values)

Working on it. It's harder than it sounds to guarantee order AND guarantee safety.


How about this:

#define SIGBUF_SIZE 1024

struct
{
        unsigned int rdpos, wrpos;
        int buffer[SIGBUF_SIZE];
} sigbuf={ 0, 0 };

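// Call from the signal handler.  Returns 0 on success, -1 if the buffer is full.
int put_value(int val)
{
        // Reserve a slot atomically.  Adding 0 is just an atomic read.
        unsigned int wrpos=__sync_fetch_and_add(&sigbuf.wrpos, 1);
        unsigned int rdpos=__sync_fetch_and_add(&sigbuf.rdpos, 0);
        // Compare this way, to prevent integer wraparound problems!
        unsigned int off=(wrpos-rdpos);

        if(off >= SIGBUF_SIZE)
        {
                // Buffer full: give back the slot we just reserved.
                __sync_fetch_and_sub(&sigbuf.wrpos, 1);
                return(-1);
        }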

        sigbuf.buffer[wrpos % SIGBUF_SIZE]=val;
        return(0);
}

// Call from the main program only.  Returns the oldest value, or -1 if empty.
int get_value(void)
{
        unsigned int wrpos=__sync_fetch_and_add(&sigbuf.wrpos, 0);
        // Do NOT add here!  If a signal happens, it may see a 'free'
        // element even though there's not.  Only add once we're sure
        // there's anything to read.
        unsigned int rdpos=__sync_fetch_and_add(&sigbuf.rdpos, 0);
        unsigned int off=(wrpos-rdpos);

        if(off == 0)    // Empty buffer
                return(-1);

        return(sigbuf.buffer[__sync_fetch_and_add(&sigbuf.rdpos, 1)%SIGBUF_SIZE]);
}

Basically, rdpos and wrpos get incremented without limit, and % SIGBUF_SIZE is used to wrap their values to inside the array. We do 'unsigned int off=(wrpos-rdpos)' instead of just comparing rdpos and wrpos because, if it hits integer wraparound, rdpos might be near UINT_MAX while wrpos is practically zero. Unsigned subtraction wraps the same way, so the difference is still the number of queued elements.
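Here is a small standalone illustration of that arithmetic. The function names queued and slot are made up for this example and are not part of the buffer code.

```c
#include <limits.h>

#define SIGBUF_SIZE 1024

/* Number of elements currently queued, given two free-running
   positions.  Unsigned subtraction wraps modulo UINT_MAX + 1, so the
   result stays correct after wrpos has overflowed past zero. */
unsigned int queued(unsigned int wrpos, unsigned int rdpos)
{
        return wrpos - rdpos;
}

/* Array slot that a free-running position maps to. */
unsigned int slot(unsigned int pos)
{
        return pos % SIGBUF_SIZE;
}
```

For example, queued(2, UINT_MAX - 1) is 4, even though wrpos is numerically far below rdpos. One thing to watch: the slot numbers only stay contiguous across the overflow when SIGBUF_SIZE divides UINT_MAX + 1, which in practice means a power of two. 1024 qualifies: slot(UINT_MAX) is 1023 and the next position, slot(0), is 0.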

You shouldn't call get_value in a signal handler, but put_value ought to be safe.

As others said, you should be very wary of doing work in a signal handler. There is a short list of async-signal safe functions. If you're calling anything outside that list, there is already potential for breakage even disregarding that caused by potential data structure corruption.

I would do as recommended: restructure your code to capture the signal and relay it out of the signal handler, to be handled by the main thread of execution. For example, the handler can write a byte to a pipe to signal that something has been caught; write() is on the async-signal-safe list, so you can call it from a handler.
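A minimal sketch of that pipe idea (often called the self-pipe trick), assuming POSIX and a single-threaded program. The names sig_pipe, setup and drain are hypothetical, and the handler sends the signal number as its one byte purely for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static int sig_pipe[2] = { -1, -1 };    /* [0] read end, [1] write end */

static void on_signal(int signo)
{
        int saved = errno;              /* write() may change errno */
        unsigned char b = (unsigned char)signo;
        ssize_t r = write(sig_pipe[1], &b, 1);

        (void)r;        /* if the pipe is full, the byte is dropped */
        errno = saved;
}

/* Hypothetical helper: create the pipe, make both ends non-blocking
   and install the handler for signo.  Returns 0 or -1. */
int setup(int signo)
{
        struct sigaction sa;
        int i, fl;

        if (pipe(sig_pipe) == -1)
                return -1;
        for (i = 0; i < 2; i++) {
                fl = fcntl(sig_pipe[i], F_GETFL);
                if (fl == -1 ||
                    fcntl(sig_pipe[i], F_SETFL, fl | O_NONBLOCK) == -1)
                        return -1;
        }
        memset(&sa, 0, sizeof sa);
        sa.sa_handler = on_signal;
        sa.sa_flags = SA_RESTART;
        sigemptyset(&sa.sa_mask);
        return sigaction(signo, &sa, NULL);
}

/* Hypothetical helper, main loop side: consume every byte waiting in
   the pipe and return how many there were. */
int drain(void)
{
        unsigned char buf[64];
        int total = 0;
        ssize_t n;

        for (;;) {
                n = read(sig_pipe[0], buf, sizeof buf);
                if (n > 0)
                        total += (int)n;
                else if (n == -1 && errno == EINTR)
                        continue;       /* interrupted, try again */
                else
                        break;          /* EAGAIN (empty) or error */
        }
        return total;
}
```

In a real program the read end would normally go into the select() or poll() set of the main loop, so the loop wakes up as soon as a handler has written something.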

Even pthread_mutex_lock is not async-signal-safe. Outside of rolling your own async-signal-safe locks using atomic variables, I can't suggest anything else. Not to mention that even if you properly protect the data structure, you're still limited in what you can do to it by the small number of safe functions you can call.

Locking the list is not an option.

If the list is locked and another signal comes in while the list is locked, the 2nd signal handler will deadlock: the 1st signal handler cannot continue until the 2nd one finishes, hence the list is never unlocked.

So writing my caught signal info into a pipe works? How much data can a pipe hold? And how do I know how much I have to read from the pipe when n signal handlers wrote something into it?

/edit:
I just realized that write() doesn't work either, because it can be interrupted before writing anything, in which case it fails with an error (EINTR).
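For what it's worth, both pipe questions have a fairly standard answer: retry the write when it fails with EINTR, and send fixed-size records so the reader always knows how many bytes make up one item. POSIX guarantees that a pipe write of at most PIPE_BUF bytes (at least 512) is atomic, so records from nested handlers do not interleave. The total capacity is implementation-defined (64 KiB by default on current Linux). A sketch, where the record layout and the names send_record and recv_record are assumptions:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <limits.h>
#include <unistd.h>

/* Hypothetical record: whatever the handler needs to pass on. */
struct sigrec {
        int signo;
        int value;
};

/* Compile-time check that one record fits in one atomic pipe write. */
typedef char sigrec_fits_in_pipe[
        (sizeof(struct sigrec) <= _POSIX_PIPE_BUF) ? 1 : -1];

/* Handler side: calls nothing but write().  Returns 0 or -1. */
int send_record(int fd, const struct sigrec *r)
{
        int saved = errno;
        ssize_t n;

        do {
                n = write(fd, r, sizeof *r);
        } while (n == -1 && errno == EINTR);    /* interrupted: retry */
        errno = saved;
        return n == (ssize_t)sizeof *r ? 0 : -1;
}

/* Main loop side.  Returns 1 if a record was read, 0 if a
   non-blocking pipe is currently empty, -1 on error or EOF. */
int recv_record(int fd, struct sigrec *r)
{
        ssize_t n;

        do {
                n = read(fd, r, sizeof *r);
        } while (n == -1 && errno == EINTR);
        if (n == (ssize_t)sizeof *r)
                return 1;
        if (n == -1 && errno == EAGAIN)
                return 0;
        return -1;
}
```

If the write end is non-blocking and the pipe is full, send_record returns -1 and that record is lost, so the main loop has to drain the pipe often enough for the signal rate you expect.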

Is the code I wrote at your request, weeks ago, somehow insufficient or broken?

I now realized that I need to save two values per array position instead of one. Now my question is how to do this without breaking it.

Fill the array with structs holding two elements (integers)? But then I need to return a struct with get_value. Is that OK?

That's a trivial change.

#define SIGBUF_SIZE 1024

typedef struct sigdata
{
        int a;
        int b;
} sigdata;

struct
{
        unsigned int rdpos, wrpos;
        sigdata buffer[SIGBUF_SIZE];
} sigbuf={ 0, 0 };

int put_value(sigdata val)
{
        unsigned int wrpos=__sync_fetch_and_add(&sigbuf.wrpos, 1);
        unsigned int rdpos=__sync_fetch_and_add(&sigbuf.rdpos, 0);
        // Compare this way, to prevent integer wraparound problems!
        unsigned int off=(wrpos-rdpos);

        if(off >= SIGBUF_SIZE)
        {
                __sync_fetch_and_sub(&sigbuf.wrpos, 1);
                return(-1);
        }

        sigbuf.buffer[wrpos % SIGBUF_SIZE]=val;
        return(0);
}

// use like:
//         sigdata d;
//         get_value(&d);
int get_value(sigdata *val)
{
        unsigned int wrpos=__sync_fetch_and_add(&sigbuf.wrpos, 0);
        // Do NOT add here!  If a signal happens, it may see a 'free'
        // element even though there's not.  Only add once we're sure
        // there's anything to read.
        unsigned int rdpos=__sync_fetch_and_add(&sigbuf.rdpos, 0);
        unsigned int off=(wrpos-rdpos);

        if(off == 0)    // Empty buffer
                return(-1);

        (*val)=sigbuf.buffer[__sync_fetch_and_add(&sigbuf.rdpos, 1)%SIGBUF_SIZE];
        return(0);
}

I use a pointer instead of returning a struct, because that allows you to tell when the buffer was empty -- 0 means it got a value, -1 means nothing was found.

Ok, this doesn't work either, because those functions are not supported on the target machine.

That might have been good to know weeks ago.

What is the target machine?
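Whatever the target turns out to be, one fallback that needs no atomic builtins at all is to let the signal mask do the locking: block the signals in the main program while it reads the queue, and use sa_mask so the handlers cannot interrupt each other. A sketch, assuming a POSIX system and a single-threaded program; all names here (q_install, q_put, q_get, QSIZE) are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

#define QSIZE 64

static volatile int queue[QSIZE];
static volatile unsigned int q_rd = 0, q_wr = 0;
static sigset_t q_signals;      /* every signal that feeds the queue */

/* Handler side.  Not reentrant by itself; safe only because sa_mask
   (see q_install) keeps all queue signals blocked while it runs. */
static int q_put(int val)
{
        if (q_wr - q_rd >= QSIZE)
                return -1;              /* full */
        queue[q_wr % QSIZE] = val;
        q_wr++;
        return 0;
}

static void on_signal(int signo)
{
        (void)q_put(signo);
}

/* Hypothetical helper: install the handler for n signals. */
int q_install(const int *signos, int n)
{
        struct sigaction sa;
        int i;

        sigemptyset(&q_signals);
        for (i = 0; i < n; i++)
                sigaddset(&q_signals, signos[i]);
        memset(&sa, 0, sizeof sa);
        sa.sa_handler = on_signal;
        sa.sa_mask = q_signals;         /* no nested handlers */
        for (i = 0; i < n; i++)
                if (sigaction(signos[i], &sa, NULL) == -1)
                        return -1;
        return 0;
}

/* Main program side: the signals are blocked for the few
   instructions that touch the queue.  Returns 0, or -1 if empty. */
int q_get(int *val)
{
        sigset_t old;
        int rc = -1;

        sigprocmask(SIG_BLOCK, &q_signals, &old);
        if (q_wr != q_rd) {
                *val = queue[q_rd % QSIZE];
                q_rd++;
                rc = 0;
        }
        sigprocmask(SIG_SETMASK, &old, NULL);
        return rc;
}
```

The catch is the usual one: ordinary signals are not queued while they are blocked, so two instances of the same signal arriving during that window may be delivered as one. If no signal may ever be lost, realtime signals (SIGRTMIN and up), which are queued, would be the thing to look at, assuming the target supports them.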