Condition variables

Hi,

I am reading through the pthreads tutorial and had a question on the example they have given for condition variables. Here is the code snippet:

This is what the thread waiting on the condition variable is doing:

    
    pthread_mutex_lock(&count_mutex);
    while (count<COUNT_LIMIT) { /* <--- why is this a 'while' and not an 'if'? */
        pthread_cond_wait(&count_threshold_cv, &count_mutex);
        count += 125;
    }
    pthread_mutex_unlock(&count_mutex);

Why is the check for count in the while loop and not an if ?

From what I see, the only advantage of it being in the while loop is if by any chance the signaling thread incorrectly signaled the waiting thread (i.e. it sent a signal even though the value of count was still < COUNT_LIMIT). But in that case, I would expect the code to be something like this:

    
    pthread_mutex_lock(&count_mutex);
    while (count<COUNT_LIMIT) {
        pthread_cond_wait(&count_threshold_cv, &count_mutex);
    }
    count += 125; /* <--- I moved this outside of 'while' */
    pthread_mutex_unlock(&count_mutex);

I'm probably missing something.

You can get the complete code here:
https://computing.llnl.gov/tutorials/pthreads/#ConVarSignal

I think you're right. That code shouldn't be in the while loop.

Look up "spurious wakeups".

The function could return for some reason other than the thread signaling the condition variable. In practice some implementations have an unrestartable pthread_cond_wait() function which could be aborted by a signal, for example. Other failure modes are possible and happen (rarely) on other systems.

In addition, I believe one reason for introducing this concept was to enforce careful programming: don't just assume the condition is implicitly true, but always make sure it really is.

The use of a loop for pthread_cond_wait() and pthread_cond_timedwait() is regarded as good programming practice to handle spurious signals since these APIs are generally implemented using signals on Unix and GNU/Linux systems.

For example, Dave Butenhof's book gives the following example on page 79

    while (data.value == 0) {
         status = pthread_cond_timedwait ( &data.cond, &data.mutex, &timeout);
         if (status == ETIMEDOUT ) {
              printf ("Condition wait timed out.\n");
              break;
         } else if (status != 0)
              err_abort(status, "Wait on condition");
    }
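To make the timed-wait loop above concrete, here is a minimal sketch of how it might be wrapped in a function. The names `shared_data`, `wait_for_value` and `set_value` are made up for illustration and are not from the book; only the `data.mutex`, `data.cond` and `data.value` fields mirror the quoted loop. Note that the comparison is `==` and the error constant is `ETIMEDOUT`, and that the timeout is an absolute time, so the same deadline can be reused if the wait returns early.

```c
#define _POSIX_C_SOURCE 200809L /* for clock_gettime() */
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical shared state; field names follow the quoted loop. */
struct shared_data {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int             value;
};

static struct shared_data data = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0
};

/* Wait until data.value is nonzero or timeout_ms milliseconds elapse.
 * Returns 0 if the predicate is true, ETIMEDOUT on timeout, or
 * another error number reported by the pthread calls. */
int wait_for_value(long timeout_ms)
{
    struct timespec deadline;
    int status;

    /* pthread_cond_timedwait() expects an absolute time; a default
     * condition variable measures it against CLOCK_REALTIME. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec  += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    status = pthread_mutex_lock(&data.mutex);
    if (status != 0)
        return status;

    /* Re-check the predicate after every return from the wait. An early
     * (spurious) return just loops again with the same deadline. */
    while (data.value == 0 && status == 0)
        status = pthread_cond_timedwait(&data.cond, &data.mutex, &deadline);

    /* The predicate, not the return code, decides success. */
    if (data.value != 0)
        status = 0;

    pthread_mutex_unlock(&data.mutex);
    return status;
}

/* Publish a value and wake one waiter. */
void set_value(int v)
{
    pthread_mutex_lock(&data.mutex);
    data.value = v;
    pthread_cond_signal(&data.cond);
    pthread_mutex_unlock(&data.mutex);
}
```

Calling `wait_for_value(50)` with nobody setting the value should time out; after `set_value(7)` the same call should return 0 without blocking.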

It could, but it isn't checking for that! It's just adding to its own variable whenever woken no matter what. It looks wrong.

It makes sense if you look at the entire example code.
See https://computing.llnl.gov/tutorials/pthreads/#ConditionVariables

I should probably devote some time to writing a nice article about it on my blog. But, as you surely know, time is a scarce resource... so I will merely copy-paste the most important points of a lecture I gave on the matter:

It is a good idea to enclose the condition wait with the equivalent of a while loop that checks the predicate. This has the following advantages:

  • allows the use of a "loose predicate". A rule of thumb says: it's a lot easier to have a "loose predicate" ("it may be") for the condition variable than to use a tight predicate ("it is") straight away.
  • robust against intercepted wakeups: after returning from pthread_cond_wait(), the predicate may still be false, because another thread has in the meantime already processed and reset the condition.
  • robust against spurious wakeups: the thread returns from pthread_cond_wait() even if the condvar has not been broadcast/signaled.
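The "intercepted wakeup" case in the list above can be sketched as follows. This is only an illustration under assumed names (`items`, `done`, `consumer`, `run_two_consumers_one_item` are all hypothetical): two consumers may be woken by one broadcast, but only one item exists, so the loser must find the predicate false again and go back to waiting.

```c
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cv   = PTHREAD_COND_INITIALIZER;
static int items    = 0;  /* units of work currently available       */
static int done     = 0;  /* set once no more work will ever arrive  */
static int consumed = 0;  /* total units taken by all consumers      */

static void *consumer(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    for (;;) {
        /* Loop, not 'if': the other consumer may have taken the item
         * between the broadcast and this thread reacquiring the mutex. */
        while (items == 0 && !done)
            pthread_cond_wait(&cv, &lock);
        if (items == 0)      /* done, and nothing left to take */
            break;
        items--;
        consumed++;
    }
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Start two consumers, publish exactly one item, then shut down.
 * Returns how many items were consumed in total. */
int run_two_consumers_one_item(void)
{
    pthread_t t[2];

    items = 0;
    done = 0;
    consumed = 0;
    for (int i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, consumer, NULL);

    pthread_mutex_lock(&lock);
    items = 1;                     /* one item, possibly two wakeups */
    pthread_cond_broadcast(&cv);
    pthread_mutex_unlock(&lock);

    pthread_mutex_lock(&lock);
    done = 1;                      /* let both consumers exit */
    pthread_cond_broadcast(&cv);
    pthread_mutex_unlock(&lock);

    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    return consumed;
}
```

With the `while`, exactly one item is consumed however the threads are scheduled. With an `if` in its place, the second consumer could proceed on an empty queue.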

POSIX states:

Spurious wakeups may sound strange; here is the whole story, as told by David Butenhof, about the real origin of the "spurious wakeup" in the standard:

I don't get it. How can you possibly use a cond variable if you have no way of knowing if what you did ever worked?

AFAICS, at the time where the watch_count() function in thread T0 is executed, count might already be > COUNT_LIMIT. Right?
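That scenario can be tried out with a small sketch. Everything here is an assumption made for illustration (the value of `COUNT_LIMIT`, the `waits` counter, and the `run_late_watcher` helper are not from the tutorial): the incrementing thread is run to completion first, so its signal is sent while nobody is waiting, and only then is the watcher started.

```c
#include <pthread.h>

#define COUNT_LIMIT 12  /* assumed value, for illustration only */

static pthread_mutex_t count_mutex        = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  count_threshold_cv = PTHREAD_COND_INITIALIZER;
static int count = 0;
static int waits = 0;   /* how often the watcher actually blocked */

static void *inc_count(void *arg)
{
    (void)arg;
    for (int i = 0; i < COUNT_LIMIT; i++) {
        pthread_mutex_lock(&count_mutex);
        count++;
        if (count == COUNT_LIMIT)
            pthread_cond_signal(&count_threshold_cv);
        pthread_mutex_unlock(&count_mutex);
    }
    return NULL;
}

static void *watch_count(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&count_mutex);
    /* The predicate is tested BEFORE the first wait. If the limit was
     * already reached, the thread never blocks, so a signal sent
     * earlier (and lost) does not matter. */
    while (count < COUNT_LIMIT) {
        waits++;
        pthread_cond_wait(&count_threshold_cv, &count_mutex);
    }
    pthread_mutex_unlock(&count_mutex);
    return NULL;
}

/* Run the incrementer to completion, THEN start the watcher.
 * Returns the number of times the watcher had to block. */
int run_late_watcher(void)
{
    pthread_t inc, watcher;

    count = 0;
    waits = 0;
    pthread_create(&inc, NULL, inc_count, NULL);
    pthread_join(inc, NULL);   /* the signal went out with no waiter */

    pthread_create(&watcher, NULL, watch_count, NULL);
    pthread_join(watcher, NULL);
    return waits;
}
```

Here the watcher returns without ever waiting. Had it called pthread_cond_wait() unconditionally, it would block forever, because a condition variable does not remember signals sent before the wait began.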

Signal-based Pthreads implementations work; you just have to always handle the spurious signal case.

As an aside, Pthreads implementations on Microsoft Windows suffer less from this particular issue because they are based on the Windows eventing model and have the WaitForSingleObject and WaitForMultipleObjects APIs.

Hi the_learner,

You are right: the construct is strange *if* you want to protect against spurious wakeups. Indeed, after waking up (falsely), it would add 125 to count and consequently leave the while() loop. In the spurious case, you usually want to call pthread_cond_wait() again.
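A minimal sketch of that "wait again" variant, in case it helps. The thread count, `TCOUNT`, `COUNT_LIMIT` and the `run_counter_demo` helper are assumptions chosen for this illustration, not a copy of the tutorial's program: the loop only waits, and the update happens once, after the predicate is known to hold.

```c
#include <pthread.h>

#define NUM_INCREMENTERS 2   /* assumed for this sketch */
#define TCOUNT           10  /* increments per thread, assumed */
#define COUNT_LIMIT      12  /* assumed */

static pthread_mutex_t count_mutex        = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  count_threshold_cv = PTHREAD_COND_INITIALIZER;
static int count = 0;

static void *inc_count(void *arg)
{
    (void)arg;
    for (int i = 0; i < TCOUNT; i++) {
        pthread_mutex_lock(&count_mutex);
        count++;
        if (count == COUNT_LIMIT)
            pthread_cond_signal(&count_threshold_cv);
        pthread_mutex_unlock(&count_mutex);
    }
    return NULL;
}

static void *watch_count(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&count_mutex);
    /* A spurious wakeup finds count still below the limit and simply
     * waits again; nothing is modified inside the loop. */
    while (count < COUNT_LIMIT)
        pthread_cond_wait(&count_threshold_cv, &count_mutex);
    count += 125;   /* runs exactly once, with the predicate true */
    pthread_mutex_unlock(&count_mutex);
    return NULL;
}

/* Run one watcher and the incrementers; return the final count. */
int run_counter_demo(void)
{
    pthread_t watcher, inc[NUM_INCREMENTERS];

    count = 0;
    pthread_create(&watcher, NULL, watch_count, NULL);
    for (int i = 0; i < NUM_INCREMENTERS; i++)
        pthread_create(&inc[i], NULL, inc_count, NULL);

    pthread_join(watcher, NULL);
    for (int i = 0; i < NUM_INCREMENTERS; i++)
        pthread_join(inc[i], NULL);
    return count;
}
```

With these assumed constants the final value is 2 * 10 + 125 = 145 regardless of scheduling, since the 125 is added exactly once and only after the limit has been reached.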

From the code and comments, we have no way of knowing what the author's intent was... Perhaps, for teaching purposes, he wanted to print the value after waking from the condition wait, increment it to show that only this thread is active (because it holds the mutex), and then terminate the program regardless of the count value. If so, this should be clearly indicated. From a pedagogical standpoint, this code isn't good, IMHO.

Hope you got enough information to answer your initial question. If you're lost, we can write you a short summary :)