Condition variables

the_learner · December 3, 2010, 2:42pm

Hi,

I am reading through the pthreads tutorial and had a question on the example they have given for condition variables. Here is the code snippet:

This is what the thread waiting on the condition variable is doing:

    
pthread_mutex_lock(&count_mutex);
while (count<COUNT_LIMIT) { <--- why is this a 'while' and not an 'if' ?
    pthread_cond_wait(&count_threshold_cv, &count_mutex);
    count += 125;
 }
pthread_mutex_unlock(&count_mutex);

Why is the check for count in a while loop and not an if?

From what I see, the only advantage of it being in the while loop is if by any chance the signaling thread incorrectly signaled the waiting thread (i.e. it sent a signal even though the value of count was still < COUNT_LIMIT). But in that case, I would expect the code to be something like this:

    
pthread_mutex_lock(&count_mutex);
while (count<COUNT_LIMIT) { 
    pthread_cond_wait(&count_threshold_cv, &count_mutex);
 }
count += 125; <--- I moved this outside of 'while'
pthread_mutex_unlock(&count_mutex);

I'm probably missing something.

You can get the complete code here:
https://computing.llnl.gov/tutorials/pthreads/#ConVarSignal

Corona688 · December 3, 2010, 2:54pm

I think you're right. That code shouldn't be in the while loop.

Driver · December 6, 2010, 8:24am

Look up "spurious wakeups".

The function could return for some reason other than the thread signaling the condition variable. In practice some implementations have an unrestartable pthread_cond_wait() function which could be aborted by a signal, for example. Other failure modes are possible and happen (rarely) on other systems.
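Here is a minimal sketch of what the loop buys you, assuming a made-up boolean predicate `ready` (none of these names come from the tutorial). The helper thread first signals without touching the predicate, which from the waiter's side looks just like a spurious wakeup. The `while` simply sends the waiter back to sleep until `ready` really is set.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for illustration only. */
static pthread_mutex_t ready_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  ready_cv   = PTHREAD_COND_INITIALIZER;
static bool ready = false;              /* the predicate, guarded by ready_lock */

static void *waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&ready_lock);
    while (!ready)                      /* re-check after every return */
        pthread_cond_wait(&ready_cv, &ready_lock);
    assert(ready);                      /* holds no matter why we woke up */
    pthread_mutex_unlock(&ready_lock);
    return NULL;
}

static void *noisy_signaller(void *arg)
{
    (void)arg;
    /* Signal without changing the predicate: if the waiter is already
       blocked, this wakes it for no good reason. */
    pthread_mutex_lock(&ready_lock);
    pthread_cond_signal(&ready_cv);
    pthread_mutex_unlock(&ready_lock);

    /* Now make the predicate true and signal for real. */
    pthread_mutex_lock(&ready_lock);
    ready = true;
    pthread_cond_signal(&ready_cv);
    pthread_mutex_unlock(&ready_lock);
    return NULL;
}

/* Returns 0 once both threads have finished, -1 if a thread could not start. */
int run_predicate_loop_demo(void)
{
    pthread_t w, s;
    if (pthread_create(&w, NULL, waiter, NULL) != 0)
        return -1;
    if (pthread_create(&s, NULL, noisy_signaller, NULL) != 0)
        return -1;
    pthread_join(w, NULL);
    pthread_join(s, NULL);
    return 0;
}
```

With an `if` instead of the `while`, the waiter could fall through after the first (empty) signal and the assertion could fire.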

In addition, I believe one reason for introducing this concept was to enforce careful programming: don't just assume the condition is implicitly true, but always make sure it really is.

fpmurphy · December 6, 2010, 9:20am

The use of a loop for pthread_cond_wait() and pthread_cond_timedwait() is regarded as good programming practice to handle spurious signals since these APIs are generally implemented using signals on Unix and GNU/Linux systems.

For example, Dave Butenhof's book gives the following example on page 79:

while (data.value == 0) {
     /* timeout is an absolute time, so waiting again does not extend it */
     status = pthread_cond_timedwait (&data.cond, &data.mutex, &timeout);
     if (status == ETIMEDOUT) {
          printf ("Condition wait timed out.\n");
          break;
     } else if (status != 0)
          err_abort (status, "Wait on condition");   /* any other error is fatal */
}
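For anyone who wants to try the timed variant, here is a self-contained sketch along the same lines. It is not the book's code: `struct guarded_value`, `guarded_value_init()` and `wait_for_value()` are names I made up, and the millisecond timeout is an assumption. The point it illustrates is that the deadline is computed once as an absolute time, so a spurious wakeup just goes around the loop again without lengthening the total wait.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical container for a value plus its mutex and condvar. */
struct guarded_value {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int             value;   /* predicate: value != 0 */
};

/* Returns 0 on success or an error number. */
int guarded_value_init(struct guarded_value *d)
{
    int status = pthread_mutex_init(&d->mutex, NULL);
    if (status != 0)
        return status;
    d->value = 0;
    return pthread_cond_init(&d->cond, NULL);
}

/* Waits until d->value is nonzero or timeout_ms has elapsed.
   Returns 0 if the value was set, ETIMEDOUT on timeout,
   or another error number from pthread_cond_timedwait(). */
int wait_for_value(struct guarded_value *d, long timeout_ms)
{
    struct timespec deadline;
    int status = 0;

    /* Absolute deadline on CLOCK_REALTIME, the default condvar clock. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec  += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&d->mutex);
    while (d->value == 0 && status == 0)
        status = pthread_cond_timedwait(&d->cond, &d->mutex, &deadline);
    if (d->value != 0)
        status = 0;          /* a true predicate wins over a late timeout */
    pthread_mutex_unlock(&d->mutex);
    return status;
}
```

If nobody ever sets `value`, `wait_for_value(&d, 50)` should come back with ETIMEDOUT after roughly 50 ms; if `value` is already nonzero it returns 0 without waiting at all.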

Corona688 · December 6, 2010, 10:25am

It could, but it isn't checking for that! It's just adding to its own variable whenever woken no matter what. It looks wrong.

fpmurphy · December 6, 2010, 2:09pm

It makes sense if you look at the entire example code.
See https://computing.llnl.gov/tutorials/pthreads/#ConditionVariables

Loic_Domaigne · December 7, 2010, 4:49pm

I should probably devote some time to writing a nice article about this on my blog. But, as you surely know, time is a scarce resource... so I will merely copy and paste the most important points of a lecture I gave on the matter:

It is a good idea to enclose the condition wait with the equivalent of a while loop that checks the predicate. This has the following advantages:

Allows the use of a "loose predicate". A rule of thumb says: it is a lot easier to start with a "loose predicate" ("it may be") for the condition variable than to use a tight predicate ("it is") straight away.
Robust against intercepted wakeups: after returning from pthread_cond_wait(), the predicate may be false again, because another thread has meanwhile processed and reset the condition.
Robust against spurious wakeups: the thread may return from pthread_cond_wait() even though the condvar has not been broadcast or signaled.
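The intercepted-wakeup case can be sketched as follows. This is only an illustration under my own assumptions: a counter `available` stands in for a real work queue, there are two consumers, and the producer uses a broadcast. Every broadcast may wake both consumers, but each item can be taken by only one of them; the other finds the predicate false again and has to go back to waiting.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define ITEMS 1000   /* hypothetical amount of work */

static pthread_mutex_t q_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  q_cv   = PTHREAD_COND_INITIALIZER;
static int available = 0;    /* items ready to be taken */
static int produced  = 0;    /* items made so far */

static void *consumer(void *arg)
{
    int *taken = arg;        /* per-thread tally, read after join */
    pthread_mutex_lock(&q_lock);
    for (;;) {
        /* Another consumer may have emptied the queue between the
           broadcast and our return from the wait, so check again. */
        while (available == 0 && produced < ITEMS)
            pthread_cond_wait(&q_cv, &q_lock);
        if (available == 0)  /* producer is done and nothing is left */
            break;
        available--;
        assert(available >= 0);
        (*taken)++;
    }
    pthread_mutex_unlock(&q_lock);
    return NULL;
}

static void *producer(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITEMS; i++) {
        pthread_mutex_lock(&q_lock);
        available++;
        produced++;
        pthread_cond_broadcast(&q_cv);   /* wakes every current waiter */
        pthread_mutex_unlock(&q_lock);
    }
    return NULL;
}

/* Returns the total number of items taken, or -1 if a thread failed to start. */
int run_intercepted_wakeup_demo(void)
{
    pthread_t c1, c2, p;
    int taken1 = 0, taken2 = 0;
    if (pthread_create(&c1, NULL, consumer, &taken1) != 0) return -1;
    if (pthread_create(&c2, NULL, consumer, &taken2) != 0) return -1;
    if (pthread_create(&p, NULL, producer, NULL) != 0) return -1;
    pthread_join(p, NULL);
    pthread_join(c1, NULL);
    pthread_join(c2, NULL);
    return taken1 + taken2;
}
```

With an `if` in place of the inner `while`, a consumer that lost the race would decrement `available` below zero and the total would no longer match ITEMS.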

POSIX states:

Spurious wakeups may sound strange. Here is the whole story, as told by David Butenhof, about the real origin of the "spurious wakeup" in the standard:

Dave Butenhof:

POSIX threads were the result of a lot of tension between pragmatic hard real-time programmers and largely academic researchers. The intent was to force correct/robust code by requiring predicate loops. This was driven by the provably correct academic contingent among the "core threadies" in the working group, though I don't think anyone really disagreed with the intent once they understood what it meant.

We followed that intent with several levels of justification. The first was that "religiously" using a loop protects the application against its own imperfect coding practices. The second was that it wasn't difficult to abstractly imagine machines and implementation code that could exploit [spurious wakeup] to improve the performance of average condition wait operations through optimizing the synchronization mechanisms.

Actually, no member of the working group ever proved that such an implementation exists. Spurious wakeups are the mechanism of an academic computer scientist clique to make sure that everyone had to write clean code that checked and verified predicates!

But the (perhaps) largely spurious (or at least arcanely philosophical) 'efficiency' argument went over better with the real-time people, and the real reason was usually relegated to second place in the rationale.

I've thought many times about how you might construct a correct and practical implementation that would really have spurious wakeups. I've never managed to construct an example. Doesn't mean there isn't one, though, and it makes a good story.

Corona688 · December 8, 2010, 10:35am

I don't get it. How can you possibly use a cond variable if you have no way of knowing if what you did ever worked?

Loic_Domaigne · December 8, 2010, 12:39pm

AFAICS, by the time the watch_count() function in thread T0 is executed, count might already be >= COUNT_LIMIT. Right?
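If so, the predicate check in front of the wait matters as much as the re-check after it: a signal sent while nobody is waiting is simply lost. A minimal sketch of that situation, with made-up names (`LIMIT` stands in for COUNT_LIMIT and its value is an assumption): the counter reaches the limit and the signal is sent before the watcher thread even exists.

```c
#include <pthread.h>
#include <stddef.h>

#define LIMIT 12   /* hypothetical threshold, standing in for COUNT_LIMIT */

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  counter_cv   = PTHREAD_COND_INITIALIZER;
static int counter = 0;
static int waits   = 0;   /* how many times the watcher actually blocked */

static void *late_watcher(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&counter_lock);
    while (counter < LIMIT) {     /* checked BEFORE the first wait, too */
        waits++;
        pthread_cond_wait(&counter_cv, &counter_lock);
    }
    pthread_mutex_unlock(&counter_lock);
    return NULL;
}

/* Returns the number of waits the watcher performed, or -1 on failure. */
int run_late_watcher_demo(void)
{
    pthread_t t;

    /* The incrementing side finishes first. Nobody is waiting yet,
       so this signal has no effect. */
    pthread_mutex_lock(&counter_lock);
    counter = LIMIT;
    pthread_cond_signal(&counter_cv);
    pthread_mutex_unlock(&counter_lock);

    if (pthread_create(&t, NULL, late_watcher, NULL) != 0)
        return -1;
    pthread_join(t, NULL);
    return waits;
}
```

Because the predicate is already true, the watcher never calls pthread_cond_wait() and the function returns 0. A watcher that waited unconditionally would block forever here, since the only signal has already gone by.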

fpmurphy · December 8, 2010, 2:46pm

Signal-based Pthreads implementations work; you just always have to handle the spurious signal case.

As an aside, Pthreads implementations on Microsoft Windows suffer less from this particular issue because they are based on the Windows eventing model and use the WaitForSingleObject and WaitForMultipleObjects APIs.

Loic_Domaigne · December 9, 2010, 12:25am

Hi the_learner,

From what I see, the only advantage of it being in the while loop is if by any chance the signaling thread incorrectly signaled the waiting thread (i.e. it sent a signal even though the value of count was still < COUNT_LIMIT). But in that case, I would expect the code to be something like this:
   
pthread_mutex_lock(&count_mutex);
while (count<COUNT_LIMIT) { 
   pthread_cond_wait(&count_threshold_cv, &count_mutex);
 }
count += 125; <--- I moved this outside of 'while'
pthread_mutex_unlock(&count_mutex);
I'm probably missing something.

You are right: the construct is strange *if* you want to protect against spurious wakeups. Indeed, after waking up (falsely), it would add 125 to count and consequently leave the while() loop. In the spurious case, you usually want to call pthread_cond_wait() again.
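To make that concrete, here is a small sketch that keeps the loop body as posted and provokes a false wakeup on purpose. Everything around the loop is my own scaffolding: the value 12 for COUNT_LIMIT is an assumption, and the `waiting` flag exists only so that the signal is sent once the watcher has reached its wait.

```c
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

#define COUNT_LIMIT 12   /* assumed value, for illustration only */

static pthread_mutex_t count_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  count_threshold_cv = PTHREAD_COND_INITIALIZER;
static int count   = 0;
static int waiting = 0;  /* hypothetical flag: watcher has reached its wait */

static void *watch_count_as_posted(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&count_mutex);
    while (count < COUNT_LIMIT) {
        waiting = 1;
        pthread_cond_wait(&count_threshold_cv, &count_mutex);
        count += 125;    /* runs after ANY wakeup, justified or not */
    }
    pthread_mutex_unlock(&count_mutex);
    return NULL;
}

/* Returns the final value of count, or -1 if the thread failed to start. */
int run_false_wakeup_demo(void)
{
    pthread_t t;
    int seen_waiting = 0;

    if (pthread_create(&t, NULL, watch_count_as_posted, NULL) != 0)
        return -1;

    /* Signal once the watcher is blocked, WITHOUT touching count. */
    while (!seen_waiting) {
        pthread_mutex_lock(&count_mutex);
        seen_waiting = waiting;
        if (seen_waiting)
            pthread_cond_signal(&count_threshold_cv);   /* count is still 0 */
        pthread_mutex_unlock(&count_mutex);
        if (!seen_waiting)
            sched_yield();
    }
    pthread_join(t, NULL);
    return count;
}
```

The watcher ends up with count at 125 and leaves the loop although no other thread ever incremented count. With the increment moved after the loop, the same false wakeup would just put the watcher back into pthread_cond_wait().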

From the code and comments, we cannot tell what the author intended. Perhaps, for teaching purposes, he wanted to print the value after waking from the condition wait, increment it to show that only this thread is active (because it holds the mutex), and terminate the program regardless of the count value. If so, this should be clearly indicated. From a pedagogical standpoint, this code isn't good, IMHO.

I hope you now have enough information to answer your initial question. If you're lost, we can write you a short summary.