Thread-safe Libraries

In the beginning (and, really, until a few years ago), UNIX machines allowed only a single thread per process. During this period, many libraries were written (including the standard C library), and the authors naturally assumed that each process had one thread. So they stored data in global variables and didn't bother with synchronization.

Threads were introduced, and people started writing multi-threaded programs. Quite often, several threads would use library functions at the same time, and this caused problems: though the programmer thought the program was safe, the library was corrupting things (remember that libraries are just pieces of code that you don't have to write; otherwise, they behave the same as if you had written the code yourself).

For example, let's imagine scanf(3) had been written as follows:
scanf()
static char buf[256];
int scanf(char *format, ...) {
    read(STDIN, buf, 255);
    work on buf to pull out matches
    return matches
}
What's wrong? Well, buf is a static variable, and is thus shared across the entire program. If two threads call scanf at the same time, each read overwrites whatever the other thread put in buf, so one or both of them can end up parsing the wrong data.

Now, this example is fairly trivial to fix (we can just allocate a new buffer on each call to scanf), but others were not. The simplest solution was often just to add a lock around the entire function:
scanf()
static char buf[256];
static lock_t scanf_lock;
int scanf(char *format, ...) {
    acquire scanf_lock;
    read(STDIN, buf, 255);
    work on buf to pull out matches
    release scanf_lock;
    return matches
}

The addition of the lock makes this function thread-safe: it will produce the correct results if it is accessed by multiple threads simultaneously.
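
To make the idea concrete, here is a minimal runnable sketch with pthreads. The function shout() is hypothetical (it is not a real library call); it upper-cases a string through a shared static scratch buffer, which plays the role of buf above. The names scratch, scratch_lock, and run_shout_demo are likewise made up for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <pthread.h>
#include <string.h>

/* Shared scratch space, playing the role of buf in the scanf example. */
static char scratch[256];
static pthread_mutex_t scratch_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical library function: writes an upper-cased copy of `in` into
 * `out` (at most n-1 characters plus the terminating NUL). The lock is
 * held for the whole time scratch is in use. */
void shout(const char *in, char *out, size_t n) {
    size_t i;
    int rc = pthread_mutex_lock(&scratch_lock);
    assert(rc == 0);
    for (i = 0; in[i] != '\0' && i < sizeof scratch - 1; i++)
        scratch[i] = (char)toupper((unsigned char)in[i]);
    scratch[i] = '\0';
    if (n > 0) {
        strncpy(out, scratch, n - 1);
        out[n - 1] = '\0';
    }
    rc = pthread_mutex_unlock(&scratch_lock);
    assert(rc == 0);
    (void)rc;
}

struct job { const char *in; const char *expect; int errors; };

/* Calls shout() repeatedly and counts results that do not match. */
static void *worker(void *arg) {
    struct job *j = arg;
    char out[32];
    for (int i = 0; i < 10000; i++) {
        shout(j->in, out, sizeof out);
        if (strcmp(out, j->expect) != 0)
            j->errors++;
    }
    return NULL;
}

/* Runs two threads against shout(); returns the number of wrong results. */
int run_shout_demo(void) {
    struct job a = { "hello", "HELLO", 0 };
    struct job b = { "goodbye", "GOODBYE", 0 };
    pthread_t ta, tb;
    pthread_create(&ta, NULL, worker, &a);
    pthread_create(&tb, NULL, worker, &b);
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    return a.errors + b.errors;
}
```

Calling run_shout_demo() from main should return 0. If you delete the lock and unlock calls, the two threads can interleave inside scratch and the count may become nonzero (it is a race, so it will not fail on every run).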

However, it isn't very efficient; there isn't really any reason a thread should have to wait for all the other threads to be done before it can use scanf (well, actually, there might be interesting problems with read(2) - but we'll ignore those). This means the function is not MP-efficient.
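
For comparison, here is a sketch of the other fix mentioned earlier: give each call its own buffer, so no lock is needed and threads never wait for each other. The function count_words() is a made-up example, not part of any library. It relies on strtok_r(3), the reentrant POSIX variant of strtok(3), which keeps its state in a caller-supplied pointer instead of a hidden static variable.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>

/* Counts whitespace-separated words in `line`. All state lives in locals:
 * a per-call copy of the input and the strtok_r save pointer. Concurrent
 * callers never touch the same memory, so no lock is needed.
 * Assumption for this sketch: input longer than 255 bytes is truncated. */
int count_words(const char *line) {
    char copy[256];      /* per-call buffer on the calling thread's stack */
    char *save = NULL;   /* per-call tokenizer state */
    int words = 0;

    assert(line != NULL);
    strncpy(copy, line, sizeof copy - 1);
    copy[sizeof copy - 1] = '\0';

    for (char *tok = strtok_r(copy, " \t\n", &save); tok != NULL;
         tok = strtok_r(NULL, " \t\n", &save))
        words++;
    return words;
}
```

Any number of threads can call count_words() at once without synchronization, which is exactly the property the locked version gives up.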

pthreads vs. LinuxThreads

pthreads refers to "POSIX threads," meaning threads that behave as specified in the POSIX standard 1003.1c. LinuxThreads is a library that provides POSIX threads for Linux. It uses the clone(2) syscall to create new threads and also provides the synchronization functions needed to implement the POSIX 1003.1c specification. While I'll use functions from pthreads, the concepts here are general across all systems (in fact, I copied my notes from another class which did not use pthreads and just substituted the pthread functions :).
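
If you have not seen the pthreads calls before, here is a minimal sketch of creating a thread and waiting for it to finish. The names square and run_thread_demo are invented for this example; pthread_create and pthread_join are the standard POSIX calls.

```c
#include <assert.h>
#include <pthread.h>

/* Thread body: squares the int that arg points to. */
static void *square(void *arg) {
    int *n = arg;
    *n = *n * *n;
    return NULL;
}

/* Starts one thread, waits for it, and returns the squared value. */
int run_thread_demo(int value) {
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, square, &value);
    assert(rc == 0);
    rc = pthread_join(tid, NULL);   /* blocks until square() has returned */
    assert(rc == 0);
    (void)rc;
    return value;
}
```

Reading value after pthread_join is safe without a lock, because the join guarantees the other thread has finished.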

Synchronization Primitives

We'll discuss two concepts that are independent parts of synchronization:
  1. Mutual exclusion: Protection of some data from simultaneous access by multiple threads.
  2. Inter-thread scheduling: Communication between threads that some event happened; blocking for events in other threads.
General points about synchronization:

Locks/Mutexes

Thread Exit/Join

Semaphores

Monitors/Condition Variables

Condition Variables - Why?

So, why do we need these condition variable things anyway? We'll use a common example, an unbounded buffer, and attempt to implement it without a condition variable.

We'll assume the following code is used to initialize the system appropriately:
Initialization
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t data_available = PTHREAD_COND_INITIALIZER;
buffer = empty buffer;

Here is the correct version, with a condition variable. We assume that two threads are using the code, one calling AddToBuffer and the other calling RemoveFromBuffer.
AddToBuffer
pthread_mutex_lock(&lock);
put item in buffer;
pthread_cond_signal(&data_available);
pthread_mutex_unlock(&lock);

RemoveFromBuffer
pthread_mutex_lock(&lock);
while (nothing in buffer) {
    pthread_cond_wait(&data_available, &lock);
}
remove item from buffer;
pthread_mutex_unlock(&lock);
return item;
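
A runnable sketch of this version follows. The pseudocode leaves the buffer unspecified, so this sketch assumes a linked list of ints; struct node, head, tail, producer, and run_buffer_demo are names invented here.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Unbounded FIFO of ints as a singly linked list (an assumed stand-in
 * for "buffer" in the pseudocode). */
struct node { int item; struct node *next; };

static struct node *head = NULL, *tail = NULL;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t data_available = PTHREAD_COND_INITIALIZER;

void AddToBuffer(int item) {
    struct node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->item = item;
    n->next = NULL;

    pthread_mutex_lock(&lock);
    if (tail != NULL) tail->next = n; else head = n;
    tail = n;
    pthread_cond_signal(&data_available);   /* wake one waiting remover */
    pthread_mutex_unlock(&lock);
}

int RemoveFromBuffer(void) {
    pthread_mutex_lock(&lock);
    while (head == NULL)                    /* re-check after every wakeup */
        pthread_cond_wait(&data_available, &lock);
    struct node *n = head;
    head = n->next;
    if (head == NULL) tail = NULL;
    pthread_mutex_unlock(&lock);

    int item = n->item;
    free(n);
    return item;
}

/* Adds the values 1..count to the buffer. */
static void *producer(void *arg) {
    int count = *(int *)arg;
    for (int i = 1; i <= count; i++)
        AddToBuffer(i);
    return NULL;
}

/* One producer adds 1..count while the caller removes count items;
 * returns the sum of the removed items. */
long run_buffer_demo(int count) {
    pthread_t t;
    long sum = 0;
    pthread_create(&t, NULL, producer, &count);
    for (int i = 0; i < count; i++)
        sum += RemoveFromBuffer();
    pthread_join(t, NULL);
    return sum;
}
```

Whatever order the scheduler picks, every item is removed exactly once, so the sum for count items is count*(count+1)/2.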

First, let's try just removing the condition variable:
AddToBuffer
pthread_mutex_lock(&lock);
put item in buffer;
pthread_mutex_unlock(&lock);

RemoveFromBuffer
pthread_mutex_lock(&lock);
while (nothing in buffer) {
    ;
}
remove item from buffer;
pthread_mutex_unlock(&lock);
return item;
Obviously this is no good: the while loop spins while holding the lock, so no thread can ever get into AddToBuffer, and the loop will run forever.
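
We can't usefully run an infinite loop, but a small sketch can show why the adder is stuck. Here the calling thread holds the lock (like the spinning remover), and a second thread uses pthread_mutex_trylock, which returns EBUSY instead of blocking when the mutex is already held. The names try_add and run_trylock_demo are invented for this example.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Stands in for AddToBuffer: records whether it could take the lock. */
static void *try_add(void *arg) {
    int *result = arg;
    *result = pthread_mutex_trylock(&lock);
    if (*result == 0)
        pthread_mutex_unlock(&lock);
    return NULL;
}

/* Returns the trylock result seen by a second thread while the caller
 * holds the lock. */
int run_trylock_demo(void) {
    pthread_t t;
    int result = -1;
    pthread_mutex_lock(&lock);      /* the "remover" holds the lock */
    pthread_create(&t, NULL, try_add, &result);
    pthread_join(t, NULL);          /* the "adder" ran while we held it */
    pthread_mutex_unlock(&lock);
    return result;
}
```

The result is EBUSY: the adder cannot get in for as long as the remover holds the lock. With a plain pthread_mutex_lock in place of the trylock, it would simply block forever.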

OK, so release the lock in the while loop (but remember to reacquire it before checking if anything is in the buffer):
AddToBuffer
pthread_mutex_lock(&lock);
put item in buffer;
pthread_mutex_unlock(&lock);

RemoveFromBuffer
pthread_mutex_lock(&lock);
while (nothing in buffer) {
    pthread_mutex_unlock(&lock);
    pthread_mutex_lock(&lock);
}
remove item from buffer;
pthread_mutex_unlock(&lock);
return item;
This code will work (surprisingly), but very slowly - if one thread enters RemoveFromBuffer (with an empty buffer) and then another thread enters AddToBuffer, the only way to make progress is to context switch out of RemoveFromBuffer in-between the two lock statements. That's no good.

Well, how about we stick a sched_yield(2) in there to speed things up a bit - that way, at least the scheduler will have a better chance of context switching us there. (That the algorithm is correct, but very inefficient, is a good sign we have an inter-thread scheduling problem and not a mutual exclusion problem).
AddToBuffer
pthread_mutex_lock(&lock);
put item in buffer;
pthread_mutex_unlock(&lock);

RemoveFromBuffer
pthread_mutex_lock(&lock);
while (nothing in buffer) {
    pthread_mutex_unlock(&lock);
    sched_yield();
    pthread_mutex_lock(&lock);
}
remove item from buffer;
pthread_mutex_unlock(&lock);
return item;
That is looking a little better, but a yield isn't quite right - it is quite likely we will have to go around the while loop many, many times before some thread actually adds anything to the buffer. What we really want is to sleep until there is data available, and have AddToBuffer wake us up.
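
Here is a runnable sketch of the yield-based version, so you can see the wasted trips around the loop for yourself. To keep it short, the buffer is reduced to a count of items. The names buffer_count, wasted_polls, producer, and run_polling_demo are assumptions made for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int buffer_count = 0;    /* simplified buffer: just a count of items */
static long wasted_polls = 0;   /* empty-buffer checks; protected by lock */

void AddToBuffer(void) {
    pthread_mutex_lock(&lock);
    buffer_count++;
    pthread_mutex_unlock(&lock);
}

void RemoveFromBuffer(void) {
    pthread_mutex_lock(&lock);
    while (buffer_count == 0) {
        wasted_polls++;
        pthread_mutex_unlock(&lock);
        sched_yield();           /* give the adder a chance to run */
        pthread_mutex_lock(&lock);
    }
    buffer_count--;
    pthread_mutex_unlock(&lock);
}

/* Adds *arg items to the buffer. */
static void *producer(void *arg) {
    int count = *(int *)arg;
    for (int i = 0; i < count; i++)
        AddToBuffer();
    return NULL;
}

/* Removes count items added by a producer thread; returns the number of
 * items left in the buffer afterwards. */
int run_polling_demo(int count) {
    pthread_t t;
    pthread_create(&t, NULL, producer, &count);
    for (int i = 0; i < count; i++)
        RemoveFromBuffer();
    pthread_join(t, NULL);
    return buffer_count;
}
```

The function returns 0, so the algorithm is correct. Printing wasted_polls after a run shows how many times the remover woke up only to find nothing there; the number depends entirely on the scheduler and will vary from run to run.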

Let's try to do that by adding a queue to hold waiting threads (call it waiters). AddToBuffer will take the first thread off the queue and wake it up. We can then change the sched_yield(2) to a sleep(3), since AddToBuffer should wake us up...
AddToBuffer
pthread_mutex_lock(&lock);
put item in buffer;
if (waiters not empty)
    waiters.next().sendAlarm(); // wakeup time
pthread_mutex_unlock(&lock);

RemoveFromBuffer
pthread_mutex_lock(&lock);
while (nothing in buffer) {
1    pthread_mutex_unlock(&lock);
2    waiters.enqueue(this thread);
3    sleep(1000000); // sleep a long time
4    pthread_mutex_lock(&lock);
}
remove item from buffer;
pthread_mutex_unlock(&lock);
return item;
OK, that looks good. Where is the problem? Consider a context switch from thread A between lines 1 and 2 in RemoveFromBuffer to thread B just entering AddToBuffer. Thread B would successfully add the item to the buffer, and then check if any waiters were available. Since thread A has not yet added itself to the waiters queue, it wouldn't find any, and it would return. At some point, thread A would run again, and would sleep - even though there is actually data available in the buffer.

What we need is an atomic version of lines 1 through 4. Hmm, that happens to be exactly what a condition variable does. Note that we were able to solve the locking/consistency part of the problem without condition variables, but to get the scheduling part right, we needed them.
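
Here is a small sketch of why the condition variable version does not lose the wakeup. It is a one-shot "event": the waiter checks a flag under the lock, and pthread_cond_wait releases the lock and goes to sleep as one atomic step, so the setter cannot slip in between the check and the sleep. The names ready, ready_cv, event_set, event_wait, and run_event_demo are invented for this example.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t ready_cv = PTHREAD_COND_INITIALIZER;
static int ready = 0;            /* the condition itself; protected by lock */

void event_set(void) {
    pthread_mutex_lock(&lock);
    ready = 1;
    pthread_cond_signal(&ready_cv);
    pthread_mutex_unlock(&lock);
}

void event_wait(void) {
    pthread_mutex_lock(&lock);
    /* If event_set() already ran, ready is 1 and we never sleep.
     * Otherwise pthread_cond_wait unlocks and sleeps atomically, and
     * event_set() cannot signal until we are actually waiting. */
    while (!ready)
        pthread_cond_wait(&ready_cv, &lock);
    pthread_mutex_unlock(&lock);
}

static void *setter(void *arg) {
    (void)arg;
    event_set();
    return NULL;
}

/* Returns the value of ready after the waiter has been released. */
int run_event_demo(void) {
    pthread_t t;
    pthread_create(&t, NULL, setter, NULL);
    event_wait();   /* correct whether setter runs before or after this */
    pthread_join(t, NULL);
    return ready;
}
```

Either ordering works: if the setter runs first, the waiter sees the flag and skips the wait; if the waiter runs first, it is already asleep on the condition variable by the time the setter can acquire the lock.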

You may also be wondering: when would we ever need more than one condition variable? Here is a quick example where we add a second CV, buffer_empty (initialized the same way as data_available):
AddToBuffer
pthread_mutex_lock(&lock);
put item in buffer;
pthread_cond_signal(&data_available);
pthread_mutex_unlock(&lock);

RemoveFromBuffer
pthread_mutex_lock(&lock);
while (nothing in buffer) {
    pthread_cond_wait(&data_available, &lock);
}
remove item from buffer;
if (buffer empty)
    pthread_cond_broadcast(&buffer_empty);
pthread_mutex_unlock(&lock);
return item;

HandleAnEmptyBuffer
pthread_mutex_lock(&lock);
while (something in buffer) {
    pthread_cond_wait(&buffer_empty, &lock);
}
yell at roommate for hosing the network;
pthread_mutex_unlock(&lock);
Basically, any time threads need to wait on several different conditions, give each condition its own condition variable - pretty simple, really.
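
A runnable sketch of the two-CV version follows. As a simplifying assumption the buffer is again just a count of items, and HandleAnEmptyBuffer returns the count it observed instead of yelling at anyone. The names buffer_count, drain, and run_two_cv_demo are invented for this example.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t data_available = PTHREAD_COND_INITIALIZER;
static pthread_cond_t buffer_empty = PTHREAD_COND_INITIALIZER;
static int buffer_count = 0;    /* simplified buffer: a count of items */

void AddToBuffer(void) {
    pthread_mutex_lock(&lock);
    buffer_count++;
    pthread_cond_signal(&data_available);
    pthread_mutex_unlock(&lock);
}

void RemoveFromBuffer(void) {
    pthread_mutex_lock(&lock);
    while (buffer_count == 0)
        pthread_cond_wait(&data_available, &lock);
    buffer_count--;
    if (buffer_count == 0)
        pthread_cond_broadcast(&buffer_empty);  /* wake all empty-waiters */
    pthread_mutex_unlock(&lock);
}

/* Blocks until the buffer is empty; returns the count seen under the lock. */
int HandleAnEmptyBuffer(void) {
    pthread_mutex_lock(&lock);
    while (buffer_count > 0)
        pthread_cond_wait(&buffer_empty, &lock);
    int seen = buffer_count;
    pthread_mutex_unlock(&lock);
    return seen;
}

/* Removes *arg items from the buffer. */
static void *drain(void *arg) {
    int n = *(int *)arg;
    for (int i = 0; i < n; i++)
        RemoveFromBuffer();
    return NULL;
}

/* Fills the buffer with n items, drains it from another thread, and
 * returns what HandleAnEmptyBuffer observed once it was released. */
int run_two_cv_demo(int n) {
    pthread_t t;
    for (int i = 0; i < n; i++)
        AddToBuffer();
    pthread_create(&t, NULL, drain, &n);
    int seen = HandleAnEmptyBuffer();
    pthread_join(t, NULL);
    return seen;
}
```

Both condition variables share the one mutex because they guard the same data; they differ only in which condition a sleeping thread is waiting for. The broadcast is used for buffer_empty since several threads might want to know the buffer drained, while a single signal is enough for data_available because one new item can satisfy only one remover.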