Definitions of Non-blocking, Lock-free and Wait-free

Monday, 25 January 2010

There was a post on comp.programming.threads this morning asking for a definition of these terms. To write good multithreaded code you really need to understand what these mean, and how they affect the behaviour and performance of algorithms with these properties. I thought it would be therefore be useful to provide some definitions.

Definition of Blocking

A function is said to be blocking if it calls an operating system function that waits for an event to occur or a time period to elapse. Whilst a blocking call is waiting the operating system can often remove that thread from the scheduler, so it takes no CPU time until the event has occurred or the time has elapsed. Once the event has occurred then the thread is placed back in the scheduler and can run when allocated a time slice. A thread that is running a blocking call is said to be blocked.

Mutex lock functions such as std::mutex::lock(), and EnterCriticalSection() are blocking, as are wait functions such as std::future::wait() and std::condition_variable::wait(). However, blocking functions are not limited to synchronization facilities: the most common blocking functions are I/O facilities such as fread() or WriteFile(). Timing facilities such as Sleep(), or std::this_thread::sleep_until() are also often blocking if the delay period is long enough.

Definition of Non-blocking

Non-blocking functions are just those that aren't blocking. Non-blocking data structures are those on which all operations are non-blocking. All lock-free data structures are inherently non-blocking.

Defintion of lock-free

A lock-free data structure is one that doesn't use any mutex locks. The implication is that multiple threads can access the data structure concurrently without race conditions or data corruption, even though there are no locks — people would give you funny looks if you suggested that std::list was a lock-free data structure, even though it is unlikely that there are any locks used in the implementation.

Just because more than one thread can safely access a lock-free data structure concurrently doesn't mean that there are no restrictions on such accesses. For example, a lock-free queue might allow one thread to add values to the back whilst another removes them from the front, whereas multiple threads adding new values concurrently would potentially corrupt the data structure. The data structure description will identify which combinations of operations can safely be called concurrently.

Finally, just because a data structure is lock-free it doesn't mean that threads don't have to wait for each other. If an operation takes more than one step then a thread may be pre-empted by the OS part-way through an operation. This may then mean that the data structure is now in an inconsistent state, and other threads cannot perform any operations on the data structure until the operation is complete. These threads must then continually check the state of the data structure in what is called a busy-wait or spin-wait. Such waits are not blocking, and will consume processor time because the operating system is not aware of the conditions for the wait, and cannot suspend execution of the threads. If you use a lock-free data structure with this property then high rates of access from multiple threads can seriously cripple the performance: the waiting threads may hinder the execution of the one thread that can actually make any progress. This is especially apparent if the waiting threads are higher priority than the thread being waited for, and is why wait-free data structures are so important.

Definition of wait-free

A wait-free data structure is a lock-free data structure with the additional property that every thread accessing the data structure can make progress, even if every other thread accessing the data structure is suspended part-way through an operation.

This property means that high-priority threads accessing the data structure never have to wait for low-priority threads to complete their operations on the data structure, and every thread will always be able to make progress when it is scheduled to run by the OS. For real-time or semi-real-time systems this can be an essential property, as the indefinite wait-periods of blocking or non-wait-free lock-free data structures do not allow their use within time-limited operations.

The downside of wait-free data structures is that they are more complex than their non-wait-free counterparts. This imposes an overhead on each operation, potentially making the average time taken to perform an operation considerably longer than the same operation on an equivalent non-wait-free data structure.

Choices

When choosing a data structure for a given task you need to think about the costs and benefits of each of the options.

A lock-based data structure is probably the easiest to use, reason about and write, but has the potential for limited concurrency. They may also be the fastest in low-load scenarios.

A lock-free (but not wait-free) data structure has the potential to allow more concurrent accesses, but with the possibility of busy-waits under high loads. Lock-free data structures are considerably harder to write, and the additional concurrency can make reasoning about the program behaviour harder. They may be faster than lock-based data structures, but not necessarily.

Finally, a wait-free data structure has the maximum potential for true concurrent access, without the possibility of busy waits. However, these are very much harder to write than other lock-free data structures, and typically impose an additional performance cost on every access.

Posted by Anthony Williams
[/ threading /] permanent link
Tags: , , , ,
Stumble It! stumbleupon logo | Submit to Reddit reddit logo | Submit to DZone dzone logo

Comment on this post

If you liked this post, why not subscribe to the RSS feed RSS feed or Follow me on Twitter? You can also subscribe to this blog by email using the form on the right.

4 Comments

Can you give an example of a lock-free structure that is not wait-free? This would make the distinction clearer.

by Anonymous at 12:52:17 on Monday, 25 January 2010

Sir, Would it be possible for you to give a example of wait-free and lock free data structure ?

thanks

by shankha at 05:11:12 on Tuesday, 09 February 2010
For the lock-free I can say the keyboard and most of input devices. for the wait-free hash tables is the best example.
by Emad at 01:58:54 on Thursday, 29 April 2010

The definition of lock-free that I've seen is that at every step, _some_ thread makes progress. That is, even if the OS suspends a thread while it has the data structure in an inconsistent state, another thread must be able to recover and keep going. This is weaker than wait-free since it doesn't require that each thread make progress. The simplest way to see the difference is:

int cas_increment(atomic<int>& a) { int old_value = a.load(); while(a.compare_exchange_weak(old_value, old_value + 1)) { } }

This is lock-free but not wait-free. Each execution of the CAS either succeeds or fails. If it succeeds, the executing thread has made progress. If it fails, that signals that some other thread successfully wrote into a and so made progress.

Your definition, on the other hand, allows spin-locks in "lock-free" structures, which seems to make the definition not so useful.

+1 to the rest of the post though.

by Jeffrey Yasskin at 07:07:50 on Wednesday, 19 May 2010

Add your comment

Your name:

Your URL:

Email address:

Person or spambot?

Your comment: