Definitions of Non-blocking, Lock-free and Wait-free

Tuesday, 07 September 2010

There have repeatedly been posts on comp.programming.threads asking for a definition of these terms. To write good multithreaded code you really need to understand what these terms mean, and how they affect the behaviour and performance of algorithms with these properties. I therefore thought it would be useful to provide some definitions.

Definition of Blocking

A function is said to be blocking if it calls an operating system function that waits for an event to occur or a time period to elapse. Whilst a blocking call is waiting, the operating system can often remove that thread from the scheduler, so it takes no CPU time until the event has occurred or the time has elapsed. Once the event has occurred, the thread is placed back in the scheduler and can run when allocated a time slice. A thread that is running a blocking call is said to be blocked.

Mutex lock functions such as std::mutex::lock() and EnterCriticalSection() are blocking, as are wait functions such as std::future::wait() and std::condition_variable::wait(). However, blocking functions are not limited to synchronization facilities: the most common blocking functions are I/O facilities such as fread() or WriteFile(). Timing facilities such as Sleep() or std::this_thread::sleep_until() are also often blocking, if the delay period is long enough.

Definition of Non-blocking

Non-blocking functions are just those that aren't blocking. Non-blocking data structures are those on which all operations are non-blocking. All lock-free data structures are inherently non-blocking.

Spin-locks are an example of non-blocking synchronization: if one thread has a lock then waiting threads are not suspended, but must instead loop until the thread that holds the lock has released it. Spin locks and other algorithms with busy-wait loops are not lock-free, because if one thread (the one holding the lock) is suspended then no thread can make progress.

Definition of lock-free

A lock-free data structure is one that doesn't use any mutex locks. The implication is that multiple threads can access the data structure concurrently without race conditions or data corruption, even though there are no locks. People would give you funny looks if you suggested that std::list was a lock-free data structure, even though it is unlikely that any locks are used in its implementation.

Just because more than one thread can safely access a lock-free data structure concurrently doesn't mean that there are no restrictions on such accesses. For example, a lock-free queue might allow one thread to add values to the back whilst another removes them from the front, whereas multiple threads adding new values concurrently would potentially corrupt the data structure. The data structure description will identify which combinations of operations can safely be called concurrently.

For a data structure to qualify as lock-free, if any thread performing an operation on the data structure is suspended at any point during that operation then the other threads accessing the data structure must still be able to complete their tasks. This is the fundamental restriction which distinguishes it from non-blocking data structures that use spin-locks or other busy-wait mechanisms.

Just because a data structure is lock-free it doesn't mean that threads don't have to wait for each other. If an operation takes more than one step then a thread may be pre-empted by the OS part-way through an operation. When it resumes the state may have changed, and the thread may have to restart the operation.

In some cases, the partially-completed operation would prevent other threads from performing their desired operations on the data structure until the operation is complete. In order for the algorithm to be lock-free, these threads must then either abort or complete the partially-completed operation of the suspended thread. When the suspended thread is woken by the scheduler it can then either retry or accept the completion of its operation as appropriate. In lock-free algorithms, a thread may find that it has to retry its operation an unbounded number of times when there is high contention.

If you use a lock-free data structure where multiple threads modify the same pieces of data, and thus cause each other to retry, then high rates of access from multiple threads can seriously cripple the performance, as the threads hinder each other's progress. This is why wait-free data structures are so important: they don't suffer from the same setbacks.

Definition of wait-free

A wait-free data structure is a lock-free data structure with the additional property that every thread accessing the data structure can complete its operation within a bounded number of steps, regardless of the behaviour of other threads. Algorithms that can involve an unbounded number of retries due to clashes with other threads are thus not wait-free.

This property means that high-priority threads accessing the data structure never have to wait for low-priority threads to complete their operations on the data structure, and every thread will always be able to make progress when it is scheduled to run by the OS. For real-time or semi-real-time systems this can be an essential property, as the unbounded wait periods of blocking data structures, or of lock-free data structures that are not wait-free, make them unusable within time-limited operations.

The downside of wait-free data structures is that they are more complex than their non-wait-free counterparts. This imposes an overhead on each operation, potentially making the average time taken to perform an operation considerably longer than the same operation on an equivalent non-wait-free data structure.

Choices

When choosing a data structure for a given task you need to think about the costs and benefits of each of the options.

A lock-based data structure is probably the easiest to write, use and reason about, but it limits the potential for concurrency. It may also be the fastest in low-load scenarios.

A lock-free (but not wait-free) data structure has the potential to allow more concurrent accesses, but with the possibility of busy-waits under high loads. Lock-free data structures are considerably harder to write, and the additional concurrency can make reasoning about the program behaviour harder. They may be faster than lock-based data structures, but not necessarily.

Finally, a wait-free data structure has the maximum potential for true concurrent access, without the possibility of busy waits. However, these are very much harder to write than other lock-free data structures, and typically impose an additional performance cost on every access.

Posted by Anthony Williams

Comments

Can you give an example of a lock-free structure that is not wait-free? This would make the distinction clearer.

by Anonymous at 12:52:17 on Monday, 25 January 2010

Sir, would it be possible for you to give an example of wait-free and lock-free data structures?

thanks

by shankha at 05:11:12 on Tuesday, 09 February 2010
For lock-free, I can say the keyboard and most input devices; for wait-free, hash tables are the best example.
by Emad at 01:58:54 on Thursday, 29 April 2010

The definition of lock-free that I've seen is that at every step, _some_ thread makes progress. That is, even if the OS suspends a thread while it has the data structure in an inconsistent state, another thread must be able to recover and keep going. This is weaker than wait-free since it doesn't require that each thread make progress. The simplest way to see the difference is:

int cas_increment(atomic<int>& a) { int old_value = a.load(); while(!a.compare_exchange_weak(old_value, old_value + 1)) { } return old_value + 1; }

This is lock-free but not wait-free. Each execution of the CAS either succeeds or fails. If it succeeds, the executing thread has made progress. If it fails, that signals that some other thread successfully wrote into a and so made progress.

Your definition, on the other hand, allows spin-locks in "lock-free" structures, which seems to make the definition not so useful.

+1 to the rest of the post though.

by Jeffrey Yasskin at 07:07:50 on Wednesday, 19 May 2010

I've updated the post to make it clear that spin-locks are not lock-free.

by Anthony Williams at 21:40:28 on Tuesday, 07 September 2010

Great explanation.

Hope there are more to come.

by peterwkc at 08:37:56 on Thursday, 16 September 2010



Thanks for the great summary, which helps clear things up for me. So, "lock free" only means mutexes (or anything else that can suspend a thread at the scheduler level)? For example, when testing std::atomic<bool>::is_lock_free(), my compiler returns true, but internally uses the x86 xchg atomic which will lock if operating on a memory address.

by Carl Cook at 11:10:16 on Tuesday, 01 October 2013

Just to add to my comment above, I took a look at the disassembly for an std::atomic<UDT>, where is_lock_free == false. It appears that on my compiler, there is a spinlock added prior to any call to atomic<UDT>.exchange().

by Carl Cook at 12:11:05 on Tuesday, 01 October 2013

After doing a bit more reading/experimenting, here's a better summary of my questions:

* Is it true that "lock-free" excludes any logical lock (ranging from mutexes through to entirely user-space spinlocks)... i.e. anything that will prevent the application from progressing if a critical thread gets scheduled out?
* So, if a linked list implementation performed a compare and swap in order to perform node insertion, and the atomic compare and swap happened to not be lock free (e.g. it used a mutex for example), then the list can't be considered lock free?
* Conversely, an atomic itself can be lock free even if it employs a memory bus lock at the instruction level (I imagine most do)

Thanks, Carl

by Carl Cook at 14:00:54 on Tuesday, 01 October 2013
