Blog Archive for / 2010 /

November 2010 C++ Standards Committee Mailing

Tuesday, 07 December 2010

The November 2010 mailing for the C++ Standards Committee was published last week. This is the post-meeting mailing for the November 2010 committee meeting.

As well as the usual core and library issues lists, this mailing also includes an updated summary of the status of the FCD comments, a new C++0x working draft, and a whole host of papers attempting to address some of the remaining FCD comments.

To move or not to move

In my blog post on the October mailing, I mentioned that the implicit generation of move constructors was a big issue. I even contributed a paper with proposed wording for removing implicit move generation from the draft — with expert core wording guidance from Jason Merrill, this became N3216. My paper was just to give the committee something concrete to vote on — it doesn't matter how good your arguments are; if there isn't a concrete proposal for wording changes then the committee can't vote on it. In the end, the decision was that implicit move generation was a good thing, even though there was the potential for breaking existing code. However, the conditions under which move operations are implicitly generated have been tightened: the accepted proposal was N3203: Tightening the conditions for generating implicit moves by Jens Maurer, which provides wording for Bjarne's paper N3201: Moving Right Along. The proposal effectively treats copy, move and destruction as a group: if you specify any of them manually then the compiler won't generate any move operations, and if you specify a move operation then the compiler won't generate a copy. For consistency and safety, it would have been nice to prevent implicit generation of copy operations under the same circumstances, but for backwards compatibility this is still done when it would be done under C++03, though this is deprecated if the user specifies a destructor or only one of the copy operations.
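
As a rough illustration of the tightened rules (a sketch, not wording from N3203), the following compares a class with no special members declared against one with a user-declared destructor. The `Tracker`, `Plain`, `WithDestructor` and `copies_and_moves` names are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper that records how it was transferred.
struct Tracker {
    int copies = 0;
    int moves = 0;
    Tracker() = default;
    Tracker(const Tracker& other) : copies(other.copies + 1), moves(other.moves) {}
    Tracker(Tracker&& other) noexcept : copies(other.copies), moves(other.moves + 1) {}
};

// No copy, move or destructor declared: move operations are implicitly generated.
struct Plain {
    Tracker t;
};

// A user-declared destructor suppresses the implicit move operations.
// The copy constructor is still generated (deprecated) and is used instead.
struct WithDestructor {
    Tracker t;
    ~WithDestructor() {}
};

// Returns {copies, moves} observed after constructing a T from an rvalue T.
template <typename T>
std::pair<int, int> copies_and_moves() {
    T source;
    T target(std::move(source));
    return std::make_pair(target.t.copies, target.t.moves);
}
```

Calling `copies_and_moves<Plain>()` should report one move, whereas `copies_and_moves<WithDestructor>()` should report one copy, because the rvalue binds to the implicitly generated copy constructor.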

Exceptions and Destructors

The second big issue from the October mailing was the issue of implicitly adding noexcept to destructors. In the end, the committee went for Jens Maurer's paper N3204: Deducing "noexcept" for destructors — destructors have the same exception specification they would have if they were implicitly generated, unless the user explicitly says otherwise. This will break code, but not much — the fraction of code that intentionally throws an exception from a destructor is small, and easily annotated with noexcept(false) to fix it.
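
The effect of the deduction rule can be checked with `std::is_nothrow_destructible`. This is a minimal sketch; the class names are made up for the example.

```cpp
#include <cassert>
#include <type_traits>

// User-declared destructor with no exception specification:
// deduced to be noexcept, as an implicit destructor would be.
struct Quiet {
    ~Quiet() {}
};

// Explicitly annotated as potentially throwing.
struct Loud {
    ~Loud() noexcept(false) {}
};

// The implicit destructor calls ~Loud(), so it is noexcept(false) too.
struct HoldsLoud {
    Loud member;
};

// A user-declared destructor gets the same specification the implicit one
// would have had, so this is also noexcept(false).
struct HoldsLoudWithDestructor {
    Loud member;
    ~HoldsLoudWithDestructor() {}
};

static_assert(std::is_nothrow_destructible<Quiet>::value, "deduced noexcept");
static_assert(!std::is_nothrow_destructible<Loud>::value, "opted out");
static_assert(!std::is_nothrow_destructible<HoldsLoud>::value, "inherited from member");
static_assert(!std::is_nothrow_destructible<HoldsLoudWithDestructor>::value,
              "same as the implicit destructor");
```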

Concurrency-related papers

There are 9 concurrency-related papers in this mailing, which I've summarised below. 8 of them were adopted at this meeting, and are now in the new C++0x working draft.

N3188 - Revision to N3113: Async Launch Policies (CH 36)

This paper is a revision of N3113 from the August mailing. It is a minor revision to the previous paper, which clarifies and simplifies the proposed wording.

It provides a clearer basis for implementors to supply additional launch policies for std::async, or for the committee to do so in a later revision of the C++ standard, by making the std::launch enum a bitmask type. It also drops the std::launch::any enumeration value, and renames std::launch::sync to std::launch::deferred, as this better describes what it means.

The use of a bitmask allows new values to be added which are either distinct values, or combinations of the others. The default policy for std::async is thus std::launch::async|std::launch::deferred.
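
A short sketch of how the renamed policies behave, using the names as they appear in the adopted wording. The helper function names are hypothetical.

```cpp
#include <cassert>
#include <future>
#include <thread>

// With launch::deferred the task runs lazily on the thread that calls get().
bool deferred_runs_on_caller() {
    const std::thread::id caller = std::this_thread::get_id();
    std::future<std::thread::id> f =
        std::async(std::launch::deferred, [] { return std::this_thread::get_id(); });
    return f.get() == caller;
}

// With launch::async the task runs as if on a new thread.
bool async_runs_elsewhere() {
    const std::thread::id caller = std::this_thread::get_id();
    std::future<std::thread::id> f =
        std::async(std::launch::async, [] { return std::this_thread::get_id(); });
    return f.get() != caller;
}

// Combining the bits leaves the choice to the implementation;
// this is what the default policy is equivalent to.
int either_policy() {
    std::future<int> f =
        std::async(std::launch::async | std::launch::deferred, [] { return 42; });
    return f.get();
}
```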

N3191: C++ Timeout Specification

This paper is a revision of N3128: C++ Timeout Specification from the August mailing. It is a minor revision to the previous paper, which clarifies and simplifies the proposed wording.

There are several functions in the threading portion of the library that allow timeouts, such as the try_lock_for and try_lock_until member functions of the timed mutex types, and the wait_for and wait_until member functions of the future types. This paper clarifies what it means to wait for a specified duration (with the xxx_for functions), and what it means to wait until a specified time point (with the xxx_until functions). In particular, it clarifies what can be expected of the implementation if the clock is changed during a wait.

This paper also proposes replacing the old std::chrono::monotonic_clock with a new std::chrono::steady_clock. Whereas the only constraint on the monotonic clock was that it never went backwards, the steady clock cannot be adjusted, and always ticks at a uniform rate. This fulfils the original intent of the monotonic clock, but provides a clearer specification and name. It is also tied into the new wait specifications, since waiting for a duration requires a steady clock for use as a basis.
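
For illustration, here is a small sketch of a duration-based wait that times out. The function name is hypothetical, and the 20 millisecond timeout is an arbitrary value chosen for the example.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>

// steady_clock is required to be steady: it cannot be adjusted.
static_assert(std::chrono::steady_clock::is_steady, "steady_clock must be steady");

// Tries to acquire an already-held timed mutex from a second thread.
// Returns true if the attempt timed out, which is the expected outcome.
bool try_lock_for_times_out() {
    std::timed_mutex m;
    m.lock();  // held by the calling thread for the whole attempt
    bool acquired = true;
    std::thread other([&] {
        // A relative timeout, measured as a duration.
        acquired = m.try_lock_for(std::chrono::milliseconds(20));
        if (acquired) {
            m.unlock();
        }
    });
    other.join();
    m.unlock();
    return !acquired;
}
```

The `xxx_until` forms take a time point instead, for example `m.try_lock_until(std::chrono::steady_clock::now() + std::chrono::milliseconds(20))`.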

N3192: Managing C++ Associated Asynchronous State

This paper is a revision of N3129: Managing C++ Associated Asynchronous State from the August mailing. It is a minor revision to the previous paper, which clarifies and simplifies the proposed wording.

This paper tidies up the wording of the functions and classes related to the future types, and clarifies the management of the associated asynchronous state which is used to communicate e.g. between a std::promise and a std::future that will receive the result.

N3193: Adjusting C++ Atomics for C Compatibility

This paper is an update to N3164: Adjusting C++ Atomics for C Compatibility from the October mailing.

It drops the C compatibility header <stdatomic.h>, and the macro _Atomic, and loosens the requirements on the atomic_xxx types — they may be base classes of the associated atomic<T> specializations, or typedefs to them.

N3194: Clarifying C++ Futures

This paper is a revision of N3170: Clarifying C++ Futures from the October mailing.

There were a few FCD comments from the US about the use of futures; this paper outlines all the issues and potential solutions. The proposed changes are actually fairly minor though:

future gains a share() member function for easy conversion to the corresponding shared_future type;
Accessing a shared_future for which valid() is false is now encouraged to throw an exception though it remains undefined behaviour;
atomic_future is to be removed;
packaged_task now has a valid() member function instead of operator bool for consistency with the future types.

A few minor changes have also been made to the wording to make things clearer.
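
The new member functions can be exercised as follows. This is only a sketch with hypothetical function names, and it deliberately never touches a future for which valid() is false.

```cpp
#include <cassert>
#include <future>

// share() transfers the associated state to a shared_future,
// leaving the original future without one.
bool share_invalidates_source() {
    std::promise<int> p;
    std::future<int> f = p.get_future();
    std::shared_future<int> sf = f.share();
    return !f.valid() && sf.valid();
}

// Copies of a shared_future refer to the same state and can each read the result.
int read_from_two_copies() {
    std::promise<int> p;
    std::shared_future<int> sf = p.get_future().share();
    std::shared_future<int> copy = sf;
    p.set_value(21);
    return sf.get() + copy.get();
}

// packaged_task reports whether it holds a task via valid().
bool packaged_task_validity() {
    std::packaged_task<int()> empty;
    std::packaged_task<int()> task([] { return 1; });
    return !empty.valid() && task.valid();
}
```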

N3196: Omnibus Memory Model and Atomics Paper

This paper is an update to N3125: Omnibus Memory Model and Atomics Paper from the August mailing.

This paper clarifies the wording surrounding the memory model and atomic operations.

N3197: Lockable Requirements for C++0x

This paper is an update to N3130: Lockable Requirements for C++0x from the October mailing. This is a minor revision reflecting discussions at Batavia.

This paper splits out the requirements for general lockable types away from the specific requirements on the standard mutex types. This allows the lockable concepts to be used to specify the requirements on a type to be used with the std::lock_guard and std::unique_lock class templates, as well as for the various overloads of the wait functions on std::condition_variable_any, without imposing the precise behaviour of std::mutex on user-defined mutex types.
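
As a sketch of what this enables, the hypothetical `counting_mutex` below is a user-defined type that provides lock(), try_lock() and unlock(), and so can be used with the standard lock wrappers even though it is not one of the standard mutex types.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical user-defined lockable: wraps std::mutex and counts acquisitions.
class counting_mutex {
    std::mutex inner_;
    int lock_count_ = 0;  // only modified while inner_ is held

public:
    void lock() {
        inner_.lock();
        ++lock_count_;
    }
    bool try_lock() {
        if (!inner_.try_lock()) {
            return false;
        }
        ++lock_count_;
        return true;
    }
    void unlock() { inner_.unlock(); }

    // Reads the count under the lock, without counting this access.
    int lock_count() {
        std::lock_guard<std::mutex> guard(inner_);
        return lock_count_;
    }
};

// Uses the custom type with both standard wrappers and reports the count.
int lock_count_after_standard_wrappers() {
    counting_mutex m;
    {
        std::lock_guard<counting_mutex> guard(m);  // needs lock() and unlock()
    }
    {
        std::unique_lock<counting_mutex> lock(m);  // may also use try_lock()
    }
    return m.lock_count();
}
```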

N3209: Progress guarantees for C++0x (US 3 and US 186)(revised)

This paper is a revision of N3152: Progress guarantees for C++0x (US 3 and US 186) from the October mailing. It is a minor revision to the previous paper, which extends the proposed wording to cover compare_exchange_weak as well as try_lock.

The FCD does not make any progress guarantees when multiple threads are used. In particular, writes made by one thread do not ever have to become visible to other threads, and threads aren't guaranteed ever to actually run at all. This paper looks at the issues and provides wording for minimal guarantees.

N3230: Constexpr Library Additions: future

This paper has not yet been accepted by the committee. It adds constexpr to the default constructors of future and shared_future, so that they can be statically initialized.

Other adopted papers

Of course, the committee did more than just address implicit move, exceptions in destructors and concurrency. The full minutes are available as N3212 in the mailing. Here is a quick summary of some of the other changes made:

Library functions that don't throw exceptions have been changed to use noexcept
The ratio arithmetic facilities have been changed to allow libraries to try and give the correct result if the result is representable, but the intermediate calculations may overflow (e.g. ratio_add<ratio<1,INTMAX_MAX>,ratio<1,INTMAX_MAX>>) (N3210)
New functions have been added to retrieve the current new handler, terminate handler or unexpected handler (N3189)
Alignment control is now done with the alignas keyword, rather than an attribute (N3190)
Virtual function override control is now done with keywords (including the first context-sensitive keywords: override and final) rather than attributes (N3206)
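
To give a flavour of the last two items, here is a small sketch using the keyword forms. The type and function names are invented for the example.

```cpp
#include <cassert>
#include <cstddef>

// Alignment is requested with the alignas keyword rather than an attribute.
struct alignas(16) Aligned16 {
    char data[4];
};
static_assert(alignof(Aligned16) == 16, "alignas(16) should be honoured");

struct Shape {
    virtual ~Shape() {}
    virtual int sides() const { return 0; }
};

// final: no class may derive from Square.
// override: a compile error if sides() does not override a base class virtual.
struct Square final : Shape {
    int sides() const override { return 4; }
};

int sides_via_base() {
    Square sq;
    const Shape& s = sq;
    return s.sides();  // dispatches to Square::sides
}
```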

For the remaining changes, see the full minutes.

FCD comment status

The count of unresolved FCD comments is dropping rapidly, and now stands at 75 (out of 564 total), of which only 56 have any technical content. See N3224: C++ FCD Comment Status from the mailing for the full list.

Your comments

If you have any opinions on any of the papers listed here, or the resolution of any NB comments, please add them to the comments for this post.

Posted by Anthony Williams
[/ cplusplus /] permanent link
Tags: C++0x, C++, standards, concurrency

If you liked this post, why not subscribe to the RSS feed or Follow me on Twitter? You can also subscribe to this blog by email using the form on the left.

Coming Soon: Just::Thread Pro

Friday, 29 October 2010

Multithreaded code doesn't have to be complicated.

That's the idea behind the Just::Thread Pro library. By providing a set of high level facilities in the library, your application code can be simplified — rather than spending your time on the complexities of multithreading and concurrency you can instead focus on what it is your application is trying to achieve.

Building on the Just::Thread C++0x thread library, Just::Thread Pro will provide facilities to:

Encapsulate communication between threads to avoid deadlocks and race conditions
Easily scale your application to make use of multi-core processors
Parallelize existing single-threaded code without a major rewrite

Just::Thread Pro will be available for all platforms supported by Just::Thread.

Head over to the Just::Thread Pro website and sign up to receive further news about the library and notification when it is released.

Posted by Anthony Williams
[/ news /] permanent link
Tags: concurrency, cplusplus, multithreading

October 2010 C++ Standards Committee Mailing

Thursday, 21 October 2010

The October 2010 mailing for the C++ Standards Committee was published earlier this week. This is the pre-meeting mailing for the November 2010 committee meeting.

As well as the usual core and library issues lists, this mailing also includes a Summary of the status of the FCD comments, along with a whole host of papers attempting to address some of the remaining FCD comments.

To move or not to move

The big issue of the upcoming meeting is looking to be whether or not the compiler should implicitly generate move constructors and move assignment operators akin to the copy constructors and copy assignment operators that are currently auto generated. The wording in the FCD requires this, but people are concerned that this will break existing code when people start using their code with a C++0x compiler and library. There are two papers on the subject in the mailing: N3153: Implicit Move Must Go by Dave Abrahams, and N3174: To move or not to move by Bjarne Stroustrup.

There seems to be consensus among committee members that the FCD requires compilers to generate the move constructor and move assignment operator in cases that will break existing code. The key question is whether the breakage can be limited by restricting the cases in which the move members are implicitly generated, or whether implicit generation should be abandoned altogether. The various options are explained very clearly in the papers.

Exceptions and Destructors

N3166: Destructors default to noexcept is another potentially controversial issue. It is generally acknowledged that throwing exceptions from destructors is a bad idea, not least because this leads to termination if the destructor is invoked whilst the stack is being unwound due to another exception. Herb Sutter wrote about this way back in 1998 when the original C++ standard was hot off the presses, in GotW #47: Uncaught Exceptions.

The proposal in the paper comes from a Finnish comment on the FCD, and is quite simple: by default all destructors are assumed to be marked noexcept(true) (which is the new way of saying they cannot throw an exception, similar to an exception specification of throw()), unless they explicitly have a non-empty exception specification or are marked noexcept(false).

Since it is generally good practice not to throw from a destructor, you'd think this would be uncontroversial. Unfortunately it is not the case — there are currently situations where throwing from a destructor has defined behaviour, and even does exactly what people want. The example most frequently cited is the SOCI project for accessing databases from C++. This library provides an easy syntax for constructing SQL queries using the << operator. The operator builds a temporary object which executes the SQL in the destructor. If the SQL is invalid, or executing it causes an exception for any other reason then the destructor throws. Changing destructors to be noexcept(true) by default will make such code terminate on a database error unless the destructor is updated to declare that it can throw exceptions. Working code with defined behaviour is thus broken when recompiled with a C++0x compiler.
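
The pattern can be sketched as follows. This is a hypothetical stand-in, not SOCI's real API: the `statement` class simply treats an empty query as an error, and a real library would also have to avoid throwing while another exception is already propagating.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical query-building temporary that "executes" in its destructor.
class statement {
    std::ostringstream sql_;

public:
    template <typename T>
    statement& operator<<(const T& part) {
        sql_ << part;
        return *this;
    }

    // Without noexcept(false), a C++0x compiler would deduce noexcept here
    // and the throw below would call std::terminate.
    ~statement() noexcept(false) {
        if (sql_.str().empty()) {
            throw std::runtime_error("empty SQL statement");
        }
    }
};

// Returns true if the statement "executed", false if its destructor threw.
bool run(const char* text) {
    try {
        statement() << text;  // temporary destroyed at the end of this statement
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}
```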

Concurrency-related papers

There are 3 concurrency-related papers in this mailing, which I've summarised below.

N3152: Progress guarantees for C++0x (US 3 and US 186)

N3164: Adjusting C++ Atomics for C Compatibility

This is an update to N3137 from the last mailing, which provides detailed wording updates for the required changes to regain compatibility with C1X atomics.

N3170: Clarifying C++ Futures

There were a few FCD comments from the US about the use of futures; this paper outlines all the issues and potential solutions. The proposed changes are actually fairly minor though:

future gains a share() member function for easy conversion to the corresponding shared_future type;
Accessing a shared_future for which valid() is false is now required to throw an exception rather than be undefined behaviour;
atomic_future is to be removed;

A few minor changes have also been made to the wording to make things clearer.

If you have any opinions on any of the papers listed here, or the resolution of any NB comments, please add them to the comments for this post.

Posted by Anthony Williams
[/ cplusplus /] permanent link
Tags: C++0x, C++, standards, concurrency

just::thread C++0x Thread Library V1.4.2 Released

Friday, 15 October 2010

I am pleased to announce that version 1.4.2 of just::thread, our C++0x Thread Library, has just been released.

The big change with this release is the new support for gcc 4.5 on Ubuntu Linux. If you're running Ubuntu Lucid then you can get the .DEB files for gcc 4.5 from yesterday's blog post. For Ubuntu Maverick, gcc 4.5 is in the repositories.

Other changes:

Overflow in ratio arithmetic will now cause a compilation failure
Ratio arithmetic operations derive from the resulting std::ratio instantiation as well as providing the ::type member to better emulate the C++0x working draft
On Windows, just::thread can now be used in MFC DLLs

Purchase your copy and get started with the C++0x thread library now.

As usual, existing customers are entitled to a free upgrade to V1.4.2 from all earlier versions.

Posted by Anthony Williams
[/ news /] permanent link
Tags: multithreading, concurrency, C++0x

gcc 4.5 Packages for Ubuntu Lucid

Thursday, 14 October 2010

Ubuntu Maverick was released earlier this week. Amongst other things, gcc 4.5 is available in the repositories, whereas for previous versions you had to build it yourself from source.

In order to save you the pain of compiling gcc 4.5 for yourself (which can take a while, and overheated my laptop when I tried), I've built it for Ubuntu Lucid, and uploaded the .deb files to my website. The .debs are built from the Maverick source packages for gcc 4.5.1, binutils 2.20.51, cloog-ppl and mpclib, and I've built them for both i386 and amd64 architectures.

Enjoy!

Posted by Anthony Williams
[/ news /] permanent link
Tags: gcc, lucid, ubuntu

Concept Checking Without C++0x Concepts

Wednesday, 06 October 2010

My latest article, Concept Checking Without Concepts in C++ was published on the Dr Dobb's website a couple of weeks ago.

One of the important features of the now-defunct C++0x Concepts proposal was the ability to overload functions based on whether or not their arguments met certain concepts. This article describes a way to allow that for concepts based on the presence of particular member functions.

The basic idea is that you can write traits classes that detect particular sets of member functions. Function overloads that require these concepts can then be enabled or disabled by using std::enable_if with these traits.

The example I use is checking for a Lockable type which has lock(), unlock() and try_lock() member functions, but the same technique could easily be used for other concepts that required other member functions.
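
A minimal sketch of the idea follows. It uses decltype-based detection, which may differ from the technique in the article itself, and the names `is_lockable` and `with_lock_if_possible` are assumptions made for this example.

```cpp
#include <cassert>
#include <mutex>
#include <type_traits>
#include <utility>

// Primary template: types are not lockable unless the specialization matches.
template <typename T, typename = void>
struct is_lockable : std::false_type {};

// Chosen only when lock(), unlock() and try_lock() are all callable on a T.
template <typename T>
struct is_lockable<T, decltype(std::declval<T&>().lock(),
                               std::declval<T&>().unlock(),
                               void(std::declval<T&>().try_lock()))>
    : std::true_type {};

// Overload enabled for lockable types: actually takes the lock.
template <typename T>
typename std::enable_if<is_lockable<T>::value, bool>::type
with_lock_if_possible(T& object) {
    std::lock_guard<T> guard(object);
    return true;
}

// Overload enabled for everything else: nothing to lock.
template <typename T>
typename std::enable_if<!is_lockable<T>::value, bool>::type
with_lock_if_possible(T&) {
    return false;
}
```

Exactly one of the two overloads is viable for any given type, so the call is never ambiguous.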

Read the article for the full details.

Posted by Anthony Williams
[/ news /] permanent link
Tags: concepts, cplusplus

August 2010 C++ Standards Committee Mailing

Wednesday, 08 September 2010

The August 2010 mailing for the C++ Standards Committee was published recently. This is the post-meeting mailing for the August 2010 committee meeting, and contains a new C++0x Working Draft. At the meeting in August, the committee discussed many of the National Body comments on the FCD, and this draft incorporates those changes that the committee approved of. As you can see from the FCD Comment Status document in this mailing, there were 301 technical comments and a further 215 editorial comments. Of these, 98 technical comments have been accepted as-is, 8 have been accepted with changes, and 63 have been rejected, leaving 132 technical comments that have still not been addressed one way or the other.

No significant changes have been accepted to the concurrency-related parts of the working draft, though there are quite a few editorial comments. However, there are several papers in this mailing that address the National Body comments in this area. These papers have by and large been drafted to represent the consensus of those members of the concurrency group in the LWG who were present at the meeting. I have summarised these papers below.

Concurrency-related papers

N3113: Async Launch Policies (CH 36)

This paper provides a clearer basis for implementors to supply additional launch policies for std::async, or for the committee to do so in a later revision of the C++ standard, by making the std::launch enum a bitmask type. It also drops the std::launch::any enumeration value, and renames std::launch::sync to std::launch::deferred, as this better describes what it means.

N3125: Omnibus Memory Model and Atomics Paper

This paper addresses several National Body comments by updating the wording in the draft standard to better reflect the intent of the committee.

N3128: C++ Timeout Specification

N3129: Managing C++ Associated Asynchronous State

N3130: Lockable requirements for C++0x

N3132: Mathematizing C++ Concurrency: The Post-Rapperswil Model

This paper provides a mathematical description for the C++0x memory model. A similar description was used to highlight some of the areas that are clarified by the omnibus memory model paper (N3125) described above.

N3136: Coherence Requirements Detailed

This paper introduces some simple coherence requirements to the memory model wording to make it clear that the sequence of values read for a given variable must be consistent across threads. The existence of a single modification order for each variable is a key component of the memory model, and the wording introduced in this paper makes it clear that this is a core requirement.

N3137: C and C++ Liaison: Compatibility for Atomics

The structure of the atomic types and operations in the FCD was carefully worked out in conjunction with the C standards committee to ensure that the C++0x atomic types were compatible with those being introduced in the upcoming C1x standard. Unfortunately, the C committee introduced a new incompatible syntax for atomic types into the C1x draft earlier this year because they believed it was a better match for the C language.

This paper attempts to address this new incompatibility by removing the atomic_xxx types that were originally added for C compatibility, leaving just the std::atomic<T> class template. Also, a new _Atomic(T) macro is introduced for compatibility with the new C1x _Atomic keyword.

Other papers

As already mentioned, this mailing contains a new C++0x Working Draft, along with the usual post-meeting stuff — editors notes for the changes in the new draft, new issues lists, minutes of the meeting, etc. It also contains a complete list of the National Body Comments on the FCD, and a few other papers addressing National Body comments.

If you have any opinions on the resolution of any NB comments not yet formally accepted or rejected, please add them to the comments for this post.

Posted by Anthony Williams
[/ cplusplus /] permanent link
Tags: C++0x, C++, standards, concurrency

Definitions of Non-blocking, Lock-free and Wait-free

Tuesday, 07 September 2010

There have repeatedly been posts on comp.programming.threads asking for a definition of these terms. To write good multithreaded code you really need to understand what these mean, and how they affect the behaviour and performance of algorithms with these properties. I thought it would therefore be useful to provide some definitions.

Definition of Blocking

A function is said to be blocking if it calls an operating system function that waits for an event to occur or a time period to elapse. Whilst a blocking call is waiting the operating system can often remove that thread from the scheduler, so it takes no CPU time until the event has occurred or the time has elapsed. Once the event has occurred then the thread is placed back in the scheduler and can run when allocated a time slice. A thread that is running a blocking call is said to be blocked.

Mutex lock functions such as std::mutex::lock(), and EnterCriticalSection() are blocking, as are wait functions such as std::future::wait() and std::condition_variable::wait(). However, blocking functions are not limited to synchronization facilities: the most common blocking functions are I/O facilities such as fread() or WriteFile(). Timing facilities such as Sleep(), or std::this_thread::sleep_until() are also often blocking if the delay period is long enough.

Definition of Non-blocking

Non-blocking functions are just those that aren't blocking. Non-blocking data structures are those on which all operations are non-blocking. All lock-free data structures are inherently non-blocking.

Spin-locks are an example of non-blocking synchronization: if one thread has a lock then waiting threads are not suspended, but must instead loop until the thread that holds the lock has released it. Spin locks and other algorithms with busy-wait loops are not lock-free, because if one thread (the one holding the lock) is suspended then no thread can make progress.
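
A minimal spin-lock sketch built on std::atomic_flag shows the busy-wait loop. The `spin_lock` class and the `count_with_spin_lock` helper are illustrative names only.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Non-blocking but not lock-free: waiters spin instead of being suspended,
// and nobody makes progress while the holder is descheduled.
class spin_lock {
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;

public:
    void lock() {
        // Loop until the flag was previously clear, i.e. we acquired it.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            // busy-wait
        }
    }
    void unlock() { flag_.clear(std::memory_order_release); }
};

// Increments a plain int from several threads, protected by the spin lock.
int count_with_spin_lock(int threads, int increments) {
    spin_lock lock;
    int counter = 0;
    std::vector<std::thread> workers;
    for (int i = 0; i < threads; ++i) {
        workers.emplace_back([&lock, &counter, increments] {
            for (int j = 0; j < increments; ++j) {
                std::lock_guard<spin_lock> guard(lock);
                ++counter;
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    return counter;
}
```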

Definition of lock-free

A lock-free data structure is one that doesn't use any mutex locks. The implication is that multiple threads can access the data structure concurrently without race conditions or data corruption, even though there are no locks — people would give you funny looks if you suggested that std::list was a lock-free data structure, even though it is unlikely that there are any locks used in the implementation.

Just because more than one thread can safely access a lock-free data structure concurrently doesn't mean that there are no restrictions on such accesses. For example, a lock-free queue might allow one thread to add values to the back whilst another removes them from the front, whereas multiple threads adding new values concurrently would potentially corrupt the data structure. The data structure description will identify which combinations of operations can safely be called concurrently.

For a data structure to qualify as lock-free, if any thread performing an operation on the data structure is suspended at any point during that operation then the other threads accessing the data structure must still be able to complete their tasks. This is the fundamental restriction which distinguishes it from non-blocking data structures that use spin-locks or other busy-wait mechanisms.

Just because a data structure is lock-free it doesn't mean that threads don't have to wait for each other. If an operation takes more than one step then a thread may be pre-empted by the OS part-way through an operation. When it resumes the state may have changed, and the thread may have to restart the operation.

In some cases, the partially-completed operation would prevent other threads performing their desired operations on the data structure until the operation is complete. In order for the algorithm to be lock-free, these threads must then either abort or complete the partially-completed operation of the suspended thread. When the suspended thread is woken by the scheduler it can then either retry or accept the completion of its operation as appropriate. In lock-free algorithms, a thread may find that it has to retry its operation an unbounded number of times when there is high contention.
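
The retry behaviour can be seen in a simple compare-and-swap loop. The sketch below is a push-only stack, chosen because it sidesteps the memory reclamation problems of a full lock-free stack; the class and function names are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

struct node {
    int value;
    node* next;
};

class push_only_stack {
    std::atomic<node*> head_{nullptr};

public:
    void push(int value) {
        node* fresh = new node{value, head_.load(std::memory_order_relaxed)};
        // If another thread changed head_ since we read it, the exchange fails,
        // fresh->next is refreshed with the current head, and we retry.
        // Under contention the number of retries has no fixed bound.
        while (!head_.compare_exchange_weak(fresh->next, fresh,
                                            std::memory_order_release,
                                            std::memory_order_relaxed)) {
        }
    }

    // Detaches and frees all nodes, returning how many there were.
    // Intended to be called once the pushing threads have finished.
    int drain() {
        node* current = head_.exchange(nullptr, std::memory_order_acquire);
        int count = 0;
        while (current) {
            node* next = current->next;
            delete current;
            current = next;
            ++count;
        }
        return count;
    }

    ~push_only_stack() { drain(); }
};

int concurrent_pushes(int threads, int per_thread) {
    push_only_stack stack;
    std::vector<std::thread> workers;
    for (int i = 0; i < threads; ++i) {
        workers.emplace_back([&stack, per_thread] {
            for (int j = 0; j < per_thread; ++j) {
                stack.push(j);
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    return stack.drain();
}
```

A suspended thread never holds anything that other threads need, so the others can always complete their pushes; the cost is that a thread which keeps losing the race keeps retrying.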

If you use a lock-free data structure where multiple threads modify the same pieces of data and thus cause each other to retry then high rates of access from multiple threads can seriously cripple the performance, as the threads hinder each other's progress. This is why wait-free data structures are so important: they don't suffer from the same set-backs.

Definition of wait-free

A wait-free data structure is a lock-free data structure with the additional property that every thread accessing the data structure can complete its operation within a bounded number of steps, regardless of the behaviour of other threads. Algorithms that can involve an unbounded number of retries due to clashes with other threads are thus not wait-free.
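
The contrast can be sketched with two small operations on std::atomic<int>. Whether fetch_add is genuinely wait-free depends on the hardware: this example assumes a platform with a native atomic add instruction, whereas on load-linked/store-conditional architectures it may itself be implemented as a retry loop. All function names are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <utility>
#include <vector>

// A single atomic read-modify-write: a bounded number of steps per call
// (assuming the hardware provides a native atomic add).
int take_ticket(std::atomic<int>& next) {
    return next.fetch_add(1, std::memory_order_relaxed);
}

// Lock-free but not wait-free: the loop retries whenever another thread
// updates the maximum first, with no fixed bound on the retries.
void record_max(std::atomic<int>& maximum, int candidate) {
    int seen = maximum.load(std::memory_order_relaxed);
    while (seen < candidate &&
           !maximum.compare_exchange_weak(seen, candidate,
                                          std::memory_order_relaxed)) {
    }
}

// Returns {tickets issued, highest ticket seen}.
std::pair<int, int> run_workers(int threads, int per_thread) {
    std::atomic<int> next(0);
    std::atomic<int> maximum(-1);
    std::vector<std::thread> workers;
    for (int i = 0; i < threads; ++i) {
        workers.emplace_back([&next, &maximum, per_thread] {
            for (int j = 0; j < per_thread; ++j) {
                record_max(maximum, take_ticket(next));
            }
        });
    }
    for (auto& worker : workers) {
        worker.join();
    }
    return std::make_pair(next.load(), maximum.load());
}
```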

This property means that high-priority threads accessing the data structure never have to wait for low-priority threads to complete their operations on the data structure, and every thread will always be able to make progress when it is scheduled to run by the OS. For real-time or semi-real-time systems this can be an essential property, as the indefinite wait-periods of blocking or non-wait-free lock-free data structures do not allow their use within time-limited operations.

The downside of wait-free data structures is that they are more complex than their non-wait-free counterparts. This imposes an overhead on each operation, potentially making the average time taken to perform an operation considerably longer than the same operation on an equivalent non-wait-free data structure.

Choices

When choosing a data structure for a given task you need to think about the costs and benefits of each of the options.

Lock-based data structures are probably the easiest to use, reason about and write, but have the potential for limited concurrency. They may also be the fastest in low-load scenarios.

A lock-free (but not wait-free) data structure has the potential to allow more concurrent accesses, but with the possibility of busy-waits under high loads. Lock-free data structures are considerably harder to write, and the additional concurrency can make reasoning about the program behaviour harder. They may be faster than lock-based data structures, but not necessarily.

Finally, a wait-free data structure has the maximum potential for true concurrent access, without the possibility of busy waits. However, these are very much harder to write than other lock-free data structures, and typically impose an additional performance cost on every access.

Posted by Anthony Williams
[/ threading /] permanent link
Tags: concurrency, threading, multithreading, lock-free, wait-free

just::thread C++0x Thread Library V1.4.1 Released

Monday, 09 August 2010

I am pleased to announce that version 1.4.1 of just::thread, our C++0x Thread Library, has just been released.

This is an improvement over V1.4.0 in a number of areas:

Both /Zc:wchar_t and /Zc:wchar_t- are supported with MSVC
std::chrono::high_resolution_clock typedef added
Added support for shared libraries on Linux
Faster mutex locking and unlocking on contended mutexes on Linux
Faster blocking/unblocking for condition variables on Linux
Support for tracking clock changes when waiting on a std::chrono::system_clock time with std::condition_variable on Linux with kernels >= 2.6.31
Support for floating-point durations
Faster time retrieval with std::chrono::monotonic_clock::now() on Windows
Added support for Microsoft Visual Studio 2005

Purchase your copy and get started with the C++0x thread library now.

As usual, existing customers are entitled to a free upgrade to V1.4.1 from all earlier versions.

Posted by Anthony Williams
[/ news /] permanent link
Tags: multithreading, concurrency, C++0x

Implementing Dekker's algorithm with Fences

Tuesday, 27 July 2010

Dekker's algorithm is one of the most basic algorithms for mutual exclusion, alongside Peterson's algorithm and Lamport's bakery algorithm. It has the nice property that it only requires load and store operations rather than exchange or test-and-set, but it still requires some level of ordering between the operations. On a weakly-ordered architecture such as PowerPC or SPARC, a correct implementation of Dekker's algorithm thus requires the use of fences or memory barriers in order to ensure correct operation.

The code

For those of you who just want the code: here it is — Dekker's algorithm in C++, with explicit fences.

#include <atomic>

std::atomic<bool> flag0(false),flag1(false);
std::atomic<int> turn(0);

void p0()
{
    flag0.store(true,std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_seq_cst);

    while (flag1.load(std::memory_order_relaxed))
    {
        if (turn.load(std::memory_order_relaxed) != 0)
        {
            flag0.store(false,std::memory_order_relaxed);
            while (turn.load(std::memory_order_relaxed) != 0)
            {
            }
            flag0.store(true,std::memory_order_relaxed);
            std::atomic_thread_fence(std::memory_order_seq_cst);
        }
    }
    std::atomic_thread_fence(std::memory_order_acquire);
 
    // critical section


    turn.store(1,std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_release);
    flag0.store(false,std::memory_order_relaxed);
}

void p1()
{
    flag1.store(true,std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_seq_cst);

    while (flag0.load(std::memory_order_relaxed))
    {
        if (turn.load(std::memory_order_relaxed) != 1)
        {
            flag1.store(false,std::memory_order_relaxed);
            while (turn.load(std::memory_order_relaxed) != 1)
            {
            }
            flag1.store(true,std::memory_order_relaxed);
            std::atomic_thread_fence(std::memory_order_seq_cst);
        }
    }
    std::atomic_thread_fence(std::memory_order_acquire);
 
    // critical section


    turn.store(0,std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_release);
    flag1.store(false,std::memory_order_relaxed);
}

The analysis

If you're like me then you'll be interested in why stuff works, rather than just taking the code. Here is my analysis of the required orderings, and how the fences guarantee those orderings.

Suppose thread 0 and thread 1 enter p0 and p1 respectively at the same time. They both set their respective flags to true, execute the fence and then read the other flag at the start of the while loop.

If both threads read false then both will enter the critical section, and the algorithm doesn't work. It is the job of the fences to ensure that this doesn't happen.

The fences are marked with memory_order_seq_cst, so either the fence in p0 is before the fence in p1 in the global ordering of memory_order_seq_cst operations, or vice-versa. Without loss of generality, we can assume that the fence in p0 comes before the fence in p1, since the code is symmetric. The store to flag0 is sequenced before the fence in p0, and the fence in p1 is sequenced before the read from flag0. Therefore the read from flag0 must see the value stored (true), so p1 will enter the while loop.

On the other side, there is no such guarantee for the read from flag1 in p0, so p0 may or may not enter the while loop. If p0 reads the value of false for flag1, it will not enter the while loop, and will instead enter the critical section, but that is OK since p1 has entered the while loop.

Though flag0 is not set to false until p0 has finished the critical section, we need to ensure that p1 does not see this until the values modified in the critical section are also visible to p1, and this is the purpose of the release fence prior to the store to flag0 and the acquire fence after the while loop. When p1 reads the value false from flag0 in order to exit the while loop, it must be reading the value stored by p0 after the release fence at the end of the critical section. The acquire fence after the load guarantees that all the values written before the release fence prior to the store are visible, which is exactly what we need here.
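
The release/acquire fence pairing described here can be seen in isolation in the following minimal sketch. It is not part of Dekker's algorithm; the names handoff_ready, handoff_payload, producer, consumer and run_fence_handoff are invented for this illustration.

```cpp
#include <atomic>
#include <thread>

std::atomic<bool> handoff_ready(false);
int handoff_payload=0; // plain, non-atomic data

void producer()
{
    handoff_payload=42;                                   // write the data
    std::atomic_thread_fence(std::memory_order_release);  // release fence before the flag store
    handoff_ready.store(true,std::memory_order_relaxed);
}

int consumer()
{
    while(!handoff_ready.load(std::memory_order_relaxed)) // spin until the flag is seen
    {
        std::this_thread::yield();
    }
    std::atomic_thread_fence(std::memory_order_acquire);  // acquire fence after the flag load
    return handoff_payload;                               // guaranteed to read 42
}

int run_fence_handoff()
{
    handoff_ready.store(false);
    handoff_payload=0;
    int seen=0;
    std::thread c([&seen]{ seen=consumer(); });
    std::thread p(producer);
    p.join();
    c.join();
    return seen;
}
```

Because the relaxed load reads the value written by the relaxed store, and the fences bracket those operations, the write to handoff_payload happens before the read, even though neither atomic operation carries any ordering of its own.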

If p0 reads true for flag1, it will enter the while loop rather than the critical section. Both threads are now looping, so we need a way to ensure that exactly one of them breaks out. This is the purpose of the turn variable. Initially, turn is 0, so p1 will enter the if and set flag1 to false, whilst p0 will not enter the if. Because p1 set flag1 to false, eventually p0 will read flag1 as false and exit the outer while loop to enter the critical section. On the other hand, p1 is now stuck in the inner while loop because turn is 0. When p0 exits the critical section it sets turn to 1. This will eventually be seen by p1, allowing it to exit the inner while loop. When the store to flag0 becomes visible p1 can then exit the outer while loop too.

If turn had been 1 initially (because p0 was the last thread to enter the critical section) then the inverse logic would apply, and p0 would enter the inner loop, allowing p1 to enter the critical section first.

Second time around

If p0 is called a second time whilst p1 is still in the inner loop then we have a similar situation to the start of the function — p1 may exit the inner loop and store true in flag1 whilst p0 stores true in flag0. We therefore need the second memory_order_seq_cst fence after the store to the flag in the inner loop. This guarantees that at least one of the threads will see the flag from the other thread set when it executes the check in the outer loop. Without this fence then both threads can read false, and both can enter the critical section.

Alternatives

You could put the ordering constraints on the loads and stores themselves rather than using fences. Indeed, the default memory ordering for atomic operations in C++ is memory_order_seq_cst, so the algorithm would "just work" with plain loads and stores to atomic variables. However, by using memory_order_relaxed on the loads and stores we can add fences to ensure we have exactly the ordering constraints required.
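
For comparison, here is a minimal sketch of the same algorithm relying on the default memory_order_seq_cst for every load and store, with no explicit fences. The sc_ prefixed names, the shared counter and the run_sc_dekker test harness are assumptions made for this example; the std::this_thread::yield() call in the inner loop is only there to be polite to the scheduler.

```cpp
#include <atomic>
#include <thread>

std::atomic<bool> sc_flag0(false),sc_flag1(false);
std::atomic<int> sc_turn(0);
long sc_counter=0; // ordinary variable, protected by the lock

// Dekker entry protocol for thread "me" (0 or 1). Every atomic
// operation uses the default memory_order_seq_cst.
void sc_lock(std::atomic<bool>& mine,std::atomic<bool>& other,int me)
{
    mine=true;
    while(other)
    {
        if(sc_turn!=me)
        {
            mine=false;
            while(sc_turn!=me)
            {
                std::this_thread::yield();
            }
            mine=true;
        }
    }
}

// Exit protocol: hand the turn to the other thread, then drop our flag.
void sc_unlock(std::atomic<bool>& mine,int me)
{
    sc_turn=1-me;
    mine=false;
}

// Two threads each increment the counter "iterations" times
// inside the critical section; returns the final count.
long run_sc_dekker(int iterations)
{
    sc_counter=0;
    std::thread t0([iterations]{
        for(int i=0;i<iterations;++i)
        {
            sc_lock(sc_flag0,sc_flag1,0);
            ++sc_counter;
            sc_unlock(sc_flag0,0);
        }
    });
    std::thread t1([iterations]{
        for(int i=0;i<iterations;++i)
        {
            sc_lock(sc_flag1,sc_flag0,1);
            ++sc_counter;
            sc_unlock(sc_flag1,1);
        }
    });
    t0.join();
    t1.join();
    return sc_counter;
}
```

This version is easier to read, but it applies sequentially-consistent ordering to every operation, including the spin-loop loads where relaxed ordering plus a fence would do.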

Posted by Anthony Williams
[/ threading /] permanent link
Tags: concurrency, synchronization, Dekker, fences, cplusplus

Reference Wrappers Explained

Wednesday, 14 July 2010

The upcoming C++0x standard includes reference wrappers in the form of the std::reference_wrapper<T> class template, and the helper function templates std::ref() and std::cref(). As I mentioned in my blog post on Starting Threads with Member Functions and Reference Arguments, these wrappers can be used to pass references to objects across interfaces that normally require copyable (or at least movable) objects — in that blog post, std::ref was used for passing references to objects over to the new thread, rather than copying the objects. I was recently asked what the difference was between std::ref and std::cref, and how they worked, so I thought I'd elaborate.

Deducing the Referenced Type

std::ref is a function template, so automatically deduces the type of the wrapped reference from the type of the supplied argument. This type deduction includes the const-ness of the supplied object:

int x=3;
const int y=4;
std::reference_wrapper<int> rx=std::ref(x);
// std::reference_wrapper<int> ry=std::ref(y); // error
std::reference_wrapper<const int> rcy=std::ref(y);

On the other hand, though std::cref also deduces the type of the wrapped reference from the supplied argument, it always wraps a const reference:

int x=3;
const int y=4;
// std::reference_wrapper<int> rx=std::cref(x); // error
std::reference_wrapper<const int> rcx=std::cref(x);
// std::reference_wrapper<int> ry=std::cref(y); // error
std::reference_wrapper<const int> rcy=std::cref(y);
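
The deduction rules shown in the two snippets above can be checked at compile time. The following is a small sketch using decltype and std::is_same; the function name deduction_demo is made up for this example.

```cpp
#include <functional>
#include <type_traits>

int deduction_demo()
{
    int x=3;
    const int y=4;

    // std::ref preserves the const-ness of its argument
    static_assert(std::is_same<decltype(std::ref(x)),
                  std::reference_wrapper<int> >::value,"ref(x) wraps int");
    static_assert(std::is_same<decltype(std::ref(y)),
                  std::reference_wrapper<const int> >::value,"ref(y) wraps const int");

    // std::cref always adds const
    static_assert(std::is_same<decltype(std::cref(x)),
                  std::reference_wrapper<const int> >::value,"cref(x) wraps const int");
    static_assert(std::is_same<decltype(std::cref(y)),
                  std::reference_wrapper<const int> >::value,"cref(y) wraps const int");

    std::ref(x).get()=10;          // writes through to x
    return x+std::cref(y).get();   // reads both objects through wrappers
}
```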

Since a non-const object can always be bound to a const reference, you can use std::ref in pretty much every case where you would use std::cref, and your code would work the same. Which raises the question: why would you ever choose to use std::cref?

Using std::cref to prevent modification

The primary reason for choosing std::cref is because you want to guarantee that the source object is not modified through that reference. This can be important when writing multithreaded code — if a thread should not be modifying some data then it can be worth enforcing this by passing a const reference rather than a mutable reference.

void foo(int&); // mutable reference

int x=42; // Should not be modified by thread
std::thread t(foo,std::cref(x)); // will fail to compile

This can be important where there are overloads of a function such that one takes a const reference, and the other a non-const reference: if we don't want the object modified then it is important that the overload taking a const reference is chosen.

struct Foo
{
    void operator()(int&) const;
    void operator()(int const&) const;
};

int x=42;
std::thread t(Foo(),std::cref(x)); // force const int& overload
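
To see the overload selection in action, here is a minimal sketch with a function object that records which overload ran. WhichOverload and overload_via_thread are hypothetical names invented for this example.

```cpp
#include <functional>
#include <thread>

struct WhichOverload
{
    int* result; // where to record which overload was called

    void operator()(int&) const       { *result=1; } // mutable reference
    void operator()(int const&) const { *result=2; } // const reference
};

// Runs the function object on a new thread, passing x via std::cref
// or std::ref, and reports which overload was chosen.
int overload_via_thread(bool use_cref)
{
    int x=42;
    int chosen=0;
    WhichOverload f={&chosen};
    if(use_cref)
    {
        std::thread t(f,std::cref(x));
        t.join();
    }
    else
    {
        std::thread t(f,std::ref(x));
        t.join();
    }
    return chosen;
}
```

With std::cref only the int const& overload is viable, whereas with std::ref both are viable and the int& overload is the better match.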

References to temporaries

std::cref has another property missing from std::ref — it can bind to temporaries, since temporary objects can bind to const references. I'm not sure this is a good thing, especially when dealing with multiple threads, as the referenced temporary is likely to have been destroyed before the thread has even started. This is therefore something to watch out for:

void bar(int const&);

std::thread t(bar,std::cref(42)); // oops, ref to temporary

Documentation

Finally, std::cref serves a documentation purpose, even where std::ref would suffice — it declares in big bold letters that this reference cannot be used to modify the referenced object, which thus makes it easier to reason about the code.

Recommendation

I would recommend that you use std::cref in preference to std::ref whenever you can — the benefits as documentation of intent, and avoiding accidental modification through the reference make it a clear winner in my opinion. Of course, if you do want to modify the referenced object, then you need to use std::ref, but such usage now stands out, and makes it clear that this is the intent.

You do still need to be careful to ensure that you don't try and wrap references to temporaries, particularly when applying std::cref to the result of a function call, but such uses should stand out — I expect most uses to be wrapping a reference to a named variable rather than wrapping a function result.

Posted by Anthony Williams
[/ cplusplus /] permanent link
Tags: reference wrappers, ref, cref, cplusplus

Last day for comments on the C++0x FCD

Thursday, 17 June 2010

The BSI deadline for comments on the C++0x FCD is tomorrow, Friday 18th June 2010. The ISO deadline is 26th July 2010, but we have to write up comments for submission in the form required for ISO, which takes time.

If you have a comment on the FCD, please see my earlier blog post for how to submit it to BSI. Help us make the C++0x standard as good as it can be.

Posted by Anthony Williams
[/ cplusplus /] permanent link
Tags: C++, C++0x, WG21, FCD

Enforcing Correct Mutex Usage with Synchronized Values

Friday, 28 May 2010

My latest article, Enforcing Correct Mutex Usage with Synchronized Values, has been published on the Dr Dobb's website.

This article expands on the SynchronizedValue<T> template I mentioned in my presentation on Concurrency in the Real World at ACCU 2010, and deals with the problem of ensuring that the mutex associated with some data is locked whenever the data is accessed.

The basic idea is that you use SynchronizedValue<T> wherever you have an object of type T that you wish to be protected with its own mutex. The SynchronizedValue<T> then behaves like a pointer-to-T for simple uses.
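
To give a flavour of the pointer-like interface, here is a deliberately simplified sketch. It is not the implementation from the article: the class name SynchronizedValueSketch, the nested Proxy class and the demo function are assumptions made purely for illustration.

```cpp
#include <cstddef>
#include <mutex>
#include <string>

template<typename T>
class SynchronizedValueSketch
{
    T data;
    std::mutex m;
public:
    // Holds the mutex for as long as the proxy object exists
    class Proxy
    {
        std::unique_lock<std::mutex> lock;
        T* ptr;
    public:
        Proxy(std::mutex& m_,T& data_):
            lock(m_),ptr(&data_)
        {}
        T* operator->() { return ptr; }
    };

    SynchronizedValueSketch():
        data()
    {}
    explicit SynchronizedValueSketch(T const& initial):
        data(initial)
    {}

    // s->foo() locks the mutex, calls foo() on the wrapped object,
    // and unlocks when the temporary Proxy is destroyed at the end
    // of the full expression.
    Proxy operator->() { return Proxy(m,data); }
};

std::size_t synchronized_value_demo()
{
    SynchronizedValueSketch<std::string> s(std::string("hello"));
    s->append(" world"); // mutex held just for this call
    return s->size();    // and again just for this one
}
```

The key design point is that the caller never sees the mutex, so it is not possible to forget to lock it when accessing the data through operator->.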

Read the article for the full details.

Posted by Anthony Williams
[/ news /] permanent link
Tags: mutex, cplusplus

just::thread C++0x Thread Library V1.4 (FCD Edition) Released

Thursday, 06 May 2010

I am pleased to announce that version 1.4 (the FCD edition) of just::thread, our C++0x Thread Library, has just been released.

With the release of the "FCD edition", just::thread provides the first complete implementation of the multithreading facilities from the Final Committee Draft (FCD) of the C++0x standard.

Changes include:

New promise::set_value_at_thread_exit, promise::set_exception_at_thread_exit, and packaged_task::make_ready_at_thread_exit member functions to defer unblocking waiting threads until the notifying thread exits
New notify_all_at_thread_exit function for notifying condition variables when the notifying thread exits
The wait_for and wait_until member functions of future, shared_future and atomic_future return a future_status enum rather than bool to indicate whether the future is ready, the wait timed out, or the future contains a deferred async function
The destructor of the last future associated with an async function waits for that function to complete.
New ATOMIC_VAR_INIT macro for initializing atomic objects
The callable object for a packaged_task is destroyed with the packaged_task rather than being kept alive until the future is destroyed

Purchase your copy and get started with the C++0x thread library NOW.

As usual, existing customers are entitled to a free upgrade to V1.4.0 from all earlier versions.

Posted by Anthony Williams
[/ news /] permanent link
Tags: multithreading, concurrency, C++0x

"Concurrency in the Real World" slides now available

Monday, 19 April 2010

The slides for my presentation on "Concurrency in the Real World" at the ACCU 2010 conference last week are now available.

The room was full, and quite warm due to the air conditioning having been turned off, but everything went to plan, and there were some insightful questions from the audience. I've thoroughly enjoyed presenting at ACCU in previous years, and this was no exception.

I covered the main pitfalls people encounter when writing multithreaded code, along with some techniques that I've found help deal with those problems, including some example code from projects I've worked on. As you might expect, all my examples were in C++, though the basic ideas are cross-language. I finished up by talking about what we might hope to get out of multithreaded code, such as performance, additional features and responsiveness.

There's a discount on my just::thread library until Friday 23rd April 2010, so if you're doing concurrency in C++ with Microsoft Visual Studio on Windows or g++ on Linux, get yourself a copy whilst it's on offer and start taking advantage of the new C++0x thread library.

Posted by Anthony Williams
[/ news /] permanent link
Tags: concurrency, multithreading, c++, ACCU

March 2010 C++ Standards Committee Mailing

Thursday, 08 April 2010

The March 2010 mailing for the C++ Standards Committee was published last week. This is the post-meeting mailing for the March 2010 committee meeting, and contains the C++0x Final Committee Draft, which I blogged about last week.

There are 6 concurrency-related papers (of which my name is on two), which I summarize below:

Concurrency-related papers

N3057: Explicit Initializers for Atomics

This paper proposes new initializers for atomic variables, providing a means of writing code which can be compiled as either C or C++. e.g.

void foo()
{
    atomic_int a=ATOMIC_VAR_INIT(42); // initialize a with 42
    atomic_uint b;                    // uninitialized
    atomic_init(&b,123);              // b now initialized to 123
}

N3058: Futures and Async Cleanup (Rev.)

This is a revision of N3041 to resolve many of the outstanding issues with futures and async. Mostly it's just wordsmithing to tidy up the specification, but there are a few key changes:

Defined behaviour for the wait_for() and wait_until() member functions of std::future, std::shared_future and std::atomic_future when used with std::async and a launch policy of std::launch::sync. The return value is now a value of the new std::future_status enumeration, and can be std::future_status::ready if the future becomes ready before the timeout, std::future_status::timeout if the wait times out, or std::future_status::deferred if the future comes from a call to std::async with a launch policy of std::launch::sync and the function associated with the future hasn't yet started execution on any thread.
The wording for std::async adopts the same wording as std::thread to clarify the copy/move and perfect forwarding semantics of the call.
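
The three future_status values can be demonstrated with a short sketch. Note one assumption: the launch policy called std::launch::sync in this draft was renamed std::launch::deferred in the published C++11 standard, so the code below uses the later name in order to compile with current libraries. The function names are invented for this example.

```cpp
#include <chrono>
#include <future>

// A deferred function that has not been started reports "deferred"
std::future_status status_of_deferred()
{
    std::future<int> f=std::async(std::launch::deferred,[]{ return 42; });
    return f.wait_for(std::chrono::milliseconds(0));
}

// A future whose value has already been set reports "ready"
std::future_status status_of_ready()
{
    std::promise<int> p;
    std::future<int> f=p.get_future();
    p.set_value(42);
    return f.wait_for(std::chrono::milliseconds(0));
}

// A future whose value is not set within the timeout reports "timeout"
std::future_status status_of_pending()
{
    std::promise<int> p;
    std::future<int> f=p.get_future();
    return f.wait_for(std::chrono::milliseconds(1));
}
```

With the old bool return value there was no way to distinguish the deferred case from a plain timeout, so code polling such a future could wait forever.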

N3069: Various threads issues in the library (LWG 1151)

This is a revision of N3040, and highlights which operations through iterators constitute accesses and data races, and explicitly allows for synchronization by writing and reading to/from a stream.

N3070: Handling Detached Threads and thread_local Variables

This is a hugely simplified replacement for my previous paper N3038. Rather than creating contexts for thread_local variables, this paper proposes new member functions for std::promise and std::packaged_task to allow the value to be set at the point of call, but threads waiting on associated futures to be woken only after thread_local variables have been destroyed at thread exit. This means that you can now safely wait on a future which is set in such a fashion when waiting for a task running on a background thread to complete, without having to join with the thread or worry about races arising from the destructors of thread_local variables. The paper also adds a similar mechanism for condition variables as a non-member function.
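
As a minimal sketch of how this looks in use, assuming a standard library that implements std::promise::set_value_at_thread_exit as adopted into the standard (the function name value_at_thread_exit is made up for this example):

```cpp
#include <future>
#include <thread>

int value_at_thread_exit()
{
    std::promise<int> p;
    std::future<int> f=p.get_future();

    std::thread t([&p]{
        // The value is stored now, but the future only becomes ready
        // once this thread has exited and its thread_local variables
        // have been destroyed.
        p.set_value_at_thread_exit(42);
    });
    t.detach(); // no join needed

    // Blocks until the detached thread has finished its cleanup
    return f.get();
}
```

The promise is captured by reference here, which is safe only because the thread does not touch it after the call to set_value_at_thread_exit, and the caller keeps it alive until f.get() has returned.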

N3071: Renaming launch::any and what asyncs really might be (Rev.)

This is a revision of N3042 proposing renaming std::launch::any to std::launch::sync_or_async. This paper was not approved.

N3074: Updates to C++ Memory Model Based on Formalization

This is a revision of N3045. This paper proposes some changes to the wording of the memory model in order to ensure that it means what we intended it to mean.

Other Papers

There are several non-concurrency papers in the mailing as well as the standard set (working draft, agenda, issues lists, etc.). The most significant of these in my view are the following 3 papers. Check the mailing for the full set.

N3050: Allowing Move Constructors to Throw (Rev. 1)

This paper adds the new noexcept keyword to C++. This is used in place of an exception specification. On its own it means that the function does not throw any exceptions, but it can also be used with a boolean constant expression where true means that the function doesn't throw, and false means that it might. e.g.

void foo() noexcept;        // will not throw
void bar() noexcept(true);  // will not throw
void baz() noexcept(false); // may throw

If a noexcept exception specification is violated then std::terminate() is called.

The primary benefit from the boolean-constant-expression version is in templates, where the boolean expression can check various properties of the template parameter types. One of the things you can check is whether or not particular operations throw, e.g. by using the new has_nothrow_move_constructor type trait to declare the move constructor for a class to be noexcept if its class members have non-throwing move constructors:

template<typename T>
class X
{
    T data;
public:
    X(X&& other)
        noexcept(std::has_nothrow_move_constructor<T>::value):
        data(std::move(other.data))
    {}
};
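
For anyone wanting to try this with a current compiler, here is a sketch of the same idea. It assumes the trait name from the published C++11 standard, std::is_nothrow_move_constructible, rather than the has_nothrow_move_constructor name used in this draft. The Holder and ThrowingMove classes and the holder_moves_nothrow helper are invented for this example.

```cpp
#include <type_traits>
#include <utility>

// A type whose move constructor is explicitly allowed to throw
struct ThrowingMove
{
    ThrowingMove() {}
    ThrowingMove(ThrowingMove&&) noexcept(false) {}
};

template<typename T>
class Holder
{
    T data;
public:
    Holder():
        data()
    {}
    // noexcept only if moving a T cannot throw
    Holder(Holder&& other)
        noexcept(std::is_nothrow_move_constructible<T>::value):
        data(std::move(other.data))
    {}
};

// Reports whether Holder<T> ended up with a non-throwing move constructor
template<typename T>
bool holder_moves_nothrow()
{
    return std::is_nothrow_move_constructible<Holder<T> >::value;
}
```

Containers can use exactly this kind of query to decide whether it is safe to move elements during reallocation or whether they must fall back to copying.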

N3053: Defining Move Special Member Functions

This proposal ensures that user-defined classes have move constructors and move assignment operators generated for them by the compiler if that is safe. Explicitly declaring a copy or move constructor will prevent the implicit declaration of the other, and likewise for copy and move assignment. You can always request the default definition using the = default syntax.

This means that lots of user code will now be able to readily take advantage of move semantics with a simple code change or even just a recompile. This can potentially be of major performance benefit.
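
The effect of declaring a copy constructor, and of requesting the default move constructor with = default, can be observed with a small probe type. This is only a sketch: Probe, CopyOnly, CopyAndMove and the two helper functions are hypothetical names for this example.

```cpp
#include <utility>

// Records whether it was constructed by moving from another Probe
struct Probe
{
    bool was_moved;
    Probe(): was_moved(false) {}
    Probe(Probe const&): was_moved(false) {}
    Probe(Probe&&): was_moved(true) {}
};

// A user-declared copy constructor suppresses the implicit move
// constructor, so "moving" a CopyOnly actually copies.
struct CopyOnly
{
    Probe p;
    CopyOnly() {}
    CopyOnly(CopyOnly const& other): p(other.p) {}
};

// Explicitly requesting the defaults gives both operations.
struct CopyAndMove
{
    Probe p;
    CopyAndMove() = default;
    CopyAndMove(CopyAndMove const&) = default;
    CopyAndMove(CopyAndMove&&) = default;
};

bool copy_only_really_moves()
{
    CopyOnly a;
    CopyOnly b(std::move(a)); // falls back to the copy constructor
    return b.p.was_moved;
}

bool defaulted_really_moves()
{
    CopyAndMove a;
    CopyAndMove b(std::move(a)); // uses the defaulted move constructor
    return b.p.was_moved;
}
```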

N3055: A Taxonomy of Expression Value Categories

This paper nails down the true distinctions between lvalues, rvalues and rvalue references. It provides a new set of names to identify the distinct categories of values in C++: lvalues and rvalues we already have, but now there are xvalues, prvalues and glvalues too. This categorization allows for better specification of when things can bind to lvalue references or rvalue references, and when the compiler can eliminate copies or moves.
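
One way to explore the categories is with decltype, which yields T& for an lvalue expression, T&& for an xvalue and plain T for a prvalue when its operand is parenthesized. The sketch below relies on that rule; the Category enumeration, the classify template and the CATEGORY_OF macro are assumptions made for this example.

```cpp
#include <utility>

enum class Category { prvalue, lvalue, xvalue };

// decltype((expr)) is T for a prvalue, T& for an lvalue, T&& for an xvalue
template<typename T>
struct classify      { static constexpr Category value=Category::prvalue; };
template<typename T>
struct classify<T&>  { static constexpr Category value=Category::lvalue; };
template<typename T>
struct classify<T&&> { static constexpr Category value=Category::xvalue; };

#define CATEGORY_OF(expr) (classify<decltype((expr))>::value)

Category category_of_named_variable()
{
    int i=0;
    return CATEGORY_OF(i);            // a named variable is an lvalue
}

Category category_of_moved_variable()
{
    int i=0;
    return CATEGORY_OF(std::move(i)); // std::move yields an xvalue
}

Category category_of_arithmetic()
{
    int i=0;
    return CATEGORY_OF(i+1);          // a computed temporary is a prvalue
}
```

In these terms a glvalue is anything classified as lvalue or xvalue, and an rvalue is anything classified as xvalue or prvalue.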

Please comment on the FCD

The purpose of the C++0x Final Committee Draft is to get comments prior to publication to ensure the final C++0x standard is as defect free as possible. This opportunity is only available for a limited time, so please comment on the FCD.

Posted by Anthony Williams
[/ cplusplus /] permanent link
Tags: C++0x, C++, standards, concurrency

Sign up for a 50% discount on just::thread FCD edition

Wednesday, 07 April 2010

I'm in the process of updating our C++0x thread library for VS2008, VC10, g++ 4.3 and g++ 4.4 to incorporate the changes to the C++0x thread library voted into the C++0x FCD. I'll be writing a blog post with more details in due course, but the big changes are:

Functions for postponing notification of threads waiting on a std::future until the thread that set the value on the std::promise or ran the std::packaged_task has exited.
A similar facility for notifying a std::condition_variable at thread exit.
Defined behaviour for the wait_for() and wait_until() member functions of std::future when used with std::async and a launch policy of std::launch::sync.
Changes to the initialization of atomic variables.

Existing customers will get the new version as a free upgrade, but the rest of you can get a 50% discount if you subscribe to my blog by email. Just fill in your name and email address in the form below and be sure to click the confirmation link. You'll then receive future blog posts by email, along with an announcement and exclusive discount for the FCD edition of just::thread when it's released.

If you're reading this via RSS and your reader doesn't show you the form or doesn't allow you to submit your details, then please go to the web version of this blog entry.

If you've already subscribed by email then you don't need to subscribe again, you'll automatically receive the discount code.

Posted by Anthony Williams
[/ news /] permanent link
Tags: concurrency, threading, C++0x, just::thread

C++0x Final Committee Draft Published - Please Comment

Friday, 02 April 2010

Earlier this week, the Final Committee Draft (FCD) of the C++0x standard was published. This means that C++0x is now in the final stages of bug fixing and wordsmithing before publication. If all goes to plan, the draft will move to Final Draft International Standard (FDIS) early in 2011, and will be a new standard by the end of 2011.

The publication of the FCD means that the draft standard has now been officially put up for review by the national standards bodies of ISO's member countries. The British Standards Institution is one of several national bodies that is actively involved in the standardisation of the C++ language. The panel members of the C++ Committee of the BSI, IST 5/-/21, are currently compiling a list of comments on the FCD. We intend to submit these as the BSI's National Body comments, aimed at getting issues with the FCD addressed before it becomes the new international standard for C++.

We're welcoming additional comments, and would like to provide a channel for anyone who may be interested in the C++0x Standard, but not able to be fully involved in the standards process, to submit comments. Note that not all comments — regardless of whether they are submitted by panel members or non-members — will go forward.

Here is some guidance on what we are looking for:

Suggestions for how to improve the clarity of the wording, even if that's just by adding a cross-reference to a relevant paragraph elsewhere;
Comments that identify any under/over specification; and
Comments highlighting inconsistencies or contradictions in the draft text.

Comments should be specific and preferably should include suggested updated wording (and if you need help formulating updated wording we can provide it, within reason) — the C++ standards committee is working to a very tight schedule in order to get C++0x out as soon as possible, and comments without wording (which therefore require more work from the committee) are more likely to be rejected.

The time for adding/removing features has now passed, so comments should focus on improving the draft as it stands rather than suggesting new features.

Owing to the time scale for submission to BSI and ISO, comments need to be submitted by Friday 18th June 2010.

If you have any comments, feel free to post them in the comment section of this blog entry, or email them to me. I will forward all appropriate suggestions to the rest of the BSI panel (whether or not I agree with them).

Posted by Anthony Williams
[/ cplusplus /] permanent link
Tags: C++, C++0x, WG21, FCD

just::thread C++0x Thread Library V1.3.2 Released

Thursday, 25 March 2010

I am pleased to announce that version 1.3.2 of just::thread, our C++0x Thread Library, has just been released.

This release is the first to feature support for the Microsoft Visual Studio 2010 RC for both 32-bit and 64-bit Windows.

There are also a few minor fixes to the future classes, and a new implementation of mutexes and condition variables on Linux with lower overhead.

Purchase your copy and get started with the C++0x thread library NOW.

As usual, existing customers are entitled to a free upgrade to V1.3.2 from all earlier versions.

Posted by Anthony Williams
[/ news /] permanent link
Tags: multithreading, concurrency, C++0x

February 2010 C++ Standards Committee Mailing

Tuesday, 23 February 2010

The February 2010 mailing for the C++ Standards Committee was published last week. This is the pre-meeting mailing for the March 2010 committee meeting and contains a new working draft.

There are 5 concurrency-related papers (of which my name is on one), which I summarize below:

Concurrency-related papers

N3038: Managing the lifetime of thread_local variables with contexts (Revision 2)

This is my paper on creating contexts for thread_local variables. The use of such contexts allows you to control when variables that are declared as thread_local are destroyed. It is a revision of my previous paper N2959; the primary change is that contexts can now be nested, which allows library code to use them without having to know whether or not a context is currently active.

N3040: Various threads issues in the library (LWG 1151)

This paper by Hans Boehm seeks to address LWG issue 1151. The key issue is to ensure that it is clear which operations may constitute a data race if they run concurrently without synchronization.

N3041: Futures and Async Cleanup

The adoption of multiple papers affecting futures and std::async at the same C++ committee meeting meant that the wording ended up being unclear. Detlef Vollmann kindly volunteered to write a paper to resolve these issues, and this is it.

Unfortunately, I think that some of the wording is still unclear. I also dislike Detlef's proposal to force the wait_for and wait_until member functions of the future types to throw exceptions if the future was created from a call to std::async with a launch policy of std::launch::sync. My preferred alternative is to change the return type from bool to an enumeration with distinct values for if the future is ready, if the wait timed out, or if the future holds a deferred function from std::launch::sync that has not yet started. This would be similar to the current behaviour of std::condition_variable::wait_for and std::condition_variable::wait_until, which return a std::cv_status enumeration value.

N3042: Renaming launch::any and what asyncs really might be

This is another paper from Detlef Vollmann proposing renaming std::launch::any to std::launch::any_sync. His rationale is that future revisions of the C++ standard may wish to add values to the std::launch enumeration for additional types of async calls that should not be covered by std::launch::any. Personally, I think this is a non-issue, and should be covered as and when such values are added.

N3045: Updates to C++ Memory Model Based on Formalization

Following attempts to create a mathematical formalization of the memory model it became clear that some cases were unclear or ambiguous or did not guarantee the desired semantics. This paper proposes some changes to the wording of the memory model in order to ensure that it means what we intended it to mean.

Other Papers

There are several non-concurrency papers in the mailing as well as the standard set (working draft, agenda, issues lists, etc.). The most significant of these in my view is N3044 which proposes compiler-defined move constructors and assignment operators. Check the mailing for the full set.

Posted by Anthony Williams
[/ cplusplus /] permanent link
Tags: C++0x, C++, standards, concurrency

The difference between struct and class in C++

Sunday, 21 February 2010

I've seen a lot of people asking about the differences between the use of the struct and class keywords in C++ lately. I don't know whether there's an influx of C++ programmers due to the upcoming C++0x standard, or whether I've just noticed people asking questions that haven't caught my eye before. Whatever the reason, I'm writing this blog entry as something I can point to the next time someone asks the question.

Declaring and defining user-defined types

The primary use of both the struct and class keywords is to define a user-defined type. In C++, such a user-defined type is termed a "class" regardless of which keyword is used in the definition. The choice of keyword is in one sense arbitrary, since the same features and facilities are available whichever keyword is used — there is only one semantic difference which we shall look at shortly. The following two class definitions are thus equivalent in all respects apart from the names of the classes:

struct type_a
{
private:
    int data;
public:
    type_a(int data_):
        data(data_)
    {}
    virtual void foo()=0;
    virtual ~type_a()
    {}
};

class type_b
{
private:
    int data;
public:
    type_b(int data_):
        data(data_)
    {}
    virtual void foo()=0;
    virtual ~type_b()
    {}
};

As this little example shows, you can have constructors, destructors, member functions, private members and even virtual member functions in a class declared with the struct keyword, just as you can with a class declared using the class keyword. Though this example doesn't show it, you can also use the struct keyword to declare classes with base classes.

You can even forward-declare your class using one keyword and then define it with the other, though compilers have been known to complain about this usage:

struct foo;
class foo {};

class bar;
struct bar {};

So, what of the minor semantic difference then? The difference is in the default access specifier for members and base classes. Though classes defined using either keyword can have public, private and protected base classes and members, the default choice for classes defined using class is private, whilst for those defined using struct the default is public. This is primarily for backwards compatibility with C: the members of a C structure can be freely accessed by all code, so in order to allow existing C code to compile unchanged as C++ the default access specifier for members of a class declared with struct must be public. On the other hand, private data is a key part of the encapsulation that underpins object-oriented design, so private is the default for those classes declared with class.

C doesn't have inheritance, but the default access specifier for base classes varies with the keyword used to declare the derived class too. It is public for classes declared with struct and private for those declared with class just the same as for data members. You can still override it with an explicit specifier in both cases.

Let's take a quick look at some examples to see how that works:

struct s1
{
    int a; // public
private:
    int b; // private
protected:
    int c; // protected
public:
    int d; // public again
};

class c1
{
    int a; // private
private:
    int b; // still private
protected:
    int c; // protected
public:
    int d; // public
};

struct s2:
    s1, // public
    private c1, // private
    type_b, // public again
    protected foo, // protected
    public bar // public again
{};

class c2:
    s1, // private
    private c1, // still private
    type_b, // private again
    protected foo, // protected
    public bar // public
{};

As far as declaring and defining user-defined types in C++ is concerned, that is the only difference; in all other respects, classes declared with struct are identical to those declared with class.
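
If you want to convince yourself of the default for base classes, the following sketch checks it at compile time. It assumes a compiler with the C++0x <type_traits> header, and relies on the fact that a pointer to a derived class only converts implicitly to a pointer to its base when that base is accessible. The class names are made up for the example.

```cpp
#include <type_traits>

struct base {};

struct derived_with_struct: base {};  // base is public by default
class derived_with_class: base {};    // base is private by default

// The conversion is allowed when the base class is public...
static_assert(std::is_convertible<derived_with_struct*,base*>::value,
              "struct: base class is public by default");

// ...and rejected when the base class is private.
static_assert(!std::is_convertible<derived_with_class*,base*>::value,
              "class: base class is private by default");
```

Swapping the struct and class keywords on the two derived classes makes both assertions fail, which shows that the keyword really is the only thing that matters here.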

C Compatibility

We touched on this a bit earlier: classes declared with the struct keyword can be compiled as C if they don't use any features that are C++ specific. Thus the following is both a valid C++ class and a valid C structure:

struct c_compatible
{
    int i;
    char c;
    double d;
};

It is therefore common to see struct used in header files that are shared between C and C++. Since non-virtual member functions don't affect the class layout you can even have member functions in such a type, provided they are hidden from the C compiler with a suitable #ifdef:

struct baz
{
    int i;

#ifdef __cplusplus
    void foo();
#endif
};
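
The claim about layout can be checked with a small sketch like the one below. It assumes a compiler that provides the C++0x std::is_standard_layout trait, and the three types are hypothetical ones made up for the comparison. A non-virtual member function leaves the class as a standard-layout type, whereas a virtual function does not.

```cpp
#include <type_traits>

struct plain_data
{
    int i;
};

struct data_with_function
{
    int i;
    int get() const   // non-virtual: nothing is added to the object
    {
        return i;
    }
};

struct data_with_virtual
{
    int i;
    virtual int get() const   // virtual: the class is no longer standard-layout
    {
        return i;
    }
    virtual ~data_with_virtual()
    {}
};

static_assert(std::is_standard_layout<data_with_function>::value,
              "non-virtual member functions keep the C-compatible layout");
static_assert(!std::is_standard_layout<data_with_virtual>::value,
              "virtual member functions change the layout");
```

On the mainstream compilers I would expect sizeof(plain_data) and sizeof(data_with_function) to be equal too, since the member function takes up no space in the object.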

Templates

There is one place where you can use the class keyword but not the struct one, and that is in the declaration of a template. Template type parameters must be declared using either the class or typename keyword; struct is not allowed. The choice of class or typename in this case is again arbitrary; the semantics are identical. The choice of keyword does not impose any semantic meaning: any type (whether a built-in type like int or a user-defined type like a class or enumeration) can be used when instantiating the template in either case. You can of course declare a class template with the struct keyword, in which case the default access for the members of the template is public.

template<class T> // OK
void f1(T t);

template<typename T> // OK
void f2(T t);

template<struct T> // ERROR, struct not allowed here
void f3(T t);

template<class T>
struct S
{
    T x; // public member
};

That's all folks!

These are the only concrete distinctions between the uses of the struct keyword and the class keyword in C++. People also use them for documentation purposes, reserving struct for C-compatible classes, or classes with no member functions, or classes with no private data, or whatever their coding standard says. However, this is just documentation and convention rather than an inherent difference: you could use struct for all your classes, or class for all your classes except those that are shared with C.

Posted by Anthony Williams
[/ cplusplus /] permanent link

Multithreading in C++0x part 8: Futures, Promises and Asynchronous Function Calls

Thursday, 11 February 2010

This is the eighth in a series of blog posts introducing the new C++0x thread library. See the end of this article for a full set of links to the rest of the series.

In this installment we'll take a look at the "futures" mechanism from C++0x. Futures are a high-level mechanism for passing a value between threads, and allow a thread to wait for a result to be available without having to manage the locks directly.

Futures and asynchronous function calls

The most basic use of a future is to hold the result of a call to the new std::async function for running some code asynchronously:

#include <future>
#include <iostream>

int calculate_the_answer_to_LtUaE();
void do_stuff();

int main()
{
    std::future<int> the_answer=std::async(calculate_the_answer_to_LtUaE);
    do_stuff();
    std::cout<<"The answer to life, the universe and everything is "
             <<the_answer.get()<<std::endl;
}

The call to std::async takes care of creating a thread, and invoking calculate_the_answer_to_LtUaE on that thread. The main thread can then get on with calling do_stuff() whilst the immensely time consuming process of calculating the ultimate answer is done in the background. Finally, the call to the get() member function of the std::future<int> object then waits for the function to complete and ensures that the necessary synchronization is applied to transfer the value over so the main thread can print "42".

Sometimes asynchronous functions aren't really asynchronous

Though I said that std::async takes care of creating a thread, that's not necessarily true. As well as the function being called, std::async takes a launch policy which specifies whether to start a new thread or create a "deferred function" which is only run when you wait for it. The default launch policy for std::async is std::launch::any, which means that the implementation gets to choose for you. If you really want to ensure that your function is run on its own thread then you need to specify the std::launch::async policy:

  std::future<int> the_answer=std::async(std::launch::async,calculate_the_answer_to_LtUaE);

Likewise, if you really want the function to be executed in the get() call then you can specify the std::launch::sync policy:

  std::future<int> the_answer=std::async(std::launch::sync,calculate_the_answer_to_LtUaE);

In most cases it makes sense to let the library choose. That way you'll avoid creating too many threads and overloading the machine, whilst taking advantage of the available hardware threads. If you need fine control, you're probably better off managing your own threads.
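
The difference between the two explicit policies can be observed by comparing thread IDs, as in the sketch below. It assumes a C++11 or later compiler, where the deferred policy is spelled std::launch::deferred rather than the std::launch::sync used in the draft discussed here. The helper names are hypothetical.

```cpp
#include <future>
#include <thread>

// Hypothetical task: report which thread it is running on.
inline std::thread::id current_id()
{
    return std::this_thread::get_id();
}

// Returns true if a task launched with the given policy ends up running on
// the thread that calls get(), and false if it runs on some other thread.
inline bool runs_on_calling_thread(std::launch policy)
{
    std::future<std::thread::id> task_id=std::async(policy,current_id);
    return task_id.get()==std::this_thread::get_id();
}
```

Calling runs_on_calling_thread(std::launch::deferred) should return true, since the function is only run inside get(), whilst runs_on_calling_thread(std::launch::async) should return false because the function runs on its own thread.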

Divide and Conquer

std::async can be used to easily parallelize simple algorithms. For example, you can write a parallel version of for_each as follows:

#include <cstddef>
#include <future>

template<typename Iterator,typename Func>
void parallel_for_each(Iterator first,Iterator last,Func f)
{
    std::ptrdiff_t const range_length=last-first;
    if(!range_length)
        return;
    if(range_length==1)
    {
        f(*first);
        return;
    }

    Iterator const mid=first+(range_length/2);

    std::future<void> bgtask=std::async(&parallel_for_each<Iterator,Func>,
                                        first,mid,f);
    try
    {
        parallel_for_each(mid,last,f);
    }
    catch(...)
    {
        bgtask.wait();
        throw;
    }
    bgtask.get();   
}

This simple bit of code recursively divides up the range into smaller and smaller pieces. Obviously an empty range doesn't require anything to happen, and a single-point range just requires calling f on the one and only value. For bigger ranges, an asynchronous task is spawned to handle the first half, and the second half is handled by a recursive call.

The try-catch block just ensures that the asynchronous task is finished before we leave the function even if an exception is thrown, in order to avoid the background tasks potentially accessing the range after it has been destroyed. Finally, the get() call waits for the background task, and propagates any exception thrown from the background task. That way if an exception is thrown during any of the processing then the calling code will see an exception. Of course if more than one exception is thrown then some will get swallowed, but C++ can only propagate one exception at a time, so that's the best that can be done without using a custom composite_exception class to collect them all.

Many algorithms can be readily parallelized this way, though you may want to have more than one element as the minimum range in order to avoid the overhead of spawning the asynchronous tasks.
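
One way to do that is sketched below. This is a variation on the code above rather than a tuned implementation: the name chunked_parallel_for_each and the min_chunk parameter are my own inventions for the example, and the right threshold depends entirely on how expensive f is.

```cpp
#include <algorithm>
#include <cstddef>
#include <future>
#include <iterator>

template<typename Iterator,typename Func>
void chunked_parallel_for_each(Iterator first,Iterator last,Func f,
                               std::ptrdiff_t min_chunk)
{
    std::ptrdiff_t const range_length=std::distance(first,last);

    // Ranges at or below the threshold are processed sequentially.
    // The threshold is clamped to 1 so that the recursion always terminates.
    if(range_length<=std::max<std::ptrdiff_t>(min_chunk,1))
    {
        std::for_each(first,last,f);
        return;
    }

    Iterator const mid=first+(range_length/2);

    // First half on an asynchronous task, second half on this thread.
    std::future<void> bgtask=std::async(
        &chunked_parallel_for_each<Iterator,Func>,first,mid,f,min_chunk);
    try
    {
        chunked_parallel_for_each(mid,last,f,min_chunk);
    }
    catch(...)
    {
        bgtask.wait();
        throw;
    }
    bgtask.get();
}
```

For a range of 1000 elements and a min_chunk of 100 this spawns around fifteen asynchronous tasks rather than around a thousand.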

Promises

An alternative to using std::async to spawn the task and return the future is to manage the threads yourself and use the std::promise class template to provide the future. Promises provide a basic mechanism for transferring values between threads: each std::promise object is associated with a single std::future object. A thread with access to the std::future object can wait for the result to be set, whilst another thread that has access to the corresponding std::promise object can call set_value() to store the value and make the future ready. This works well if the thread has more than one task to do, as information can be made ready to other threads as it becomes available rather than all of them having to wait until the thread doing the work has completed. It also allows for situations where multiple threads could produce the answer: from the point of view of the waiting thread it doesn't matter where the answer came from, just that it is there so it makes sense to have a single future to represent that availability.
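
Stripped down to the bare minimum, the promise/future handshake looks something like the sketch below. The function names produce and wait_for_value are hypothetical, and a real program would do rather more work on the producing thread.

```cpp
#include <future>
#include <thread>
#include <utility>

// Runs on the producing thread: store the value and make the future ready.
inline void produce(std::promise<int> p)
{
    p.set_value(42);
}

// Runs on the consuming thread: create the pair, hand the promise to a new
// thread, and wait on the future for the result.
inline int wait_for_value()
{
    std::promise<int> p;
    std::future<int> f=p.get_future();

    // std::promise is move-only, so it must be moved into the thread.
    std::thread t(produce,std::move(p));

    int const result=f.get();   // blocks until set_value() has been called
    t.join();
    return result;
}
```

Note that the future is obtained before the promise is moved into the thread, since get_future() has to be called on the promise object that owns the shared state.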

For example, asynchronous I/O could be modelled on a promise/future basis: when you submit an I/O request then the async I/O handler creates a promise/future pair. The future is returned to the caller, which can then wait on the future when it needs the data, and the promise is stored alongside the details of the request. When the request has been fulfilled then the I/O thread can set the value on the promise to pass the value back to the waiting thread before moving on to process additional requests. The following code shows a sample implementation of this pattern.

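#include <atomic>
#include <fstream>
#include <future>
#include <iostream>
#include <streambuf>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// thread_safe_queue is assumed to be a blocking queue that supports
// move-only elements; it is not part of the standard library.

class aio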
{
    class io_request
    {
        std::streambuf* is;
        unsigned read_count;
        std::promise<std::vector<char> > p;
    public:
        explicit io_request(std::streambuf& is_,unsigned count_):
            is(&is_),read_count(count_)
        {}
    
        io_request(io_request&& other):
            is(other.is),
            read_count(other.read_count),
            p(std::move(other.p))
        {}

        io_request():
            is(0),read_count(0)
        {}

        std::future<std::vector<char> > get_future()
        {
            return p.get_future();
        }

        void process()
        {
            try
            {
                std::vector<char> buffer(read_count);

                unsigned amount_read=0;
                while((amount_read != read_count) && 
                      (is->sgetc()!=std::char_traits<char>::eof()))
                {
                    amount_read+=is->sgetn(&buffer[amount_read],read_count-amount_read);
                }

                buffer.resize(amount_read);
                
                p.set_value(std::move(buffer));
            }
            catch(...)
            {
                p.set_exception(std::current_exception());
            }
        }
    };

    thread_safe_queue<io_request> request_queue;
    std::atomic_bool done;

    void io_thread()
    {
        while(!done)
        {
            io_request req=request_queue.pop();
            req.process();
        }
    }

    std::thread iot;
    
public:
    aio():
        done(false),
        iot(&aio::io_thread,this)
    {}

    std::future<std::vector<char> > queue_read(std::streambuf& is,unsigned count)
    {
        io_request req(is,count);
        std::future<std::vector<char> > f(req.get_future());
        request_queue.push(std::move(req));
        return f;
    }
    
    ~aio()
    {
        done=true;
        request_queue.push(io_request());
        iot.join();
    }
};

void do_stuff()
{}

void process_data(std::vector<char> v)
{
    for(unsigned i=0;i<v.size();++i)
    {
        std::cout<<v[i];
    }
    std::cout<<std::endl;
} 

int main()
{
    aio async_io;

    std::filebuf f;
    f.open("my_file.dat",std::ios::in | std::ios::binary);

    std::future<std::vector<char> > fv=async_io.queue_read(f,1048576);
    
    do_stuff();
    process_data(fv.get());
    
    return 0;
}
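
The sample relies on a thread_safe_queue class template which isn't part of the C++0x library and isn't shown here. A minimal sketch of a queue that would do the job is given below, assuming all that is needed is a push() that accepts move-only values and a pop() that blocks until a value is available. It is a stand-in for illustration, not the implementation used for the original sample.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical stand-in for the thread_safe_queue used by the aio sample.
template<typename T>
class thread_safe_queue
{
    std::mutex m;
    std::condition_variable cv;
    std::queue<T> q;
public:
    // Take the element by value and move it into the queue, so that
    // move-only types such as io_request can be stored.
    void push(T value)
    {
        {
            std::lock_guard<std::mutex> lk(m);
            q.push(std::move(value));
        }
        // Notify after releasing the lock so the woken thread can proceed.
        cv.notify_one();
    }

    // Block until an element is available, then move it out of the queue.
    T pop()
    {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk,[this]{return !q.empty();});
        T value(std::move(q.front()));
        q.pop();
        return value;
    }
};
```

Because pop() blocks, the aio destructor has to push a dummy request after setting done in order to wake the I/O thread, which is exactly what the sample does.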

Next Time

The sample code above also demonstrates passing exceptions between threads using the set_exception() member function of std::promise. I'll go into more detail about exceptions in multithreaded code next time.
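
In the meantime, here is a small self-contained sketch of the mechanism: an exception captured on one thread with std::current_exception() is stored in the promise, and rethrown on the other thread by the call to get(). The function name and the error message are made up for the example.

```cpp
#include <exception>
#include <future>
#include <stdexcept>
#include <string>
#include <thread>

// Hypothetical demonstration: returns the message of the exception that was
// thrown on the background thread and rethrown by get() on this thread.
inline std::string fetch_error_message()
{
    std::promise<int> p;
    std::future<int> f=p.get_future();

    // The promise outlives the thread because we join before returning.
    std::thread t([&p]{
        try
        {
            throw std::runtime_error("read failed");
        }
        catch(...)
        {
            p.set_exception(std::current_exception());
        }
    });

    std::string message;
    try
    {
        f.get();   // rethrows the stored exception
    }
    catch(std::runtime_error const& e)
    {
        message=e.what();
    }
    t.join();
    return message;
}
```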

Subscribe to the RSS feed or email newsletter for this blog to be sure you don't miss the rest of the series.

Try it out

If you're using Microsoft Visual Studio 2008 or g++ 4.3 or 4.4 on Ubuntu Linux you can try out the examples from this series using our just::thread implementation of the new C++0x thread library. Get your copy today.

Multithreading in C++0x Series

Here are the posts in this series so far:

Posted by Anthony Williams
[/ threading /] permanent link
Tags: concurrency, multithreading, C++0x, thread, future, promise, async

just::thread C++0x Thread Library V1.3 Released

Wednesday, 13 January 2010

I am pleased to announce that version 1.3 of just::thread, our C++0x Thread Library, has just been released.

This release is the first to feature support for the new std::async function for starting asynchronous tasks. This provides a higher-level interface for managing threads than is available with std::thread, and allows your code to easily take advantage of the available hardware concurrency without excessive oversubscription.

This is also the first release to support 64-bit Windows.

The Linux port is available for 32-bit and 64-bit Ubuntu Linux, and takes full advantage of the C++0x support available from g++ 4.3 and g++ 4.4. The Windows port is available for Microsoft Visual Studio 2008 for both 32-bit and 64-bit Windows. Purchase your copy and get started NOW.

As usual, existing customers are entitled to a free upgrade to V1.3 from all earlier versions.

Posted by Anthony Williams
[/ news /] permanent link
Tags: multithreading, concurrency, C++0x

Happy New Year 2010

Tuesday, 05 January 2010

It's already five days into 2010, but I'd like to wish you all a Happy New Year!

2009 was a good year for me. Back in January 2009, my implementation of the C++0x thread library went on sale, and sales have been growing steadily since — there's a new version due out any day now, with support for the new std::async functions and 64-bit Windows. I also presented at the ACCU conference for the second year running and completed the first draft of my book.

It's also been a big year for the C++ community. The biggest change is of course that "Concepts" were taken out of the C++0x draft since they were not ready. On the concurrency front, the proposal for the new std::async functions was accepted, std::unique_future was renamed to just std::future and the destructor of std::thread was changed to call std::terminate rather than detach if the thread has not been joined or detached.

What's coming in 2010?

Will 2010 be even better than 2009? I hope so. There's a new version of just::thread coming soon, and there's another ballot on the C++0x working draft due in the spring. I'll also be presenting at ACCU 2010 in April.

What are you looking forward to in 2010?

Posted by Anthony Williams
[/ news /] permanent link
Tags: popular, articles
