Just Software Solutions

Blog Archive

Slides for my ACCU 2017 presentation

Friday, 12 May 2017

A couple of weeks ago I presented at the ACCU conference in Bristol (ACCU 2017). The title of my presentation was "Concurrency, Parallelism and Coroutines", and I was talking about the relationship between the Concurrency TS, the Coroutines TS and the Parallel algorithms from C++17:

C++17 is adding parallel overloads of most of the Standard Library algorithms. There is a TS for Concurrency in C++ already published, and a TS for Coroutines in C++ and a second TS for Concurrency in C++ in the works.

What does all this mean for programmers? How are they all related? How do coroutines help with parallelism?

This session will attempt to answer these questions and more. We will look at the implementation of parallel algorithms, and how continuations, coroutines and work-stealing fit together. We will also look at how this meshes with the Grand Unified Executors Proposal, and how you will be able to take advantage of all this as an application developer.

It was well attended, with a lot of interesting questions.

The slides are available here.

This year, my presentation was recorded. The video is available on youtube.

Posted by Anthony Williams
[/ news /] permanent link
Tags: , , ,

| Stumble It! stumbleupon logo | Submit to Reddit reddit logo | Submit to DZone dzone logo

Comment on this post

If you liked this post, why not subscribe to the RSS feed RSS feed or Follow me on Twitter? You can also subscribe to this blog by email using the form on the left.

Getting tuple elements with a runtime index

Tuesday, 28 March 2017

std::tuple is great. It provides a nice, generic way of holding a fixed-size set of data items of whatever types you need.

However, sometimes it has limitations that mean it doesn't quite work as you'd like. One of these is accessing an item based on a runtime index.

std::get needs a compile-time index

The way to get the nth item in a tuple is to use std::get: std::get<n>(my_tuple). This works nicely, as long as n is a compile-time constant. If you've got a value that is calculated at runtime, this doesn't work: you can't use a value that isn't known until runtime as a template parameter.

std::tuple<int,int,int> my_tuple=...;
size_t index;
std::cin>>index;
int val=std::get<index>(my_tuple); // won't compile

So, what can we do? We need a new function, which I'll call runtime_get, to retrieve the nth value, where n is a runtime value.

template<typename Tuple>
... runtime_get(Tuple&& t,size_t index){
  ...
}

The question is: how do we implement it?

Fixed return type

The return type is easy: our function must have a single return type for any given Tuple. That means that all the elements in the tuple must have the same type, so we can just use the type of the first element. std::tuple_element will tell us this, though we must first adjust our template parameter so it's not a reference.

template<typename Tuple>
typename std::tuple_element<
  0,typename std::remove_reference<Tuple>::type>::type&
runtime_get(Tuple&& t,size_t index){
  ...
}

Note: C++17 includes std::variant, so you might think we could use that to hold the return type, but that wouldn't actually help us: to get the value from a variant, you need to call std::get<n>(v), which requires n to be a constant (again)!

OK, so the return type is just a reference to the type of the first element. How do we get the element?

Retrieving the nth element

We can't do a straightforward switch, because that requires knowing all the cases in advance, and we want this to work for any size of tuple.

One way would be to have a recursive function that checked the runtime index against a compile-time index, and then called the function with the next compile-time index if they were different, but that would mean that the access time would depend on the index, and potentially end up with a deeply nested function call if we wanted the last element in a large tuple.

One thing we can do is use the index value as an array index. If we have an array of functions, each of which returns the corresponding element from the tuple, then we can call the appropriate function to return the relevant index.

The function we need is of course std::get; it's just a matter of getting the function signature right. Our overload of std::get has the following signature for const and non-const tuples:

template <size_t I, class... Types>
constexpr tuple_element_t<I, tuple<Types...>>&
get(tuple<Types...>&) noexcept;

template <size_t I, class... Types>
constexpr const tuple_element_t<I, tuple<Types...>>&
get(const tuple<Types...>&) noexcept;

so, we can capture the relevant instantiation of std::get for a given tuple type Tuple in a function pointer declared as:

using return_type=typename std::tuple_element<0,Tuple>::type&;
using get_func_ptr=return_type(*)(Tuple&) noexcept;

The signature is the same, regardless of the index, because we made the decision that we're only going to support tuples where all the elements are the same.

This makes it easy to build a function table: use a variadic pack expansion to supply a different index for each array element, and fill in std::get<N> for each entry:

template<
  typename Tuple,
  typename Indices=std::make_index_sequence<std::tuple_size<Tuple>::value>>
struct runtime_get_func_table;

template<typename Tuple,size_t ... Indices>
struct runtime_get_func_table<Tuple,std::index_sequence<Indices...>>{
    using return_type=typename std::tuple_element<0,Tuple>::type&;
    using get_func_ptr=return_type (*)(Tuple&) noexcept;
    static constexpr get_func_ptr table[std::tuple_size<Tuple>::value]={
        &std::get<Indices>...
    };
};

template<typename Tuple,size_t ... Indices>
constexpr typename
runtime_get_func_table<Tuple,std::index_sequence<Indices...>>::get_func_ptr
runtime_get_func_table<Tuple,std::index_sequence<Indices...>>::table[
  std::tuple_size<Tuple>::value];

We need the separate redeclaration of the table to satisfy a pre-C++17 compiler; with C++17 inline variables it is no longer needed.

Our final function is then just a simple wrapper around a table lookup:

template<typename Tuple>
constexpr
typename std::tuple_element<0,typename std::remove_reference<Tuple>::type>::type&
runtime_get(Tuple&& t,size_t index){
    using tuple_type=typename std::remove_reference<Tuple>::type;
    if(index>=std::tuple_size<tuple_type>::value)
        throw std::runtime_error("Out of range");
    return runtime_get_func_table<tuple_type>::table[index](t);
}

It's constexpr safe, though in a constexpr context you could probably just use std::get directly anyway.

So, there you have it: a constant-time function for retrieving the nth element of a tuple where all the elements have the same type.

Final code

Here is the final code for a constant-time function to retrieve an item from a tuple based on a runtime index.

#include <tuple>
#include <utility>
#include <type_traits>
#include <stdexcept>

template<
  typename Tuple,
  typename Indices=std::make_index_sequence<std::tuple_size<Tuple>::value>>
struct runtime_get_func_table;

template<typename Tuple,size_t ... Indices>
struct runtime_get_func_table<Tuple,std::index_sequence<Indices...>>{
    using return_type=typename std::tuple_element<0,Tuple>::type&;
    using get_func_ptr=return_type (*)(Tuple&) noexcept;
    static constexpr get_func_ptr table[std::tuple_size<Tuple>::value]={
        &std::get<Indices>...
    };
};

template<typename Tuple,size_t ... Indices>
constexpr typename
runtime_get_func_table<Tuple,std::index_sequence<Indices...>>::get_func_ptr
runtime_get_func_table<Tuple,std::index_sequence<Indices...>>::table[std::tuple_size<Tuple>::value];

template<typename Tuple>
constexpr
typename std::tuple_element<0,typename std::remove_reference<Tuple>::type>::type&
runtime_get(Tuple&& t,size_t index){
    using tuple_type=typename std::remove_reference<Tuple>::type;
    if(index>=std::tuple_size<tuple_type>::value)
        throw std::runtime_error("Out of range");
    return runtime_get_func_table<tuple_type>::table[index](t);
}

Posted by Anthony Williams
[/ cplusplus /] permanent link
Tags: ,

| Stumble It! stumbleupon logo | Submit to Reddit reddit logo | Submit to DZone dzone logo

Comment on this post

If you liked this post, why not subscribe to the RSS feed RSS feed or Follow me on Twitter? You can also subscribe to this blog by email using the form on the left.

C++ Concurrency in Action 2nd edition Early Access

Wednesday, 22 March 2017

I am happy to announce that the second edition of my book, C++ Concurrency in Action, is now available under the Manning Early Access Program.

The second edition is being updated to cover C++14, C++17 and the Concurrency TS, along with general improvements throughout the book.

This includes full coverage of the library changes from C++14 and C++17:

  • std::shared_mutex and std::shared_timed_mutex. These provide for multiple-reader/single-writer mutex locks.
  • std::scoped_lock from C++17 for locking multiple mutexes together.
  • Parallel overloads of many standard library algorithms include std::sort, std::for_each and std::transform_reduce.

Plus, full coverage of the library extensions from the concurrency TS:

  • std::experimental::latch to allow waiting for a set number of events to occur
  • std::experimental::barrier and std::experimental::flex_barrier to synchronize groups of threads
  • std::experimental::atomic_shared_ptr to allow atomic accesses to a single shared_ptr instance from multiple threads, as a better alternative that the std::atomic_load and std::atomic_store free functions.
  • Extended futures that allow continuations, so additional functions can be scheduled for when a future is ready.
  • std::experimental::when_all and std::experimental::when_any to allow waiting for either all of a set of futures to be ready, or the first of a set of futures to be ready.

Only the first few chapters are available at the moment, but if you sign up now, then you will get those chapters in PDF form now, as well as updated PDFs including the later chapters as they become ready, along with updates for the earlier ones, and a final print copy of the book when it's done.

50% Discount

If you use the code mlwilliams4 at the checkout when you sign up for the MEAP before 27th March 2017 then you'll get a 50% discount.

Posted by Anthony Williams
[/ news /] permanent link
Tags: , , ,

| Stumble It! stumbleupon logo | Submit to Reddit reddit logo | Submit to DZone dzone logo

Comment on this post

If you liked this post, why not subscribe to the RSS feed RSS feed or Follow me on Twitter? You can also subscribe to this blog by email using the form on the left.

Just::Thread Pro v2.4.2 released with clang support

Thursday, 16 March 2017

I am pleased to announce that Just::Thread Pro v2.4.2 has been released with support for clang on linux.

Just::Thread Pro is our C++ concurrency extensions library which provides an Actor framework for easier concurrency, along with concurrent data structures: a thread-safe queue, and concurrent hash map, and a wrapper for ensuring synchronized access to single objects.

It also includes the new facilities from the Concurrency TS:

Clang support is finally here!

V2.4.2 adds the much-anticipated support for clang. clang 3.8 and 3.9 are supported on ubuntu 16.04 or later, clang 3.8 is supported on Fedora 24, and clang 3.9 on Fedora 25.

Just::Thread Pro is now fully supported on the following compiler/OS combinations (32-bit and 64-bit):

  • Microsoft Visual Studio 2015 for Windows
  • Microsoft Visual Studio 2017 for Windows
  • gcc 5 for Ubuntu 14.04 or later
  • gcc 6 for Ubuntu 14.04 or later
  • clang 3.8 for Ubuntu 16.04 or later
  • clang 3.9 for Ubuntu 16.04 or later
  • gcc 5 for Fedora 22 and 23
  • gcc 6 for Fedora 24 and 25
  • clang 3.8 for Fedora 24
  • clang 3.9 for Fedora 25

Just::Thread Pro v2.2 is also supported with the Just::Thread compatibility library on the following compiler/OS combinations:

  • Microsoft Visual Studio 2005, 2008, 2010, 2012 and 2013 for Windows
  • TDM gcc 4.5.2, 4.6.1 and 4.8.1 for Windows
  • g++ 4.3 or later for Ubuntu 9.04 or later
  • g++ 4.4 or later for Fedora 13 or later
  • g++ 4.4 for Centos 6
  • MacPorts g++ 4.3 to 4.8 on MacOSX Snow Leopard or later

All licences include a free upgrade to point releases, so if you purchase now you'll get a free upgrade to all 2.x releases of Just::Thread Pro. Purchasers of the older Just::Thread library (now called the compatibility library) may upgrade to Just::Thread Pro for a small fee.

Posted by Anthony Williams
[/ news /] permanent link
Tags: , , ,

| Stumble It! stumbleupon logo | Submit to Reddit reddit logo | Submit to DZone dzone logo

Comment on this post

If you liked this post, why not subscribe to the RSS feed RSS feed or Follow me on Twitter? You can also subscribe to this blog by email using the form on the left.

Does const mean thread-safe?

Tuesday, 14 March 2017

There was a discussion recently on the cpplang slack about whether const meant thread-safe.

As with everything in life, it depends. In some ways, yes it does. In others, it does not. Read on for the details.

What do we mean by "thread-safe"?

If someone says something is "thread-safe", then it is important to define what that means. Here is an incomplete list of some things people might mean when they say something is "thread safe".

  • Calling foo(x) on one thread and foo(y) on a second thread concurrently is OK.
  • Calling foo(x_i) on any number of threads concurrently is OK, provided each x_i is different.
  • Calling foo(x) on a specific number of threads concurrently is OK.
  • Calling foo(x) on any number of threads concurrently is OK.
  • Calling foo(x) on one thread and bar(x) on another thread concurrently is OK.
  • Calling foo(x) on one thread and bar(x) on any number of threads concurrently is OK.

Which one we mean in a given circumstance is important. For example, a concurrent queue might be a Single-Producer, Single-Consumer queue (SPSC), in which case it is safe for one thread to push a value while another pops a value, but if two threads try and push values then things go wrong. Alternatively, it might be a Multi-Producer, Single-Consumer queue (MPSC), which allows multiple threads to push values, but only one to pop values. Both of these are "thread safe", but the meaning is subtly different.

Before we look at what sort of thread safety we're after, let's just define what it means to be "OK".

Data races

At the basic level, when we say an operation is "OK" from a thread-safety point of view, we mean it has defined behaviour, and there are no data races, since data races lead to undefined behaviour.

From the C++ Standard perspective, a data race is where there are 2 operations that access the same memory location, such that neither happens-before the other, at least one of them modifies that memory location, and at least one of them is not an atomic operation.

An operation is thus "thread safe" with respect to the set of threads we wish to perform the operation concurrently if:

  • none of the threads modify a memory location accessed by any of the other threads, or
  • all accesses performed by the threads to memory locations which are modified by one or more of the threads are atomic, or
  • the threads use suitable synchronization to ensure that there are happens-before operations between all modifications, and any other accesses to the modified memory locations.

So: what sort of thread-safety are we looking for from const objects, and why?

Do as ints do

A good rule of thumb for choosing behaviour for a class in C++ is "do as ints do".

With regard to thread safety, ints are simple:

  • Any number of threads may read a given int concurrently
  • If any thread modifies a given int, no other threads may access that int for reading or writing concurrently.

This follows naturally from the definition of a data race, since ints cannot do anything special to provide synchronization or atomic operations.

If you have an int, and more than one thread that wants to access it, if any of those threads wants to modify it then you need external synchronization. Typically you'll use a mutex for the external synchronization, but other mechanisms can work too.

If your int is const, (or you have const int& that references it, or const int* that points to it) then you can't modify it.

What does that mean for your class? In a well-designed class, the const member functions do not modify the state of the object. It is "logically" immutable when accessed exclusively through const member functions. On the other hand, if you use a non-const member function then you are potentially modifying the state of the object. So far, so good: this is what ints do with regard to reading and modifying.

To do what ints do with respect to thread safety, we need to ensure that it is OK to call any const member functions concurrently on any number of threads. For many classes this is trivially achieved: if you don't modify the internal representation of the object in any way, you're home dry.

Consider an employee class that stores basic information about an employee, such as their name, employee ID and so forth. The natural implementation of const member functions will just read the members, perform some simple manipulation on the values, and return. Nothing is modified.

class employee{
    std::string first_name;
    std::string last_name;
    // other data
public:
    std::string get_full_name() const{
        return last_name + ", " + first_name;
    }
    // other member functions
}

Provided that reading from a const std::string and appending it to another string is OK, employee::get_full_name is OK to be called from any number of threads concurrently.

You only have to do something special to "do as ints do" if you modify the internal state in your const member function, e.g. to keep a tally of calls, or cache calculation values, or similar things which modify the internal state without modifying the externally-visible state. Of course, you would also need to add some synchronization if you were modifying externally-visible state in your const member function, but we've already decided that's not a good plan.

In employee::get_full_name, we're relying on the thread-safety of std::string to get us through. Is that OK? Can we rely on that?

Thread-safety in the C++ Standard Library

The C++ Standard Library itself sticks to the "do as ints do" rule. This is spelled out in the section on Data race avoidance (res.on.data.races). Specifically,

A C++ standard library function shall not directly or indirectly modify objects accessible by threads other than the current thread unless the objects are accessed directly or indirectly via the function's non-const arguments, including this.

and

Implementations may share their own internal objects between threads if the objects are not visible to users and are protected against data races.

This means that if you have a const std::string& referencing an object, then any calls to member functions on it must not modify the object, and any shared internal state must be protected against data races. The same applies if it is passed to any other function in the C++ Standard Library.

However, if you have a std::string& referencing the same object, then you must ensure that all accesses to that object must be synchronized externally, otherwise you may trigger a data race.

Our employee::get_full_name function is thus as thread-safe as an int: concurrent reads are OK, but any modifications will require external synchronization for all concurrent accesses.

There are two little words in the first paragraph quoted above which have a surprising consequence: "or indirectly".

Indirect Accesses

If you have two const std::vector<X>s, vx and vy, then calling standard library functions on those objects must not modify any objects accessible by other threads, otherwise we've violated the requirements from the "data race avoidance" section quoted above, since those objects would be "indirectly" modified by the function.

This means that any operations on the X objects within those containers that are performed by the operations we do on the vectors must also refrain from modifying any objects accessible by other threads. For example, the expression vx==vy compares each of the elements in turn. These comparisons must thus not modify any objects accessible by other threads. Likewise, std::for_each(vx.begin(),vx.end(),foo) must not modify any objects accessible by other threads.

This pretty much boils down to a requirement that if you use your class with the C++ Standard Library, then const operations on your class must be safe if called from multiple threads. There is no such requirement for non-const operations, or combinations of const and non-const operations.

You may of course decide that your class is going to allow concurrent modifications (even if that is by using a mutex to restrict accesses internally), but that is up to you. A class designed for concurrent access, such as a concurrent queue, will need to have the internal synchronization; a value class such as our employee is unlikely to need it.

Summary

Do as ints do: concurrent calls to const member functions on your class must be OK, but you are not required to ensure thread-safety if there are also concurrent calls to non-const member functions.

The C++ Standard Library sticks to this rule itself, and requires it of your code. In most cases, it's trivial to ensure; it's only classes with complex const member functions that modify internal state that may need synchronization added.

Posted by Anthony Williams
[/ cplusplus /] permanent link
Tags: , ,

| Stumble It! stumbleupon logo | Submit to Reddit reddit logo | Submit to DZone dzone logo

Comment on this post

If you liked this post, why not subscribe to the RSS feed RSS feed or Follow me on Twitter? You can also subscribe to this blog by email using the form on the left.

just::thread Pro adds Visual Studio 2017 support

Wednesday, 08 March 2017

I am pleased to announce that just::thread Pro now supports Microsoft Visual Studio 2017 on Microsoft Windows.

This adds to the support for Microsoft Visual Studio 2015, g++ 5 and g++ 6 for the just::thread Pro enhancements, which build on top of the platform-supplied version of the C++14 thread library. For older compilers, and for MacOSX, the just::thread compatibility library is still required.

The new build features all the same facilities as the previous release:

Get your copy of just::thread Pro

Purchase your copy and get started now.

As usual, all customers with V2.x licenses of just::thread Pro will get a free upgrade to the new just::thread Pro Standalone edition.

Posted by Anthony Williams
[/ news /] permanent link
Tags: , ,

| Stumble It! stumbleupon logo | Submit to Reddit reddit logo | Submit to DZone dzone logo

Comment on this post

If you liked this post, why not subscribe to the RSS feed RSS feed or Follow me on Twitter? You can also subscribe to this blog by email using the form on the left.

Generating Sequences

Monday, 27 February 2017

I was having a discussion with my son over breakfast about C++ and Python, and he asked me if C++ had anything equivalent to Python's range() function for generating a sequence of integers. I had to tell him that no, the C++ standard library didn't supply such a function, but there were algorithms for generating sequences (std::generate and std::generate_n) into an existing container, and you could write something that would provide a "virtual" container that would supply a sequence as you iterated over it with range-for.

He was happy with that, but I felt the urge to write such a container, and blog about it. There are existing implementations available (e.g. boost::counting_range), but hopefully people will find this instructive, if not useful.

Iterating over a "virtual" container

Our initial goal is to be able write

for(int x: range(10))
  std::cout<<x<<std::endl;

and have it print out the numbers 0 to 9. This requires that our new range function returns something we can use with range-for — something for which we can call begin and end to get iterators.

    class numeric_range{
    public:
        class iterator;
        iterator begin();
        iterator end();
    };

    numeric_range range(int count);

Our container is virtual, so we don't have real values we can refer to. This means our iterators must be input iterators — forward iterators need real objects to refer to. That's OK though; we're designing this for use with range-for, which only does one pass anyway.

Input Iterator requirements

All iterators have 5 associated types. They can either be provided as member types/typedefs, or by specializing std::iterator_traits<>. If you provide them as part of your iterator class, then the std::iterator_traits<> primary template works fine, so we'll do that. The types are:

iterator_category

The tag type for the iterator category: for input iterators it's std::input_iterator_tag.

value_type

This is the type of the element in the sequence. For an integer sequence, it would be int, but we'll want to allow sequences of other types, such as unsigned, double, or even a custom class.

reference

The result of operator* on an iterator. For forward iterators it has to be an lvalue reference, but for our input iterator it can be anything convertible to value_type. For simplicity, we'll make it the same as value_type.

pointer

The return type of operator-> on an iterator. value_type* works for us, and is the simplest option.

difference_type

The type used to represent the difference between two iterators. For input iterators, this is irrelevant, as they provide a one-pass abstraction with no exposed size. We can therefore use void.

The types aren't the only requirements on input iterators: we also have to meet the requirements on valid expressions.

Input Iterator expression requirements

Input iterators can be valid or invalid. If an iterator is invalid (e.g. because it was invalidated by some operation), then doing anything to it except assigning to it or destroying it is undefined behaviour, so we don't have to worry about that. All requirements below apply only to valid iterators.

Valid iterators can either refer to an element of a sequence, in which case they are dereferencable, or they can be past-the-end iterators, which are not dereferencable.

Comparisons

For iterators a and b, a==b is only a valid expression if a and b refer to the same sequence. If they do refer to the same sequence, a==b is true if and only if both a and b are past-the-end iterators, or both a and b are dereferencable iterators that refer to the same element in the sequence.

a!=b returns !(a==b), so is again only applicable if a and b are from the same sequence.

Incrementing

Only dereferencable iterators can be incremented.

If a is dereferencable, ++a must move a to the next value in the sequence, or make a a past-the-end iterator. Incrementing an iterator a may invalidate all copies of a: input iteration is single-pass, so you cannot use iterators to an element after you've incremented any iterator that referenced that element. ++a returns a reference to a.

(void)a++ is equivalent to (void)++a. The return type is unspecified, and not relevant to the increment.

Dereferencing

For a dereferencable iterator a, *a must return the reference type for that iterator, which must be convertible to the value_type. Dereferencing the same iterator multiple times without modifying the iterator must return an equivalent value each time. If a and b are iterators such that a==b is true, *a and *b must be return equivalent values.

a->m must be equivalent to (*a).m.

*a++ must return a value convertible to the value_type of our iterator, and is equivalent to calling following the function:

iterator::value_type increment_and_dereference(iterator& a) {
  iterator::value_type temp(*a);
  ++a;
  return temp;
}

This is why the return type for a++ is unspecified: for some iterators it will need to be a special type that can generate the required value on dereferencing; for others it can be a reference to a real value_type object.

Basic implementation

Our initial implementation can be really simple. We essentially just need a current value, and a final value. Our dereferencable iterators can then just hold a pointer to the range object, and past-the-end iterators can hold a null pointer. Iterators are thus equal if they are from the same range. Comparing invalidated iterators, or iterators from the wrong range is undefined behaviour, so that's OK.

class numeric_range{
    int current;
    int final;
public:
    numeric_range(int final_):current(0),final(final_){}

    iterator begin(){ return iterator(this); }
    iterator end() { return iterator(nullptr); }

    class iterator{
      numeric_range* range;
    public:
      using value_type = T;
      using reference = T;
      using iterator_category=std::input_iterator_tag;
      using pointer = T*;
      using difference_type = void;

      iterator(numeric_range* range_):range(range_){}

      int operator*() const{ return range->current; }
      int* operator->() const{ return &range->current;}

      iterator& operator++(){ // preincrement
          ++range->current;
          if(range->current==range->final)
            range=nullptr;
            return *this;
      }

      ??? operator++(int); // postincrement

      friend bool operator==(iterator const& lhs,iterator const& rhs){
        return lhs.range==rhs.range;
      }
      friend bool operator!=(iterator const& lhs,iterator const& rhs){
        return !(lhs==rhs);
      }
    };
};

numeric_range range(int max){
  return numeric_range(max);
}

Note that I've left out the definition of the iterator postincrement operator. That's because it warrants a bit more discussion.

Remember the spec for *a++: equivalent to calling

iterator::value_type increment_and_dereference(iterator& a) {
  iterator::value_type temp(*a);
  ++a;
  return temp;
}

Since this is a combination of two operators, we can't do it directly: instead, our postincrement operator has to return a special type to hold the temporary value. Our postincrement operator thus looks like this:

class numeric_range::iterator{
  class postinc_return{
    int value;
  public:
    postinc_return(int value_): value(value_){}
    int operator*(){ return value; }
  };
public:
  postinc_return operator++(int){
    postinc_return temp(range->current);
    ++*this;
    return temp;
  }
};

That's enough for our initial requirement:

int main(){
    for(int x: range(10)){
        std::cout<<x<<",";
    }
    std::cout<<std::endl;
}

will now print 0 to 9.

More complete implementation

The Python range function provides more options than just this, though: you can also specify start and end values, or start, end and step values, and the step can be negative. We might also like to handle alternative types, such as unsigned or double, and we need to ensure we handle things like empty ranges.

Alternative types is mostly easy: just make numeric_range a class template, and replace int with our new template parameter T everywhere.

Setting the initial and final values is also easy: just provide a new constructor that takes both the current and final values, rather than initializing the current value to 0.

Steps are a bit more tricky: if the final value is not initial+n*step for an integer n, then the direct comparison of current==final to check for the end of the sequence won't work, as it will always be false. We therefore need to check for current>=final, to account for overshoot. This then doesn't work for negative steps, so we need to allow for that too: if the step is negative, we check for current<=final instead.

Final code

The final code is available for download with simple examples, under the BSD license.

Posted by Anthony Williams
[/ cplusplus /] permanent link
Tags: , ,

| Stumble It! stumbleupon logo | Submit to Reddit reddit logo | Submit to DZone dzone logo

Comment on this post

If you liked this post, why not subscribe to the RSS feed RSS feed or Follow me on Twitter? You can also subscribe to this blog by email using the form on the left.

Just::Thread Pro v2.4 released

Friday, 10 February 2017

I am pleased to announce that Just::Thread Pro v2.4 has been released.

Just::Thread Pro is our C++ concurrency extensions library which provides an Actor framework for easier concurrency, along with concurrent data structures: a thread-safe queue, and concurrent hash map, and a wrapper for ensuring synchronized access to single objects.

It also includes the new facilities from the Concurrency TS:

Unwrapping the future

V2.4 adds automatic future unwrapping, as specified by the Concurrency TS, so a future<T> can be constructed from a future<future<T>>. This also works with continuations, so if your continuation returns a future<T>, the return value from then is still a future<T>, whereas if your continuation returns a T that is not a future, then then returns future<T>.

This is great for continuation-style concurrency, as it allows you to compose continuations easily, and use asynchronous calls within continuations without adding an additional layer of future-wrapping.

std::experimental::future<InitialData> first_task();
std::experimental::future<IntermediateData> second_task(InitialData data);
std::experimental::future<FinalData> third_task(IntermediateData data);

std::experimental::future<FinalData> f=first_task.then([](auto data){
    return second_task(data.get());
}).then([](auto data){
    return third_task(data.get());
});

New compiler/OS support

V2.4 also adds support for gcc 5 on Fedora 22 and 23, and gcc 6 on Fedora 24 and 25.

Just::Thread Pro is now fully supported on the following compiler/OS combinations (32-bit and 64-bit):

  • Microsoft Visual Studio 2015 for Windows
  • gcc 5 for Ubuntu 14.04 or later
  • gcc 6 for Ubuntu 14.04 or later
  • gcc 5 on Fedora 22 and 23
  • gcc 6 on Fedora 24 and 25

Just::Thread Pro v2.2 is also supported with the Just::Thread compatibility library on the following compiler/OS combinations:

  • Microsoft Visual Studio 2005, 2008, 2010, 2012 and 2013 for Windows
  • TDM gcc 4.5.2, 4.6.1 and 4.8.1 for Windows
  • g++ 4.3 or later for Ubuntu 9.04 or later
  • g++ 4.4 or later for Fedora 13 or later
  • g++ 4.4 for Centos 6
  • MacPorts g++ 4.3 to 4.8 on MacOSX Snow Leopard or later

All licences include a free upgrade to point releases, so if you purchase now you'll get a free upgrade to all 2.x releases of Just::Thread Pro.

Posted by Anthony Williams
[/ news /] permanent link
Tags: , , ,

| Stumble It! stumbleupon logo | Submit to Reddit reddit logo | Submit to DZone dzone logo

Comment on this post

If you liked this post, why not subscribe to the RSS feed RSS feed or Follow me on Twitter? You can also subscribe to this blog by email using the form on the left.

Wrapping Callbacks with Futures

Friday, 03 February 2017

Libraries the perform time-consuming operations, or network-based operations, often provide a means of running the task asynchronously, so that your code can continue with other things while this operation is performed in the background.

Such functions often allow you to provide a callback which is invoked when the operation has completed. The result of the operation may be supplied directly to the callback, or the callback might be expected to make further calls into the library to detect the result.

Either way, though it is a useful facility, it doesn't work well with code that needs to wait for the result — for this, you need something that can be waited on. Futures are an ideal such mechanism, because they provide a common mechanism for waiting that is abstracted away from the details of this specific library. They also allow the result to be transferred directly to the waiting thread, again abstracting away the details.

So, what do you do if the library you want to use provides a callback facility, and not a future-based wait facility? You wrap the callback in a future.

Promises, promises

The key to wrapping a callback in a future is the promise. A promise is the producer side of a future: you set the value through the promise in order to consume it via the future.

In C++, promises are provided through the std::promise class template; the template parameter specifies the type of the data being transferred, and thus is the same as the template parameter of the std::future that you will eventually be returning to the user. When you call prom.set_value() on some promise prom, then the corresponding future (retrieved by calling prom.get_future()) will become ready, holding the relevant value.

Promises and futures are not unique to C++; they are available in at least JavaScript, Python, Java and Scala, and pretty much any modern concurrency library for any language will have something that is equivalent. Adapting the examples to your favourite language is left as an exercise for the reader.

Simple callbacks

The most convenient case for us when trying to wrap a function that takes a callback is where the function we are calling takes anything that is callable as the callback. In C++ this might be represented as an instantiation of std::function. e.g.

class request_data;
class retrieved_data;

void async_retrieve_data(
    request_data param,
    std::function<void(retrieved_data)> callback);

What we need to do to wrap this is the following:

  1. Create a promise.
  2. Get the future from the promise.
  3. Construct a callable object that can hold the promise, and will set the value on the promise when called.
  4. Pass that object as the callback.
  5. Return the future that we obtained in step 2.

This is the same basic set of steps as we'll be doing in all the examples that follow; it is the details that will differ.

Note: std::function requires that the callable object it wraps is copyable (so that if the std::function object itself is copied, it can copy the wrapped callable object), so we cannot hold the std::promise itself by value, as promises are not copyable.

We can thus write this using a C++ lambda as the callable object:

std::future<retrieved_data> wrapped_retrieve_data(request_data param) {
    std::shared_ptr<std::promise<retrieved_data>> prom=
        std::make_shared<std::promise<retrieved_data>>();
    std::future<retrieved_data> res=prom->get_future();
    async_retrieve_data(
        param,
        [prom](retrieved_data result){
            prom->set_value(result);
    });
    return res;
}

Here, we're using a std::shared_ptr to provide a copyable wrapper for the promise, so that it can be copied into the lambda, and the lambda itself will be copyable. When the copy of the lambda is called, it sets the value on the promise through its copy of the std::shared_ptr, and the future that is returned from wrapped_retrieve_data will become ready.

That's all very well if the function uses something like std::function for the callback. However, in practice that's not often the case. More often you have something that takes a plain function and a parameter to pass to this function; an approach inherited from C. Indeed, many APIs that you might wish to wrap are C APIs.

Plain function callbacks with a user_data parameter

A function that takes a plain function for the callback and a user_data parameter to pass to the function often looks something like this:

void async_retrieve_data(
    request_param param,
    void (*callback)(uintptr_t user_data,retrieved_data data),
    uintptr_t user_data);

The user_data you supply to async_retrieve_data is passed as the first parameter of your callback when the data is ready.

In this case, wrapping out callback is a bit more tricky, as we cannot just pass our lambda directly. Instead, we must create an object, and pass something to identify that object via the user_data parameter. Since our user_data is uintptr_t, it is large enough to hold a pointer, so we can cast the pointer to our object to uintptr_t, and pass it as the user_data. Our callback can then cast it back before using it. This is a common approach when passing C++ objects through C APIs.

The issue is: what object should we pass a pointer to, and how will its lifetime be managed?

One option is to just allocate our std::promise on the heap, and pass the pointer to that:

void wrapped_retrieve_data_callback(uintptr_t user_data,retrieved_data data) {
    std::unique_ptr<std::promise<retrieved_data>> prom(
        reinterpret_cast<std::promise<retrieved_data>*>(user_data));
    prom->set_value(data);
}

std::future<retrieved_data> wrapped_retrieve_data(request_data param) {
    std::unique_ptr<std::promise<retrieved_data>> prom=
        std::make_unique<std::promise<retrieved_data>>();
    std::future<retrieved_data> res=prom->get_future();
    async_retrieve_data(
        param,
        wrapped_retrieve_data_callback,
        reinterpret_cast<uintptr_t>(prom->get()));
    prom.release();
    return res;
}

Here, we use std::make_unique to construct our promise, and give us a std::unique_ptr pointing to it. We then get the future as before, and call the function we're wrapping, passing in the raw pointer, cast to an integer. We then call release on our pointer, so the object isn't deleted when we return from the function.

In our callback, we then cast the user_data parameter back to a pointer, and construct a new std::unique_ptr object to take ownership of it, and ensure it gets deleted. We can then set the value on our promise as before.

This is a little bit more convoluted than the lamda version from before, but it works in more cases. Often the APIs will take a void* rather than a uintptr_t, in which case you only need a static_cast rather than the scary reinterpret_cast, but the structure is the same.

An alternative to heap-allocating the promise directly is to store it in a global container (e.g. a std::list<std::promise<T>>), provided that its address can't change. You then need to ensure that it gets destroyed at a suitable point, otherwise you'll end up with a container full of used promises.

If you've only got C++11 futures, then the advantages to wrapping the callback-based API like so is primarily about abstracting away the interface, and providing a means of waiting for the result. However, if your library provides the extended futures from the C++ Concurrency TS then you can benefit from continuations to add additional functions to call when the data is ready, without having to modify the callback.

Summary

Wrapping asynchronous functions that take callbacks with futures provides a nice abstraction boundary to separate the details of the API call from the rest of your code. It is a common pattern in all languages that provide the future/promise abstraction, especially where that abstraction allows for continuations.

If you have any thoughts on this, let me know in the comments below.

Posted by Anthony Williams
[/ threading /] permanent link
Tags: , , , ,

| Stumble It! stumbleupon logo | Submit to Reddit reddit logo | Submit to DZone dzone logo

Comment on this post

If you liked this post, why not subscribe to the RSS feed RSS feed or Follow me on Twitter? You can also subscribe to this blog by email using the form on the left.

TDD isn't a panacea

Wednesday, 25 January 2017

On Monday, I attended the Software Cornwall Business Connect Event. This was a day of talks and workshops for people in the local software development community here in Cornwall. One of the talks was on TDD, and how that fitted into the software development process at one of the larger software companies in the area.

There was an interesting question asked by one of the attendees: how do you know that your tests are correct? What if you make a mistake in your tests, and then change the code to make it pass?

The answer that the presenter gave, and the one that is most common in the literature, is that the tests check the code, and the code checks the tests. There was a brief discussion around this point, but I thought that it was worth elaborating here.

Baby steps

One of the key parts of TDD is the idea of "baby steps". You write code incrementally, with small changes, and small, focused tests.

Consequently, when you write a new test, you're testing some small change you're about to make. If you make a mistake, this gives you a reasonable chance of spotting it.

However, that's still relying on you spotting it yourself. You just wrote the test, and it's very easy to read back what you intended to write, rather than what you actually did write. This is where "the code checks the tests" comes in.

Red

The TDD cycle is often dubbed "Red, Green, Refactor". At this point, the relevant part is "Red" — you just wrote a new test, so it should fail. If you run it, and it passes then something is wrong, most likely a mistake in the test. This gives you a second chance to revisit the test, and ensure that it is indeed correct. Maybe you mistyped one of the values; maybe you missed a function call. Either way, this gives you a second chance to verify the test.

OK, so you wrote the test (wrong), the test fails (as expected), so you write the code to make it pass. This gives you the next opportunity to verify the test.

Green

TDD doesn't exempt you from thinking. You don't just write a test and then blindly make it pass. You write a test, often with an idea of the code change you're going to make to make it pass.

So, you've written a failing test, and you make the desired change. You run the tests, and the new test still fails. You haven't got the "Green" outcome you desired or intended. Obviously, the first thing to check here is the code, but it also gives you another chance to check the test. If the code really does do what you intended, then the test must be wrong, and vice versa.

Of course, we don't have just the one test, unless this is the beginning of a project. We have other tests on the code. If the new test is incorrect, and thus inconsistent with the previous tests, then changing the code to make the new test pass may well cause previous tests to fail.

At this point, you've got to be certain that the test is right if you're going to change the code. Alternatively, you made exactly the same mistake in the code as you did in the test, and the test passes anyway. Either way, this is the point at which the faulty test becomes faulty code.

It's not the last chance we have to check the test, though: the final step in the TDD loop is "Refactor".

Refactor

Refactoring is about simplifying your code — removing duplication, applying general algorithms rather than special cases, and so forth. If you've made a mistake in a test, then the implementation may well end up with a special case to handle that particular eventuality; the code will be resistant to simplification because that test is inconsistent with the others. Again, TDD does not exempt you from thinking. If you intended to implement a particular algorithm, and the code does not simplify in a way that is consistent with that algorithm then this is a red flag — something is not right, and needs checking. This gives you yet another chance to inspect the test, and verify that it is indeed correct.

At this point, if you still haven't fixed your test, then you have now baked the mistake into your code. However, all is not lost: "Red, Green, Refactor" is a cycle, so we begin again, with the next test.

More tests

As we add more tests, we get more and more opportunities to verify the correctness of the existing tests. As we adjust the code to make future tests pass, we can spot inconsistencies, and we can observe that changes we make break the earlier test. When we see the broken test, this gives us a further opportunity to verify that the test is indeed correct, and potentially fix it, rather than fixing the code.

Acceptance tests then provide a further level of checking, though these tend to be far courser-grained, so may work fine unless the particular inputs happen to map to those in the incorrect lower level test.

TDD is not a panacea; but it is a useful tool

So, it is possible that a mistake in a test will be matched by incorrect code, and this will make it into production. However, in my experience, this is unlikely if the incorrect test is merely a mistake. Instead, the bugs that make it through tend to fall into two categories:

  • errors of omission — I didn't think to check that scenario, or
  • errors due to misunderstanding — my understanding of the requirements didn't match what was wanted, and the code does exactly what I intended, but not what the client intended.

TDD is not a panacea, and won't fix all your problems, but it is a useful tool, and has the potential to greatly reduce the number of bugs in your code.

Please let me know what you think by adding a comment.

Posted by Anthony Williams
[/ testing /] permanent link
Tags: , ,

| Stumble It! stumbleupon logo | Submit to Reddit reddit logo | Submit to DZone dzone logo

Comment on this post

If you liked this post, why not subscribe to the RSS feed RSS feed or Follow me on Twitter? You can also subscribe to this blog by email using the form on the left.

Older entries

Design and Content Copyright © 2005-2017 Just Software Solutions Ltd. All rights reserved.