Blog Archive for / 2017 / 02 /

Generating Sequences

Monday, 27 February 2017

I was having a discussion with my son over breakfast about C++ and Python, and he asked me if C++ had anything equivalent to Python's range() function for generating a sequence of integers. I had to tell him that no, the C++ standard library didn't supply such a function, but there were algorithms for generating sequences (std::generate and std::generate_n) into an existing container, and you could write something that would provide a "virtual" container that would supply a sequence as you iterated over it with range-for.

He was happy with that, but I felt the urge to write such a container, and blog about it. There are existing implementations available (e.g. boost::counting_range), but hopefully people will find this instructive, if not useful.

Iterating over a "virtual" container

Our initial goal is to be able write

for(int x: range(10))
  std::cout<<x<<std::endl;

and have it print out the numbers 0 to 9. This requires that our new range function returns something we can use with range-for — something for which we can call begin and end to get iterators.

    class numeric_range{
    public:
        class iterator;
        iterator begin();
        iterator end();
    };

    numeric_range range(int count);

Our container is virtual, so we don't have real values we can refer to. This means our iterators must be input iterators — forward iterators need real objects to refer to. That's OK though; we're designing this for use with range-for, which only does one pass anyway.

Input Iterator requirements

All iterators have 5 associated types. They can either be provided as member types/typedefs, or by specializing std::iterator_traits<>. If you provide them as part of your iterator class, then the std::iterator_traits<> primary template works fine, so we'll do that. The types are:

`iterator_category`

The tag type for the iterator category: for input iterators it's std::input_iterator_tag.

`value_type`

This is the type of the element in the sequence. For an integer sequence, it would be int, but we'll want to allow sequences of other types, such as unsigned, double, or even a custom class.

`reference`

The result of operator* on an iterator. For forward iterators it has to be an lvalue reference, but for our input iterator it can be anything convertible to value_type. For simplicity, we'll make it the same as value_type.

`pointer`

The return type of operator-> on an iterator. value_type* works for us, and is the simplest option.

`difference_type`

The type used to represent the difference between two iterators. For input iterators, this is irrelevant, as they provide a one-pass abstraction with no exposed size. We can therefore use void.

The types aren't the only requirements on input iterators: we also have to meet the requirements on valid expressions.

Input Iterator expression requirements

Input iterators can be valid or invalid. If an iterator is invalid (e.g. because it was invalidated by some operation), then doing anything to it except assigning to it or destroying it is undefined behaviour, so we don't have to worry about that. All requirements below apply only to valid iterators.

Valid iterators can either refer to an element of a sequence, in which case they are dereferencable, or they can be past-the-end iterators, which are not dereferencable.

Comparisons

For iterators a and b, a==b is only a valid expression if a and b refer to the same sequence. If they do refer to the same sequence, a==b is true if and only if both a and b are past-the-end iterators, or both a and b are dereferencable iterators that refer to the same element in the sequence.

a!=b returns !(a==b), so is again only applicable if a and b are from the same sequence.

Incrementing

Only dereferencable iterators can be incremented.

If a is dereferencable, ++a must move a to the next value in the sequence, or make a a past-the-end iterator. Incrementing an iterator a may invalidate all copies of a: input iteration is single-pass, so you cannot use iterators to an element after you've incremented any iterator that referenced that element. ++a returns a reference to a.

(void)a++ is equivalent to (void)++a. The return type is unspecified, and not relevant to the increment.

Dereferencing

For a dereferencable iterator a, *a must return the reference type for that iterator, which must be convertible to the value_type. Dereferencing the same iterator multiple times without modifying the iterator must return an equivalent value each time. If a and b are iterators such that a==b is true, *a and *b must be return equivalent values.

a->m must be equivalent to (*a).m.

*a++ must return a value convertible to the value_type of our iterator, and is equivalent to calling following the function:

iterator::value_type increment_and_dereference(iterator& a) {
  iterator::value_type temp(*a);
  ++a;
  return temp;
}

This is why the return type for a++ is unspecified: for some iterators it will need to be a special type that can generate the required value on dereferencing; for others it can be a reference to a real value_type object.

Basic implementation

Our initial implementation can be really simple. We essentially just need a current value, and a final value. Our dereferencable iterators can then just hold a pointer to the range object, and past-the-end iterators can hold a null pointer. Iterators are thus equal if they are from the same range. Comparing invalidated iterators, or iterators from the wrong range is undefined behaviour, so that's OK.

class numeric_range{
    int current;
    int final;
public:
    numeric_range(int final_):current(0),final(final_){}

    iterator begin(){ return iterator(this); }
    iterator end() { return iterator(nullptr); }

    class iterator{
      numeric_range* range;
    public:
      using value_type = T;
      using reference = T;
      using iterator_category=std::input_iterator_tag;
      using pointer = T*;
      using difference_type = void;

      iterator(numeric_range* range_):range(range_){}

      int operator*() const{ return range->current; }
      int* operator->() const{ return &range->current;}

      iterator& operator++(){ // preincrement
          ++range->current;
          if(range->current==range->final)
            range=nullptr;
            return *this;
      }

      ??? operator++(int); // postincrement

      friend bool operator==(iterator const& lhs,iterator const& rhs){
        return lhs.range==rhs.range;
      }
      friend bool operator!=(iterator const& lhs,iterator const& rhs){
        return !(lhs==rhs);
      }
    };
};

numeric_range range(int max){
  return numeric_range(max);
}

Note that I've left out the definition of the iterator postincrement operator. That's because it warrants a bit more discussion.

Remember the spec for *a++: equivalent to calling

iterator::value_type increment_and_dereference(iterator& a) {
  iterator::value_type temp(*a);
  ++a;
  return temp;
}

Since this is a combination of two operators, we can't do it directly: instead, our postincrement operator has to return a special type to hold the temporary value. Our postincrement operator thus looks like this:

class numeric_range::iterator{
  class postinc_return{
    int value;
  public:
    postinc_return(int value_): value(value_){}
    int operator*(){ return value; }
  };
public:
  postinc_return operator++(int){
    postinc_return temp(range->current);
    ++*this;
    return temp;
  }
};

That's enough for our initial requirement:

int main(){
    for(int x: range(10)){
        std::cout<<x<<",";
    }
    std::cout<<std::endl;
}

will now print 0 to 9.

More complete implementation

The Python range function provides more options than just this, though: you can also specify start and end values, or start, end and step values, and the step can be negative. We might also like to handle alternative types, such as unsigned or double, and we need to ensure we handle things like empty ranges.

Alternative types is mostly easy: just make numeric_range a class template, and replace int with our new template parameter T everywhere.

Setting the initial and final values is also easy: just provide a new constructor that takes both the current and final values, rather than initializing the current value to 0.

Steps are a bit more tricky: if the final value is not initial+n*step for an integer n, then the direct comparison of current==final to check for the end of the sequence won't work, as it will always be false. We therefore need to check for current>=final, to account for overshoot. This then doesn't work for negative steps, so we need to allow for that too: if the step is negative, we check for current<=final instead.

Final code

The final code is available for download with simple examples, under the BSD license.

Posted by Anthony Williams
[/ cplusplus /] permanent link
Tags: cplusplus, ranges, iteration
Stumble It! | Submit to Reddit | Submit to DZone

Comment on this post

If you liked this post, why not subscribe to the RSS feed or Follow me on Twitter? You can also subscribe to this blog by email using the form on the left.

Just::Thread Pro v2.4 released

Friday, 10 February 2017

I am pleased to announce that Just::Thread Pro v2.4 has been released.

Just::Thread Pro is our C++ concurrency extensions library which provides an Actor framework for easier concurrency, along with concurrent data structures: a thread-safe queue, and concurrent hash map, and a wrapper for ensuring synchronized access to single objects.

It also includes the new facilities from the Concurrency TS:

Unwrapping the future

V2.4 adds automatic future unwrapping, as specified by the Concurrency TS, so a future<T> can be constructed from a future<future<T>>. This also works with continuations, so if your continuation returns a future<T>, the return value from then is still a future<T>, whereas if your continuation returns a T that is not a future, then then returns future<T>.

This is great for continuation-style concurrency, as it allows you to compose continuations easily, and use asynchronous calls within continuations without adding an additional layer of future-wrapping.

std::experimental::future<InitialData> first_task();
std::experimental::future<IntermediateData> second_task(InitialData data);
std::experimental::future<FinalData> third_task(IntermediateData data);

std::experimental::future<FinalData> f=first_task.then([](auto data){
    return second_task(data.get());
}).then([](auto data){
    return third_task(data.get());
});

New compiler/OS support

V2.4 also adds support for gcc 5 on Fedora 22 and 23, and gcc 6 on Fedora 24 and 25.

Just::Thread Pro is now fully supported on the following compiler/OS combinations (32-bit and 64-bit):

Microsoft Visual Studio 2015 for Windows
gcc 5 for Ubuntu 14.04 or later
gcc 6 for Ubuntu 14.04 or later
gcc 5 on Fedora 22 and 23
gcc 6 on Fedora 24 and 25

Just::Thread Pro v2.2 is also supported with the Just::Thread compatibility library on the following compiler/OS combinations:

Microsoft Visual Studio 2005, 2008, 2010, 2012 and 2013 for Windows
TDM gcc 4.5.2, 4.6.1 and 4.8.1 for Windows
g++ 4.3 or later for Ubuntu 9.04 or later
g++ 4.4 or later for Fedora 13 or later
g++ 4.4 for Centos 6
MacPorts g++ 4.3 to 4.8 on MacOSX Snow Leopard or later

All licences include a free upgrade to point releases, so if you purchase now you'll get a free upgrade to all 2.x releases of Just::Thread Pro.

Posted by Anthony Williams
[/ news /] permanent link
Tags: just::thread, release, concurrency, multithreading
Stumble It! | Submit to Reddit | Submit to DZone

Comment on this post

If you liked this post, why not subscribe to the RSS feed or Follow me on Twitter? You can also subscribe to this blog by email using the form on the left.

Wrapping Callbacks with Futures

Friday, 03 February 2017

Libraries the perform time-consuming operations, or network-based operations, often provide a means of running the task asynchronously, so that your code can continue with other things while this operation is performed in the background.

Such functions often allow you to provide a callback which is invoked when the operation has completed. The result of the operation may be supplied directly to the callback, or the callback might be expected to make further calls into the library to detect the result.

Either way, though it is a useful facility, it doesn't work well with code that needs to wait for the result — for this, you need something that can be waited on. Futures are an ideal such mechanism, because they provide a common mechanism for waiting that is abstracted away from the details of this specific library. They also allow the result to be transferred directly to the waiting thread, again abstracting away the details.

So, what do you do if the library you want to use provides a callback facility, and not a future-based wait facility? You wrap the callback in a future.

Promises, promises

The key to wrapping a callback in a future is the promise. A promise is the producer side of a future: you set the value through the promise in order to consume it via the future.

In C++, promises are provided through the std::promise class template; the template parameter specifies the type of the data being transferred, and thus is the same as the template parameter of the std::future that you will eventually be returning to the user. When you call prom.set_value() on some promise prom, then the corresponding future (retrieved by calling prom.get_future()) will become ready, holding the relevant value.

Promises and futures are not unique to C++; they are available in at least JavaScript, Python, Java and Scala, and pretty much any modern concurrency library for any language will have something that is equivalent. Adapting the examples to your favourite language is left as an exercise for the reader.

Simple callbacks

The most convenient case for us when trying to wrap a function that takes a callback is where the function we are calling takes anything that is callable as the callback. In C++ this might be represented as an instantiation of std::function. e.g.

class request_data;
class retrieved_data;

void async_retrieve_data(
    request_data param,
    std::function<void(retrieved_data)> callback);

What we need to do to wrap this is the following:

Create a promise.
Get the future from the promise.
Construct a callable object that can hold the promise, and will set the value on the promise when called.
Pass that object as the callback.
Return the future that we obtained in step 2.

This is the same basic set of steps as we'll be doing in all the examples that follow; it is the details that will differ.

Note: std::function requires that the callable object it wraps is copyable (so that if the std::function object itself is copied, it can copy the wrapped callable object), so we cannot hold the std::promise itself by value, as promises are not copyable.

We can thus write this using a C++ lambda as the callable object:

std::future<retrieved_data> wrapped_retrieve_data(request_data param) {
    std::shared_ptr<std::promise<retrieved_data>> prom=
        std::make_shared<std::promise<retrieved_data>>();
    std::future<retrieved_data> res=prom->get_future();
    async_retrieve_data(
        param,
        [prom](retrieved_data result){
            prom->set_value(result);
    });
    return res;
}

Here, we're using a std::shared_ptr to provide a copyable wrapper for the promise, so that it can be copied into the lambda, and the lambda itself will be copyable. When the copy of the lambda is called, it sets the value on the promise through its copy of the std::shared_ptr, and the future that is returned from wrapped_retrieve_data will become ready.

That's all very well if the function uses something like std::function for the callback. However, in practice that's not often the case. More often you have something that takes a plain function and a parameter to pass to this function; an approach inherited from C. Indeed, many APIs that you might wish to wrap are C APIs.

Plain function callbacks with a `user_data` parameter

A function that takes a plain function for the callback and a user_data parameter to pass to the function often looks something like this:

void async_retrieve_data(
    request_param param,
    void (*callback)(uintptr_t user_data,retrieved_data data),
    uintptr_t user_data);

The user_data you supply to async_retrieve_data is passed as the first parameter of your callback when the data is ready.

In this case, wrapping out callback is a bit more tricky, as we cannot just pass our lambda directly. Instead, we must create an object, and pass something to identify that object via the user_data parameter. Since our user_data is uintptr_t, it is large enough to hold a pointer, so we can cast the pointer to our object to uintptr_t, and pass it as the user_data. Our callback can then cast it back before using it. This is a common approach when passing C++ objects through C APIs.

The issue is: what object should we pass a pointer to, and how will its lifetime be managed?

One option is to just allocate our std::promise on the heap, and pass the pointer to that:

void wrapped_retrieve_data_callback(uintptr_t user_data,retrieved_data data) {
    std::unique_ptr<std::promise<retrieved_data>> prom(
        reinterpret_cast<std::promise<retrieved_data>*>(user_data));
    prom->set_value(data);
}

std::future<retrieved_data> wrapped_retrieve_data(request_data param) {
    std::unique_ptr<std::promise<retrieved_data>> prom=
        std::make_unique<std::promise<retrieved_data>>();
    std::future<retrieved_data> res=prom->get_future();
    async_retrieve_data(
        param,
        wrapped_retrieve_data_callback,
        reinterpret_cast<uintptr_t>(prom->get()));
    prom.release();
    return res;
}

Here, we use std::make_unique to construct our promise, and give us a std::unique_ptr pointing to it. We then get the future as before, and call the function we're wrapping, passing in the raw pointer, cast to an integer. We then call release on our pointer, so the object isn't deleted when we return from the function.

In our callback, we then cast the user_data parameter back to a pointer, and construct a new std::unique_ptr object to take ownership of it, and ensure it gets deleted. We can then set the value on our promise as before.

This is a little bit more convoluted than the lamda version from before, but it works in more cases. Often the APIs will take a void* rather than a uintptr_t, in which case you only need a static_cast rather than the scary reinterpret_cast, but the structure is the same.

An alternative to heap-allocating the promise directly is to store it in a global container (e.g. a std::list<std::promise<T>>), provided that its address can't change. You then need to ensure that it gets destroyed at a suitable point, otherwise you'll end up with a container full of used promises.

If you've only got C++11 futures, then the advantages to wrapping the callback-based API like so is primarily about abstracting away the interface, and providing a means of waiting for the result. However, if your library provides the extended futures from the C++ Concurrency TS then you can benefit from continuations to add additional functions to call when the data is ready, without having to modify the callback.

Summary

Wrapping asynchronous functions that take callbacks with futures provides a nice abstraction boundary to separate the details of the API call from the rest of your code. It is a common pattern in all languages that provide the future/promise abstraction, especially where that abstraction allows for continuations.

If you have any thoughts on this, let me know in the comments below.

Posted by Anthony Williams
[/ threading /] permanent link
Tags: callbacks, async, futures, threading, concurrency
Stumble It! | Submit to Reddit | Submit to DZone

Comment on this post

If you liked this post, why not subscribe to the RSS feed or Follow me on Twitter? You can also subscribe to this blog by email using the form on the left.

Previous Entries Later Entries

Just Software Solutions

About Us

Technical Writings

Subscribe to Blog

Blog Archives

Blog Archive for / 2017 / 02 /

Generating Sequences

Monday, 27 February 2017

Iterating over a "virtual" container

Input Iterator requirements

`iterator_category`

`value_type`

`reference`

`pointer`

`difference_type`

Input Iterator expression requirements

Comparisons

Incrementing

Dereferencing

Basic implementation

More complete implementation

Final code

Just::Thread Pro v2.4 released

Friday, 10 February 2017

Unwrapping the future

New compiler/OS support

Wrapping Callbacks with Futures

Friday, 03 February 2017

Promises, promises

Simple callbacks

Plain function callbacks with a `user_data` parameter

Summary

Just Software Solutions

About Us

Technical Writings

Subscribe to Blog

Blog Archives

Blog Archive for / 2017 / 02 /

Generating Sequences

Monday, 27 February 2017

Iterating over a "virtual" container

Input Iterator requirements

iterator_category

value_type

reference

pointer

difference_type

Input Iterator expression requirements

Comparisons

Incrementing

Dereferencing

Basic implementation

More complete implementation

Final code

Just::Thread Pro v2.4 released

Friday, 10 February 2017

Unwrapping the future

New compiler/OS support

Wrapping Callbacks with Futures

Friday, 03 February 2017

Promises, promises

Simple callbacks

Plain function callbacks with a user_data parameter

Summary

`iterator_category`

`value_type`

`reference`

`pointer`

`difference_type`

Plain function callbacks with a `user_data` parameter