Just Software Solutions

Blog Archive for / 2019 /

strong_typedef - Create distinct types for distinct purposes

Wednesday, 29 May 2019

One common problem in C++ code is the use of simple types for many things: a std::string might be a filename, a person's name, a SQL query string or a piece of JSON; an int could be a count, an index, an ID number, or even a file handle. In his 1999 book "Refactoring" (which has a second edition as of January 2019), Martin Fowler called this phenomenon "Primitive Obsession", and recommended that we use dedicated classes for each purpose rather than built-in or library types.

The difficulty with doing so is that built-in types and library types have predefined sets of operations that can be done with them from simple operations like incrementing/decrementing and comparing, to more complex ones such as replacing substrings. Creating a new class each time means that we have to write implementations for all these functions every time. This duplication of effort raises the barrier to doing this, and means that we often decide that it isn't worthwhile.

However, by sticking to the built-in and library types, we can end up in a scenario where a function takes multiple parameters of the same type, with distinct meanings, and no clear reason for any specific ordering. In such a scenario, it is easy to get the parameters in the wrong order and not notice until something breaks. By wrapping the primitive type in a unique type for each usage we can eliminate this class of problem.

My strong_typedef class template aims to make this easier. It wraps an existing type, and associates it with a tag type to define the purpose, and which can therefore be used to make it unique. Crucially, it then allows you to specify which sets of operations you want to enable: it might not make sense to add ID numbers, but it might make perfect sense to add counters, even if both are represented by integers. You might therefore using jss::strong_typedef<struct IdTag,unsigned,jss::strong_typedef_properties::equality_comparable> for an ID number, but jss::strong_typedef<struct IndexTag,unsigned,jss::strong_typedef_properties::comparable | jss::strong_typedef_properties::incrementable | jss::strong_typedef_properties::decrementable> for an index type.

I've implemented something similar to this class for various clients over the years, so I decided it was about time to make it publicly available. The implementation on github condenses all of the solutions to this problem that I've written over the years to provide a generic implementation.

Basic Usage

jss::strong_typedef takes three template parameters: Tag, ValueType and Properties.

The first (Tag) is a tag type. This is not used for anything other than to make the type unique, and can be incomplete. Most commonly, this is a class or struct declared directly in the template parameter, and nowhere else, as in the examples struct IdTag and struct IndexTag above.

The second (ValueType) is the underlying type of the strong typedef. This is the basic type that you would otherwise be using.

The third (Properties) is an optional parameter that specifies the operations you wish the strong typedef to support. By default it is jss::strong_typedef_properties::none — no operations are supported. See below for a full list.

Declaring Types

You create a typedef by specifying these parameters:

using type1=jss::strong_typedef<struct type1_tag,int>;
using type2=jss::strong_typedef<struct type2_tag,int>;
using type3=jss::strong_typedef<struct type3_tag,std::string,
    jss::strong_typedef_properties::comparable>;

type1, type2 and type3 are now separate types. They cannot be implicitly converted to or from each other or anything else.

Creating Values

If the underlying type is default-constructible, then so is the new type. You can also construct the objects from an object of the wrapped type:

type1 t1;
type2 t2(42);
// type2 e2(t1); // error, type1 cannot be converted to type2

Accessing the Value

strong_typedef can wrap built-in or class type, but that's only useful if you can access the value. There are two ways to access the value:

  • Cast to the stored type: static_cast<unsigned>(my_channel_index)
  • Use the underlying_value member function: my_channel_index.underlying_value()

Using the underlying_value member function returns a reference to the stored value, which can thus be used to modify non-const values, or to call member functions on the stored value without taking a copy. This makes it particularly useful for class types such as std::string.

using transaction_id=jss::strong_typedef<struct transaction_tag,std::string>;

bool is_a_foo(transaction_id id){
    auto& s=id.underlying_value();
    return s.find("foo")!=s.end();
}

Other Operations

Depending on the properties you've assigned to your type you may be able to do other operations on that type, such as compare a == b or a < b, increment with ++a, or add two values with a + b. You might also be able to hash the values with std::hash<my_typedef>, or write them to a std::ostream with os << a. Only the behaviours enabled by the Properties template parameter will be available on any given type. For anything else, you need to extract the wrapped value and use that.

Examples

IDs

An ID of some description might essentially be a number, but it makes no sense to perform much in the way of operations on it. You probably want to be able to compare IDs, possibly with an ordering so you can use them as keys in a std::map, or with hashing so you can use them as keys in std::unordered_map, and maybe you want to be able to write them to a stream. Such an ID type might be declared as follows:

using widget_id=jss::strong_typedef<struct widget_id_tag,unsigned long long,
    jss::strong_typedef_properties::comparable |
    jss::strong_typedef_properties::hashable |
    jss::strong_typedef_properties::streamable>;

using froob_id=jss::strong_typedef<struct froob_id_tag,unsigned long long,
    jss::strong_typedef_properties::comparable |
    jss::strong_typedef_properties::hashable |
    jss::strong_typedef_properties::streamable>;

Note that froob_id and widget_id are now different types due to the different tags used, even though they are both based on unsigned long long. Therefore any attempt to use a widget_id as a froob_id or vice-versa will lead to a compiler error. It also means you can overload on them:

void do_stuff(widget_id my_widget);
void do_stuff(froob_id my_froob);

widget_id some_widget(421982);
do_stuff(some_widget);

Alternatively, an ID might be a string, such as a purchase order number of transaction ID:

using transaction_id=jss::strong_typedef<struct transaction_id_tag,std::string,
    jss::strong_typedef_properties::comparable |
    jss::strong_typedef_properties::hashable |
    jss::strong_typedef_properties::streamable>;

transaction_id some_transaction("GBA283-HT9X");

That works too, since strong_typedef can wrap any built-in or class type.

Indexes

Suppose you have a device that supports a number of channels, so you want to be able to retrieve the data for a given channel. Each channel yields a number of data items, so you also want to access the data items by index. You could use strong_typedef to wrap the channel index and the data item index, so they can't be confused. You can also make the index types incrementable and decrementable so they can be used in a for loop:

using channel_index=jss::strong_typedef<struct channel_index_tag,unsigned,
    jss::strong_typedef_properties::comparable |
    jss::strong_typedef_properties::incrementable |
    jss::strong_typedef_properties::decrementable>;

using data_index=jss::strong_typedef<struct data_index_tag,unsigned,
    jss::strong_typedef_properties::comparable |
    jss::strong_typedef_properties::incrementable |
    jss::strong_typedef_properties::decrementable>;

Data get_data_item(channel_index channel,data_index item);
data_index get_num_items(channel_index channel);
void process_data(Data data);

void foo(){
    channel_index const num_channels(99);
    for(channel_index channel(0);channel<num_channels;++channel){
        data_index const num_data_items(get_num_items(channel));
        for(data_index item(0);item<num_data_items;++item){
            process_data(get_data_item(channel,item));
        }
    }
}

The compiler will complain if you pass the wrong parameters, or compare the channel against the item.

Behaviour Properties

The Properties parameter specifies behavioural properties for the new type. It must be one of the values of jss::strong_typedef_properties, or a value obtained by or-ing them together (e.g. jss::strong_typedef_properties::hashable | jss::strong_typedef_properties::streamable | jss::strong_typedef_properties::comparable). Each property adds some behaviour. The available properties are:

  • equality_comparable => Can be compared for equality (st==st2) and inequality (st!=st2)
  • pre_incrementable => Supports preincrement (++st)
  • post_incrementable => Supports postincrement (st++)
  • pre_decrementable => Supports predecrement (--st)
  • post_decrementable => Supports postdecrement (st--)
  • addable => Supports addition (st+value, value+st, st+st2) where the result is convertible to the underlying type. The result is a new instance of the strong typedef.
  • subtractable => Supports subtraction (st-value, value-st, st-st2) where the result is convertible to the underlying type. The result is a new instance of the strong typedef.
  • ordered => Supports ordering comparisons (st<st2, st>st2, st<=st2, st>=st2)
  • mixed_ordered => Supports ordering comparisons where only one of the values is a strong typedef
  • hashable => Supports hashing with std::hash
  • streamable => Can be written to a std::ostream with operator<<
  • incrementable => pre_incrementable | post_incrementable
  • decrementable => pre_decrementable | post_decrementable
  • comparable => ordered | equality_comparable

Guideline and Implementation

I strongly recommend using strong_typedef or an equivalent implementation anywhere you would otherwise reach for a built-in or library type such as int or std::string when designing an interface.

My strong_typedef implementation is available on github under the Boost Software License.

Posted by Anthony Williams
[/ cplusplus /] permanent link
Tags: , ,
Stumble It! stumbleupon logo | Submit to Reddit reddit logo | Submit to DZone dzone logo

Comment on this post

If you liked this post, why not subscribe to the RSS feed RSS feed or Follow me on Twitter? You can also subscribe to this blog by email using the form on the left.

ACCU 2019 Slides and Trip Report

Monday, 22 April 2019

I attended ACCU 2019 a couple of weeks ago, where I was presenting my session Here's my number; call me, maybe. Callbacks in a multithreaded world.

The conference proper started on Wednesday, after a day of pre-conference workshops on the Tuesday, and continued until Saturday. I was only there Wednesday to Friday.

Wednesday

I didn't arrive until Wednesday lunchtime, so I missed the first keynote and morning sessions. I did, however get to see Ivan Čukić presenting his session on Ranges for distributed and asynchronous systems. This was an interesting talk that covered similar ground to things I've thought about before. It was good to see Ivan's take, and think about how it differed to mine. It was was also good to see how modern C++ techniques can produce simpler code than I had when I thought about this a few years ago. Ivan's approach is a clean design for pipelined tasks that allows implicit parallelism.

After the break I then went to Gail Ollis's presentation and workshop on Helping Developers to Help Each Other . Gail shared some of her research into how developers feel about various aspects of software development, from the behaviour of others to the code that they write. She then got us to try one of the exercises she talked about in small groups. By picking developer behaviours from the cards she provided to each group, and telling stories about how that behaviour has affected us, either positively or negatively, we can share our experiences, and learn from each other.

Thursday

First up on Thursday was Herb Sutter's keynote: De-fragmenting C++: Making exceptions more affordable and usable . Herb was eloquent as always, talking about his idea for making exceptions in C++ lower cost, so that they can be used in all projects: a significant number of projects currently ban exceptions from at least some of their code. I think this is a worthwhile aim, and hope to see something like Herb's ideas get accepted for C++ in a future standard.

Next up was my session, Here's my number; call me, maybe. Callbacks in a multithreaded world. It was well attended, with interesting questions from the audience. My slides are available here, and the video is available on youtube. Several people came up to me later in the conference to say that they had enjoyed my talk, and that they thought it would be useful for them in their work, which pleased me no end: this is what I always hope to achieve from my presentations.

Thursday lunchtime was taken up with book signings. I was one of four authors of recently-published programming books set up in the conservatory area of the hotel to sell copies of our books, and sign books for people. I sold plenty, and signed more, which was great.

Kate Gregory's talk on What Do We Mean When We Say Nothing At All? was after lunch. She discussed the various places in C++ where we can choose to specify something (such as const, virtual, or explicit), but we don't have to. Can we interpret meaning from the lack of an annotation? If your codebase uses override everywhere, except in one place, is that an accidental omission, or is it a flag to say "this isn't actually an override of the base class function"? Is it a good or bad idea to omit the names of unused parameters? There was a lot to think about with this talk, but the key takeaway for me is Consistency is Key: if you are consistent in your use of optional annotations, then deviation from your usual pattern can convey meaning to the reader, whereas if you are inconsistent then the reader cannot infer anything.

The final session I attended on Thursday was the C++ Pub Quiz, which was hosted by Felix Petriconi. The presented code was intended to confuse, and elicit exclamations of "WTF!", and succeeded on both counts. However, it was fun as ever, helped by the free drinks, and the fact that my team "Ungarian Notation" were the eventual winners.

Friday

Friday was the last day of the conference for me (though there the conference had another full day on Saturday). It started with Paul Grenyer's keynote on the trials and tribulations of trying to form a "community" for developers in Norwich, with meet-ups and conferences. Paul managed to be entertaining, but having followed Paul's blog for a few years, there wasn't anything that was new to me.

Interactive C++ : Meet Jupyter / Cling - The data scientist's geeky younger sibling was the next session I attended, presented by Neil Horlock. This was an interesting session about cling, a C++ interpreter, complete with a REPL, and how this can be combined with Jupyter notebooks to create a wiki with embedded code that you can edit and run. Support for various libraries allows to write code to plot graphs and maps and things, and have the graphs appear right there in the web page immediately. This is an incredibly powerful tool, and I had discussions with people afterwards about how this could be used both as an educational tool, and for "live" documentation and customer-facing tests: "here is sample code, try it out right now" is an incredibly powerful thing to be able to say.

After lunch I went to see Andreas Weis talk about Taming Dynamic Memory - An Introduction to Custom Allocators. This was a good introduction to various simple allocators, along with how and why you might use them in your C++ code. With John Lakos in the front row, Andreas had to field many questions. I had hoped for more depth, but I thought the material was well-paced, and so there wouldn't have been time; that would have been quite a different presentation, and less of an "introduction".

The final session I attended was Elsewhere Memory by Niall Douglas. Niall talked about the C++ object model, and how that can cause difficulties for code that wants to serialize the binary representation of objects to disk, or over the network, or wants to directly share memory with another process. Niall is working on a standardization proposal which would allow creating objects "fully formed" from a binary representation, without running a constructor, and would allow terminating the lifetime of an object without running its destructor. This is a difficult area as it interacts with compilers' alias analysis and the normal deterministic lifetime rules. However, this is an area where people currently do have "working" code that violates the strict lifetime rules of the standard, so it would be good to have a way of making such code standards-conforming.

Between the Sessions

The sessions at a conference at ACCU are great, and I always enjoy attending them, and often learn things. However, you can often watch these on Youtube later. One of the best parts of physically attending a conference is the discussions had in person before and after the sessions. It is always great to chat to people in person who you primarily converse with via email, and it is exciting to meet new people.

The conference tries to encourage attendees to be open to new people joining discussions with the "Pacman rule" — don't form a closed circle when having a discussion, but leave a space for someone to join. This seemed to work well in practice.

I always have a great time at ACCU conferences, and this one was no different.

Posted by Anthony Williams
[/ news /] permanent link
Tags: , , , ,
Stumble It! stumbleupon logo | Submit to Reddit reddit logo | Submit to DZone dzone logo

Comment on this post

If you liked this post, why not subscribe to the RSS feed RSS feed or Follow me on Twitter? You can also subscribe to this blog by email using the form on the left.

"The Developers" 2019 presentation and book signing

Monday, 01 April 2019

I will be presenting "Concurrency in C++20 and beyond" at The Developers 2019 in Romania on 23rd May 2019. Here is the abstract of my talk:

C++20 is set to add new facilities to make writing concurrent code easier. Some of them come from the previously published Concurrency TS, and others are new, but they all make our lives as developers easier. This talk will introduce the new features, and explain how and why we should use them.

The evolution of the C++ Concurrency support doesn't stop there though: the committee has a continuous stream of new proposals. This talk will also introduce some of the most important of these, including the new Executor model.

I will also be signing copies of the second edition of my book C++ Concurrency In Action now that it is finally in print.

I look forward to seeing you there!

Posted by Anthony Williams
[/ news /] permanent link
Tags: , , ,
Stumble It! stumbleupon logo | Submit to Reddit reddit logo | Submit to DZone dzone logo

Comment on this post

If you liked this post, why not subscribe to the RSS feed RSS feed or Follow me on Twitter? You can also subscribe to this blog by email using the form on the left.

Get the element index when iterating with an indexed_view

Monday, 25 March 2019

One crucial difference between using an index-based for loop and a range-based for loop is that the former allows you to use the index for something other than just identifying the element, whereas the latter does not provide you with access to the index at all.

The difference between index-based for loops and range-based for loops means that some people are unable to use simple range-based for loops in some cases, because they need the index.

For example, you might be initializing a set of worker threads in a thread pool, and each thread needs to know it's own index:

std::vector<std::thread> workers;

void setup_workers(unsigned num_threads){
    workers.resize(num_threads);
    for(unsigned i=0;i<num_threads;++i){
        workers[i]=std::thread(&my_worker_thread_func,i);
    }
}

Even though workers has a fixed size in the loop, we need the loop index to pass to the thread function, so we cannot use range-based for. This requires that we duplicate num_threads, adding the potential for error as we must ensure that it is correctly updated in both places if we ever change it.

jss::indexed_view to the rescue

jss::indexed_view provides a means of obtaining that index with a range-based for loop: it creates a new view range which wraps the original range, where each element holds the loop index, as well as a reference to the element of the original range.

With jss::indexed_view, we can avoid the duplication from the previous example and use the range-based for:

std::vector<std::thread> workers;

void setup_workers(unsigned num_threads){
    workers.resize(num_threads);
    for(auto entry: jss::indexed_view(workers)){
        entry.value=std::thread(&my_worker_thread_func,entry.index);
    }
}

As you can see from this example, the value field is writable: it is a reference to the underlying value if the iterator on the source range is a reference. This allows you to use it to modify the elements in the source range if they are non-const.

jss::indexed_view also works with iterator-based ranges, so if you have a pair of iterators, then you can still use range-based for loops. For example, the following code processes the elements up to the first zero in the supplied vector, or the whole vector if there is no zero.

void foo(std::vector<int> const& v){
    auto end=std::find(v.begin(),v.end(),0);
    for(auto entry: jss::indexed_view(v.begin(),end)){
        process(entry.index,entry.value);
    }
}

Finally, jss::indexed_view can also be used with algorithms that require iterator-based ranges, so our first example could also be written as:

std::vector<std::thread> workers;

void setup_workers(unsigned num_threads){
    workers.resize(num_threads);
    auto view=jss::indexed_view(workers);
    std::for_each(view.begin(),view.end(),[](auto entry){
        entry.value=std::thread(&my_worker_thread_func,entry.index);
    });
}

Final words

Having to use non-ranged for loop to get the loop index introduces a potential source of error: it is easy to mistype the loop index either in the for-loop header, or when using it to get the indexed element, especially in nested loops.

By using jss::indexed_view to wrap the range, you can eliminate this particular source of error, as well as making it clear that you are iterating across the entire range, and that you need the index.

Get the source from github and use it in your project now.

Posted by Anthony Williams
[/ cplusplus /] permanent link
Tags: , , , , ,
Stumble It! stumbleupon logo | Submit to Reddit reddit logo | Submit to DZone dzone logo

Comment on this post

If you liked this post, why not subscribe to the RSS feed RSS feed or Follow me on Twitter? You can also subscribe to this blog by email using the form on the left.

object_ptr - a safer replacement for raw pointers

Thursday, 21 March 2019

Yesterday I uploaded my object_ptr<T> implementation to github under the Boost Software License.

This is an implementation of a class similar to std::experimental::observer_ptr<T> from the Library Fundamentals TS 2, but with various improvements suggested in WG21 email discussions of the feature.

The idea of std::experimental::observer_ptr<T> is that it provides a pointer-like object that does not own the pointee, and thus can be used in place of a raw pointer, but does not allow pointer arithmetic or use of delete, or pointer arithmetic, so is not as dangerous as a raw pointer. Its use also serves as documentation: this object is owned elsewhere, so explicitly check the lifetime of the pointed-to object — there is nothing to prevent a dangling std::experimental::observer_ptr<T>.

My implementation of this concept has a different name (object_ptr<T>). I feel that observer_ptr is a bad name, because it conjures up the idea of the Observer pattern, but it doesn't really "observe" anything. I believe object_ptr is better: it is a pointer to an object, so doesn't have any array-related functionality such as pointer arithmetic, but it doesn't tell you anything about ownership.

It also has slightly different semantics to std::experimental::observer_ptr: it allows incoming implicit conversions, and drops the release() member function. The implicit conversions make it easier to use as a function parameter, without losing any safety, as you can freely pass a std::shared_ptr<T>, std::unique_ptr<T>, or even a raw pointer to a function accepting object_ptr<T>. It makes it much easier to use jss::object_ptr<T> as a drop-in replacement for T* in function parameters. There is nothing you can do with a jss::object_ptr<T> that you can't do with a T*, and in fact there is considerably less that you can do: without explicitly requesting the stored T*, you can only use it to access the pointed-to object, or compare it with other pointers. The same applies with std::shared_ptr<T> and std::unique_ptr<T>: you are reducing functionality, so this is safe, and reducing typing for safe operations is a good thing.

I strongly recommend using object_ptr<T> or an equivalent implementation of the observer_ptr concept anywhere you have a non-owning raw pointer in your codebase that points to a single object.

If you have a raw pointer that does own its pointee, then I would strongly suggest finding a smart pointer class to use as a wrapper to encapsulate that ownership. For example, std::unique_ptr or std::shared_ptr with a custom deleter might well do the job.

Posted by Anthony Williams
[/ cplusplus /] permanent link
Tags: , ,
Stumble It! stumbleupon logo | Submit to Reddit reddit logo | Submit to DZone dzone logo

Comment on this post

If you liked this post, why not subscribe to the RSS feed RSS feed or Follow me on Twitter? You can also subscribe to this blog by email using the form on the left.

Begin and End with range-based for loops

Saturday, 23 February 2019

On slack the other day, someone mentioned that lots of companies don't use range-based for loops in their code because they use PascalCase identifiers, and their containers thus have Begin and End member functions rather than the expected begin and end member functions.

Having recently worked in a codebase where this was the case, I thought it would be nice to provide a solution to this problem.

The natural solution would be to provide global overloads of the begin and end functions: these are always checked by range-based for if the member functions begin() and end() are not found. However, when defining global function templates, you need to be sure that they are not too greedy: you don't want them to cause ambiguity in overload resolution or be picked in preference to std::begin or std::end.

My first thought was to jump through metaprogramming hoops checking for Begin() and End() members that return iterators, but then I thought that seemed complicated, so looked for something simpler to start with.

The simplest possible solution is just to declare the functions the same way that std::begin() and std::end() are declared:

template <class C> constexpr auto begin(C &c) -> decltype(c.Begin()) {
    return c.Begin();
}
template <class C> constexpr auto begin(const C &c) -> decltype(c.Begin()) {
    return c.Begin();
}

template <class C> constexpr auto end(C &c) -> decltype(c.End()) {
    return c.End();
}
template <class C> constexpr auto end(const C &c) -> decltype(c.End()) {
    return c.End();
}

Initially I thought that this would be too greedy, and cause problems, but it turns out this is fine.

The use of decltype(c.Begin()) triggers SFINAE, so only types which have a public member named Begin which can be invoked with empty parentheses are considered; for anything else these functions are just discarded and not considered for overload resolution.

The only way this is likely to be a problem is if the user has also defined a begin free function template for a class that has a suitable Begin member, in which case this would potentially introduce overload resolution ambiguity. However, this seems really unlikely in practice: most such function templates will end up being a better match, and any non-template functions are almost certainly a better match.

So there you have it: in this case, the simplest solution really is good enough! Just include this header and you're can freely use range-based for loops with containers that use Begin() and End() instead of begin() and end().

Posted by Anthony Williams
[/ cplusplus /] permanent link
Tags: , ,
Stumble It! stumbleupon logo | Submit to Reddit reddit logo | Submit to DZone dzone logo

Comment on this post

If you liked this post, why not subscribe to the RSS feed RSS feed or Follow me on Twitter? You can also subscribe to this blog by email using the form on the left.

ACCU 2019 presentation and book signing

Thursday, 21 February 2019

The ACCU 2019 conference is running from 9th-13 April 2019, in Bristol, UK.

This year I will be presenting "Here's my number; call me, maybe. Callbacks in a multithreaded world" on 11th April. The abstract is:

A common pattern in multithreaded applications is the use of callbacks, continuations and task pipelines to divide the processing of data across threads. This has the benefit of ensuring that threads can quickly move on to further processing, and can minimize blocking waits, since tasks are only scheduled when there is work to be done.

The downside is that they can weave a tangled web of connections, and managing object lifetimes can now become complicated.

This presentation will look at ways of managing this complexity and ensuring that your code is as clear as possible, and there is no possibility of dangling references or leaked objects.

I will also be signing copies of the second edition of my book C++ Concurrency In Action now that it is finally in print.

I look forward to seeing you there!

Posted by Anthony Williams
[/ news /] permanent link
Tags: , , ,
Stumble It! stumbleupon logo | Submit to Reddit reddit logo | Submit to DZone dzone logo

Comment on this post

If you liked this post, why not subscribe to the RSS feed RSS feed or Follow me on Twitter? You can also subscribe to this blog by email using the form on the left.

Previous Entries

Design and Content Copyright © 2005-2019 Just Software Solutions Ltd. All rights reserved. | Privacy Policy