Blog Archive for / cplusplus /

New C++ Working Draft and Concurrency Papers Now Available

Wednesday, 02 July 2008

The post-meeting mailing following June's C++ Standards committee meeting in France is now available. This includes a new Working Draft for the C++0x standard, and a few concurrency-related papers.

From a concurrency point of view, there are several papers of interest. Firstly, a few have been accepted into the working draft, notably:

N2661: A Foundation to Sleep On
This paper provides a generalised time point and duration library, which is used by the thread functions that take times or durations. These have been updated to use these new types and renamed to make their purpose clearer: functions that wait for a duration are now called xxx_for, and take a value of type std::chrono::duration<Rep,Period>, whereas those that take absolute time points are now called xxx_until and take a value of type std::chrono::time_point<Clock,Duration>.
N2668: Concurrency Modifications to Basic String
The changes in this paper ensure that it is safe for two threads to access the same std::string object at the same time, provided they both perform only read operations. They also ensure that copying a string object and then modifying that copy is safe, even if another thread is accessing the original. This essentially disallows copy-on-write implementations since the benefits are now severely limited.
N2660: Dynamic Initialization and Destruction with Concurrency
With the changes from this paper, if an application uses multiple threads then the initialization and destruction of objects with static storage duration (such as global variables) may run concurrently on separate threads. This can provide faster start-up and shut-down times for an application, but it can also introduce the possibility of race conditions where none existed previously. If you use threads in your application, it is now even more important to check the initialization order of objects with static storage duration.
N2514: Implicit Conversion Operators for Atomics
With this change, the atomic types such as std::atomic_int are implicitly convertible to their corresponding fundamental types. This means, for example, that:
std::atomic_int x;
int y=x;
is well-formed where it wasn't previously. The implicit conversions are equivalent to calling the load() member function, and have memory_order_seq_cst ordering semantics.
N2674: Shared_ptr atomic access, revision 1
This paper introduces a new set of overloads of the free functions for atomic operations (such as atomic_load and atomic_store), which operate on instances of std::shared_ptr<>. This allows one thread to read an instance of std::shared_ptr whilst another thread is modifying that same instance if they both use the new atomic functions.
This paper also renames atomic_swap operations to atomic_exchange (and likewise for atomic_compare_swap and the corresponding member functions) for all atomic types, in order to avoid confusion with other types that provide swap functions. The atomic exchange operations only alter the value of a single object, replacing the old value with a new one, they do not exchange the values of two objects in the way that std::swap does.
N2664: C++ Data-Dependency Ordering: Atomics and Memory Model
With the adoption of this paper the memory model gets a new ordering option: memory_order_consume. This is a limited form of memory_order_acquire which allows for data-dependent ordering. If a thread uses memory_order_consume, then it is not guaranteed to see modifications to other variables made by the thread that performed the releasing operation unless those variables are accessed in conjunction with the consumed variable. This means, for example, that member variables of an object are visible if the consumed value is a pointer to that object, but that values of independent objects are not necessarily visible. This allows the compiler to perform some optimizations that are forbidden by memory_order_acquire, and reduces the synchronization overhead on some hardware architectures.
N2678: Error Handling Specification for Chapter 30 (Threads)
This paper brings the exceptions thrown by the thread under the new system_error umbrella, with corresponding error codes and error categories.
N2669: Thread-Safety in the Standard Library (Rev 2)
Now the standard supports threads, we need to say which standard library operations are thread-safe, and which are not. This paper basically says that non-modifying operations on the same object are safe, and any operations on separate objects are also safe. Also, separate threads may call the same library functions on separate objects without problems. As you might expect, concurrent modifications to the same object are data races and undefined behaviour.

The committee also voted to include N2659: Thread-Local Storage in C++0x, but it doesn't appear to be in the current draft. This paper introduces the thread_local keyword to indicate that each thread should have its own copy of a given object.

Finally, N2657: Local and Unnamed Types as Template Arguments has been incorporated in the working paper. Though this isn't directly concurrency related, it is something I've been campaigning for since N1427 back in 2003.

Apart from N2657, I've only listed the concurrency changes: check out the Working Draft for the C++0x standard, and the State of C++ Evolution for more details on the changes.

Posted by Anthony Williams
[/ cplusplus /] permanent link
Tags: ,
Digg This | Save to del.icio.us | Stumble It! | Submit to Reddit | Submit to DZone

The C++ Performance TR is now publicly available

Wednesday, 08 August 2007

The C++ Performance TR is a Technical Report issued by the C++ Standards committee detailing various factors that affect the performance of a program written in C++.

This includes information on various strategies of implementing aspects of the language, along with their consequences for executable size and timing, as well as suggestions on how to write efficient code. It also includes information on use of C++ in embedded systems, including a discussion of putting constant data in ROM and direct access to hardware registers.

Whether you're a compiler writer, library writer or application developer, this is well worth a look. Download a copy from the ISO website today.

Posted by Anthony Williams
[/ cplusplus /] permanent link
Digg This | Save to del.icio.us | Stumble It! | Submit to Reddit | Submit to DZone

Chaining Constructors in C++

Monday, 18 June 2007

Chain Constructors is one of the refactorings from Refactoring to Patterns (page 340) designed to reduce duplication — in this case duplication between constructors. Unfortunately, it is not such a straight-forward refactoring in C++, since in the current Standard it is not possible for one constructor to delegate to another.

The proposal to the C++ committee to support this feature in the next C++ Standard has been accepted, but the next Standard won't be ready until 2009, with implementations available sometime after that. If you've got a problem in your current project for which this is an appropriate refactoring then two years or more is a bit long too wait. So, with that in mind, I'm posting my work-around here for those that would like this feature now.

Adding a layer of redirection

All problems in Software can be solved by adding a layer of redirection, and this is no exception. In this case, we add a level of redirection between the class and its data by wrapping the data in a private struct. Every constructor of the original class then delegates to the constructor(s) of the internal struct. I'll illustrate with one of the examples from the Standardization proposal:

class X
{
    struct internal
    {
        internal( int, W& );
        ~internal();
        Y y_;
        Z z_;
    } self;
public:
    X();
    X( int );
    X( W& );
};

X::internal::internal( int i, W& e ):
    y_(i), z_(e)
{
    /*Common Init*/
}

X::X():
    self( 42, 3.14 )
{
    SomePostInitialization();
}

X::X( int i ):
    self( i, 3.14 )
{
    OtherPostInitialization();
}

X::X( W& w ):
    self( 53, w )
{ /* no post-init */ }

X x( 21 ); // if the construction of y_ or z_ throws, internal::~internal is invoked

Every constructor of class X has to initialize the sole data member self, the constructor of which encapsulates all the common initialization. Each delegating constructor is then free to do any additional initialization required.

Within the member functions of X, all references to member data now have to be prefixed with self., but that's not too bad a price — it makes it clear that this is member data, and is analagous to the use of this->, or the m_ prefix.

This simple solution only provides for a single layer of delegation — multiple layers of delegation would require multiple layers of nested structs, but it does provide full support at that level.

pimpls and Compilation Firewalls

Once the data has been encapsulated in a private structure, a further step worth considering is a move to the use of a pointer to the internal structure, also known as the pimpl idiom, or the use of a compilation firewall. By so doing, all that is required in the class definition is a forward declaration of the internal class, rather than a full definition. The full definition is then provided in the implementation file for the enclosing class. This eliminates any dependencies on the internal data from other classes, at the cost of forcing the data to be heap allocated. It also removes the possibility of any operations on the enclosing class being inline. For further discussion on the pimpl idiom, see Herb Sutter's Guru of the Week entry.

Refactoring steps

Here's a quick summary of the steps needed to perform this refactoring:

  1. Create a private struct named internal in the class X being refactored with an identical set of data members to class X.
  2. Create a data member in class X of type internal named self, and remove all other data members.
  3. For each constructor of X, write a constructor of internal that mirrors the member-initializer list, and replace the member initializer list of that constructor with a single initialization of self that forwards the appropriate constructor parameters.
  4. Replace every reference to a data member y of class X to a reference to self.y.
  5. Eliminate duplication.

Posted by Anthony Williams
[/ cplusplus /] permanent link
Digg This | Save to del.icio.us | Stumble It! | Submit to Reddit | Submit to DZone

Vector-based Bowling Scores in C++

Tuesday, 02 May 2006

Introduction

In one of his recent articles, Ron Jeffries looks at a vector-based method of calculating scores for Tenpin bowling. He uses Ruby, and makes two comments which inspired me to redo it in C++.

First, he says:

To begin with, did you notice that every method is only one line long? That's a characteristic of good Smalltalk code, and good Ruby code, in my opinion. You can't get there in Java or C# -- much less C++

I disagree — good C++ will have short functions, and many will only be one line. This is borne out by my sample code.

Ron also says:

I'm not sure what would happen if we tried to build a vector-oriented solution in Java or C#. I'm sure it would be larger, and that we'd have to put a bunch of code in "our" classes instead of in Array and Object. On the other hand, I'm confident that we could have our score method look very much like the one we have here, processing the vectors "all at once" in a very similar way.

I'm using C++ rather than Java or C#, but I hope Ron will still be interested. I did write versions of accumulate that accept a whole vector at once, rather than requiring a pair of iterators, but otherwise it's all idiomatic C++.

Discussion

As you can see from the code (below), there is only one functions which is more than one line long. This is Game::frame_starts, and the reason for this is the lack of the built-in numeric range generation one gets with Ruby's 1..9 syntax. With such a mechanism, we could make it similar to Game::frame_scores.

Given the range-based versions of accumulate I've added, we could add make_numeric_range to return a range containing the required integers, so Game::frame_starts could be made one line, but the code for this would end up being longer than Game::frame_starts is at the moment.

Of course, as Game::frame_scores demonstrates, the code has to go somewhere — in the case of Game::frame_scores, this is the nested class Game::AddFrameScore, and its operator(). If C++ had lambdas (which hopefully it will, when the next standard comes out), then we could include this code directly in the call to std::accumulate, but as it is, we need a whole new class. Member classes don't have the access to the instance members of their parent class that Java's inner classes enjoy, so we need to keep the reference to the Game object explicitly.

At 75 lines, including blank lines, the code is only marginally longer than the 72 lines for Ron's Ruby version, and just as clear, to my mind.

The code

Here it is, in all its glory. First the implementation:

#include <numeric>
#include <vector>
    
template<typename Range,typename Accum>
Accum accumulate(Range const& range,Accum initial) {
    return std::accumulate(range.begin(),range.end(),initial);
}

template<typename Range,typename Accum,typename Pred>
Accum accumulate(Range const& range,Accum initial,Pred pred) {
    return std::accumulate(range.begin(),range.end(),initial,pred);
}

class Game {
    std::vector<unsigned> rolls;

    struct AddFrameScore {
        Game const& game;
    
        explicit AddFrameScore(Game const& game_):
            game(game_)
        {}
    
        std::vector<unsigned> operator()(std::vector<unsigned> res,unsigned first_roll) {
            return res.push_back(game.frame_score(first_roll)), res;
        }
    };
public:
    static unsigned const frame_count=10;

    template<unsigned Size>
    explicit Game(unsigned const(& rolls_)[Size]):
        rolls(rolls_,rolls_+Size)
    {}

    unsigned score() const {
        return accumulate(frame_scores(),0);
    }

    unsigned is_strike(unsigned first_roll) const {
        return rolls[first_roll]==10;
    }
    
    unsigned is_spare(unsigned first_roll) const {
        return (rolls[first_roll]+rolls[first_roll+1])==10;
    }

    unsigned is_mark(unsigned first_roll) const {
        return is_strike(first_roll) || is_spare(first_roll);
    }

    unsigned rolls_to_score(unsigned first_roll) const {
        return is_mark(first_roll)?3:2;
    }

    unsigned rolls_in_frame(unsigned first_roll) const {
        return is_strike(first_roll)?1:2;
    }

    unsigned frame_score(unsigned first_roll) const {
        return std::accumulate(&rolls[first_roll],&rolls[first_roll+rolls_to_score(first_roll)],0);
    }

    std::vector<unsigned> frame_starts() const {
        std::vector<unsigned> res;
        for(unsigned i=0;res.size()<frame_count;i+=rolls_in_frame(i)) {
            res.push_back(i);
        }
        return res;
    }

    std::vector<unsigned> frame_scores() const {
        return ::accumulate(frame_starts(),std::vector<unsigned>(),AddFrameScore(*this));
    }
};

And now the tests:

#include <algorithm>
#include <iostream>

#define ASSERT_EQUALS(lhs,rhs)                                          \
    {                                                                   \
        if(lhs!=rhs) {                                                  \
            std::cerr<<__FILE__<<": "<<__LINE__                         \
                     <<": Error: Assertion failed: " #lhs "==" #rhs ", lhs=" \
                     <<lhs<<", rhs="<<rhs<<std::endl;                   \
        }                                                               \
    }

template<typename T>
std::ostream& operator<<(std::ostream& os,std::vector<T> const& vec) {
    os<<"{";
    for(typename std::vector<T>::const_iterator it=vec.begin(),
            end=vec.end();
        it!=end;
        ++it) {
        os<<*it<<",";
    }
    return os<<"}";
}

template<unsigned Size>
std::ostream& operator<<(std::ostream& os,unsigned const (& vec)[Size]) {
    os<<"{";
    for(unsigned i=0;i<Size;++i) {
        os<<vec[i]<<",";
    }
    return os<<"}";
}


template<typename LhsRange,unsigned RhsRangeSize>
bool operator!=(LhsRange const& lhs,unsigned const(& rhs)[RhsRangeSize]) {
    return (std::distance(lhs.begin(),lhs.end()) != RhsRangeSize) ||
        !std::equal(lhs.begin(),lhs.end(),rhs);
}

int main() {
    unsigned const full_game_rolls=20;
    unsigned const all_zeros[full_game_rolls]={0};
    ASSERT_EQUALS(Game(all_zeros).score(),0);
    unsigned const all_open[]={1,2,2,6,3,2,4,1,5,4,6,0,7,2,8,0,9,0,0,2};
    ASSERT_EQUALS(Game(all_open).score(),64);
    unsigned const all_open_frame_starts[]={0,2,4,6,8,10,12,14,16,18};
    ASSERT_EQUALS(Game(all_open).frame_starts(),all_open_frame_starts);
    for(unsigned i=0;i<full_game_rolls;i+=2) {
        ASSERT_EQUALS(Game(all_open).rolls_to_score(i),2);
    }
    unsigned const spare[full_game_rolls]={6,4,6,2};
    ASSERT_EQUALS(Game(spare).rolls_to_score(0),3);
    ASSERT_EQUALS(Game(spare).rolls_to_score(2),2);
    unsigned const all_open_frame_scores[Game::frame_count]={3,8,5,5,9,6,9,8,9,2};
    ASSERT_EQUALS(Game(all_open).frame_scores(),all_open_frame_scores);
    unsigned const spare_frame_scores[Game::frame_count]={16,8};
    ASSERT_EQUALS(Game(spare).frame_scores(),spare_frame_scores);
    ASSERT_EQUALS(Game(spare).score(),24);

    unsigned const strike[full_game_rolls-1]={10,6,2};
    ASSERT_EQUALS(Game(strike).rolls_to_score(0),3);
    ASSERT_EQUALS(Game(strike).rolls_to_score(1),2);
    unsigned const strike_frame_starts[]={0,1,3,5,7,9,11,13,15,17};
    ASSERT_EQUALS(Game(strike).frame_starts(),strike_frame_starts);

    unsigned const alternating[]={10,1,9,10,1,9,10,1,9,10,1,9,10,1,9,10};
    ASSERT_EQUALS(Game(alternating).score(),200);

    unsigned const all_strikes[]={10,10,10,10,10,10,10,10,10,10,10,10};
    ASSERT_EQUALS(Game(all_strikes).score(),300);
    
}

Posted by Anthony Williams
[/ cplusplus /] permanent link
Digg This | Save to del.icio.us | Stumble It! | Submit to Reddit | Submit to DZone