Exceptions make for Elegant Code

Friday, 06 June 2008

On this week's Stack Overflow podcast, Joel comes out quite strongly against exceptions, on the basis that they are hidden flow paths. Whilst I can sympathise with the idea of making every possible control path in a routine explicitly visible, having just had to write some C code for a recent project I would really like to say that this actually makes the code a lot harder to follow, as the actual code for what it's really doing is hidden amongst a load of error checking.

Whether or not you use exceptions, you have the same number of possible flow paths. With exceptions, the code can be a lot cleaner than with exceptions, as you don't have to write a check after every function call to verify that it did indeed succeed, and you can now proceed with the rest of the function. Instead, the code tells you when it's gone wrong by throwing an exception.

Exceptions also simplify the function signature: rather than having to add an additional parameter to hold the potential error code, or to hold the function result (because the return value is used for the error code), exceptions allow the function signature to specify exactly what is appropriate for the task at hand, with errors being reported "out-of-band". Yes, some functions use errno, which helps by providing a similar out-of-band error channel, but it's not a panacea: you have to check and clear it between every call, otherwise you might be passing invalid data into subsequent functions. Also, it requires that you have a value you can use for the return type in the case that an error occurs. With exceptions you don't have to worry about either of these, as they interrupt the code at the point of the error, and you don't have to supply a return value.

Here's three implementations of the same function using error code returns, errno and exceptions:

    int foo_with_error_codes(some_type param1,other_type param2,result_type* result)
    {
        int error=0;
        intermediate_type temp;

        if((error=do_blah(param1,23,&temp)) ||
           (error=do_flibble(param2,temp,result))
        {
            return error;
        }
        return 0;
    }

    result_type foo_with_errno(some_type param1,other_type param2)
    {
        errno=0;
        intermediate_type temp=do_blah(param1,23);
        if(errno)
        {
            return dummy_result_type_value;
        }

        return do_flibble(param2,temp);
    }

    result_type foo_with_exceptions(some_type param1,other_type param2)
    {
        return do_flibble(param2,do_blah(param1,23));
    }

Error Recovery

In all three cases, I've assumed that there's no recovery required if do_blah succeeds but do_flibble fails. If recovery was required, additional code would be required. It could be argued that this is where the problems with exceptions begin, as the code paths for exceptions are hidden, and it is therefore unclear where the cleanup must be done. However, if you design your code with exceptions in mind I find you still get elegant code. try/catch blocks are ugly: this is where deterministic destruction comes into its own. By encapsulating resources, and performing changes in an exception-safe manner, you end up with elegant code that behaves gracefully in the face of exceptions, without cluttering the "happy path". Here's some code:

    int foo_with_error_codes(some_type param1,other_type param2,result_type* result)
    {
        int error=0;
        intermediate_type temp;

        if(error=do_blah(param1,23,&temp))
        {
            return error;
        }

        if(error=do_flibble(param2,temp,result))
        {
            cleanup_blah(temp);
            return error;
        }
        return 0;
    }

    result_type foo_with_errno(some_type param1,other_type param2)
    {
        errno=0;
        intermediate_type temp=do_blah(param1,23);
        if(errno)
        {
            return dummy_result_type_value;
        }

        result_type res=do_flibble(param2,temp);
        if(errno)
        {
            cleanup_blah(temp);
            return dummy_result_type_value;
        }
        return res;
    }

    result_type foo_with_exceptions(some_type param1,other_type param2)
    {
        return do_flibble(param2,do_blah(param1,23));
    }

    result_type foo_with_exceptions2(some_type param1,other_type param2)
    {
        blah_cleanup_guard temp(do_blah(param1,23));
        result_type res=do_flibble(param2,temp);
        temp.dismiss();
        return res;
    }

In the error code cases, we need to explicitly cleanup on error, by calling cleanup_blah. In the exception case we've got two possibilities, depending on how your code is structured. In foo_with_exceptions, everything is just handled directly: if do_flibble doesn't take ownership of the intermediate data, it cleans itself up. This might well be the case if do_blah returns a type that handles its own resources, such as std::string or boost::shared_ptr. If explicit cleanup might be required, we can write a resource management class such as blah_cleanup_guard used by foo_with_exceptions2, which takes ownership of the effects of do_blah, and calls cleanup_blah in the destructor unless we call dismiss to indicate that everything is going OK.

Real Examples

That's enough waffling about made up examples, let's look at some real code. Here's something simple: adding a new value to a dynamic array of DataType objects held in a simple dynamic_array class. Let's assume that objects of DataType can somehow fail to be copied: maybe they allocate memory internally, which may therefore fail. We'll also use a really dumb algorithm that reallocates every time a new element is added. This is not for any reason other than it simplifies the code: we don't need to check whether or not reallocation is needed.

If we're using exceptions, that failure will manifest as an exception, and our code looks like this:

class DataType
{
public:
    DataType(const DataType& other);
};

class dynamic_array
{
private:
    class heap_data_holder
    {
        DataType* data;
        unsigned initialized_count;

    public:
        heap_data_holder():
            data(0),initialized_count(0)
        {}
        explicit heap_data_holder(unsigned max_count):
            data((DataType*)malloc(max_count*sizeof(DataType))),
            initialized_count(0)
        {
            if(!data)
            {
                throw std::bad_alloc();
            }
        }
        void append_copy(DataType const& value)
        {
            new (data+initialized_count) DataType(value);
            ++initialized_count; 
        }
        void swap(heap_data_holder& other)
        {
            std::swap(data,other.data);
            std::swap(initialized_count,other.initialized_count);
        }
        unsigned get_count() const
        {
            return initialized_count;
        }
        ~heap_data_holder()
        {
            for(unsigned i=0;i<initialized_count;++i)
            {
                data[i].~DataType();
            }
            free(data);
        }
        DataType& operator[](unsigned index)
        {
            return data[index];
        }
        
    };

    heap_data_holder data;

    // no copying for now
    dynamic_array& operator=(dynamic_array& other);
    dynamic_array(dynamic_array& other);
public:
    dynamic_array()
    {}
    void add_element(DataType const& new_value)
    {
        heap_data_holder new_data(data.get_count()+1);
        for(unsigned i=0;i<data.get_count();++i)
        {
            new_data.append_copy(data[i]);
        }
        new_data.append_copy(new_value);
        new_data.swap(data);
    }
};

On the other, if we can't use exceptions, the code looks like this:

class DataType
{
public:
    DataType(const DataType& other);
    int get_error();
};

class dynamic_array
{
private:
    class heap_data_holder
    {
        DataType* data;
        unsigned initialized_count;
        int error_code;

    public:
        heap_data_holder():
            data(0),initialized_count(0),error_code(0)
        {}
        explicit heap_data_holder(unsigned max_count):
            data((DataType*)malloc(max_count*sizeof(DataType))),
            initialized_count(0),
            error_code(0)
        {
            if(!data)
            {
                error_code=out_of_memory;
            }
        }
        int get_error() const
        {
            return error_code;
        }
        int append_copy(DataType const& value)
        {
            new (data+initialized_count) DataType(value);
            if(data[initialized_count].get_error())
            {
                int const error=data[initialized_count].get_error();
                data[initialized_count].~DataType();
                return error;
            }
            ++initialized_count;
            return 0;
        }
        void swap(heap_data_holder& other)
        {
            std::swap(data,other.data);
            std::swap(initialized_count,other.initialized_count);
        }
        unsigned get_count() const
        {
            return initialized_count;
        }
        ~heap_data_holder()
        {
            for(unsigned i=0;i<initialized_count;++i)
            {
                data[i].~DataType();
            }
            free(data);
        }
        DataType& operator[](unsigned index)
        {
            return data[index];
        }
        
    };

    heap_data_holder data;

    // no copying for now
    dynamic_array& operator=(dynamic_array& other);
    dynamic_array(dynamic_array& other);
public:
    dynamic_array()
    {}
    int add_element(DataType const& new_value)
    {
        heap_data_holder new_data(data.get_count()+1);
        if(new_data.get_error())
            return new_data.get_error();
        for(unsigned i=0;i<data.get_count();++i)
        {
            int const error=new_data.append_copy(data[i]);
            if(error)
                return error;
        }
        int const error=new_data.append_copy(new_value);
        if(error)
            return error;
        new_data.swap(data);
        return 0;
    }
};

It's not too dissimilar, but there's a lot of checks for error codes: add_element has gone from 10 lines to 17, which is almost double, and there's also additional checks in the heap_data_holder class. In my experience, this is typical: if you have to explicitly write error checks at every failure point rather than use exceptions, your code can get quite a lot larger for no gain. Also, the constructor of heap_data_holder can no longer report failure directly: it must store the error code for later retrieval. To my eyes, the exception-based version is a whole lot clearer and more elegant, as well as being shorter: a net gain over the error-code version.

Conclusion

I guess it's a matter of taste, but I find code that uses exceptions is shorter, clearer, and actually has fewer bugs than code that uses error codes. Yes, you have to think about the consequences of an exception, and at which points in the code an exception can be thrown, but you have to do that anyway with error codes, and it's easy to write simple resource management classes to ensure everything is taken care of.

Posted by Anthony Williams
[/ design /] permanent link
Tags: , ,

| Stumble It! stumbleupon logo | Submit to Reddit reddit logo | Submit to DZone dzone logo

Comment on this post

If you liked this post, why not subscribe to the RSS feed RSS feed or Follow me on Twitter? You can also subscribe to this blog by email using the form on the right.

1 Comment

Great piece of information! In error-handling, side-effects and safety just donít like each other, yet it is important to write an error free code. I found an article which was really helpful for me in improving my knowledge on error handling - http://mortoray.com/2013/12/05/is-exception-safe-code-truly-possible/

by Rick at 10:23:04 on Thursday, 23 January 2014

Add your comment

Your name:

Your URL:

Email address:

Person or spambot?

Your comment: