Strongly typed aliases

I’ve been programming in C++ for over 20 years and I’ve used the language in all kinds of problem domains; it’s not perfect by any means, but it is thriving and evolving thanks to the efforts of many great people who invest a lot of their time and energy.

One of the biggest problems for me is the type-unsafe nature of typedef: rather than defining a new type, it defines a transparent type alias. The name is just a programmer convenience and carries no semantic meaning to the compiler; as far as the compiler is concerned the alias is indistinguishable from the underlying type. As a result, using type aliases sacrifices the strong type safety that C++ affords us, often with disastrous consequences:

#include <iostream>

using RBC_COUNT = unsigned int;
using WBC_COUNT = unsigned int;

// Defined elsewhere: returns true if the white blood cell count is normal
bool normalWBC(WBC_COUNT wbc);

int main(int, char**)
{
    RBC_COUNT rbc;
    WBC_COUNT wbc;

    std::cout << "What is the patient's Red Blood Cell Count? ";
    std::cin >> rbc;

    std::cout << "What is the patient's White Blood Cell Count? ";
    std::cin >> wbc;

    //... some other code

    if (!normalWBC(rbc))
        std::cout << "Admit patient to the hospital immediately!\n";
    else
        std::cout << "Prescribe two salt tablets and send patient home!\n";
    
    return 0;
}

Here the programmer unintentionally passed the variable representing the count of red blood cells to a function expecting the count of white blood cells. This simple mistake could cost a patient her life. If the types RBC_COUNT and WBC_COUNT were distinct, the compiler would have detected the type mismatch at compile time and flagged an error.
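The transparency is easy to check directly. Here is a minimal sketch that reuses the two aliases from the example; the body of normalWBC and its range are placeholders I made up for illustration, not anything from a real system:

```cpp
#include <type_traits>

using RBC_COUNT = unsigned int;
using WBC_COUNT = unsigned int;

// Both aliases name the very same type, so the compiler cannot tell them apart.
static_assert(std::is_same<RBC_COUNT, WBC_COUNT>::value,
              "type aliases are transparent");

// Hypothetical stand-in for the check used above; the range is a
// placeholder chosen for illustration, not medical guidance.
constexpr bool normalWBC(WBC_COUNT wbc)
{
    return wbc >= 4000 && wbc <= 11000;
}

// A red blood cell count is accepted without any diagnostic.
constexpr RBC_COUNT rbc = 5000;
static_assert(normalWBC(rbc), "compiles even though the wrong count was passed");
```

Everything here compiles cleanly, which is exactly the problem.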

While this example is contrived, it does illustrate a point: the absence of a “strong” type alias is a very real problem with practical consequences.

Haven’t I heard this before?

If you’re a C++ programmer, then you probably have. I’m not the first person to bring this issue up. I’m not even the first person to blog about it. This is, in the words of Dr. Walter E. Brown, a “feature oft requested for C++”. His excellent treatise on this subject, P0109R0, includes not only references to historical discussions about the subject but several examples.

So it’s solved, right?

Not really. Dr. Brown’s proposal was never accepted into the core language, which means the language is still lacking strongly typed aliases despite an obvious need for such a feature, as evidenced by the repeated requests for it by programmers and the various proposals on this topic.

To be sure, it’s possible to implement a solution using tools and techniques available today, although the solutions are suboptimal. P0109R0 lists several attempts and I’m sure there are others. Typically, all solutions rely either on creating a small class or struct that wraps the common type, or on templatizing the common type and using a tag template argument.

The “common base” solution

The small class solution can be implemented in one of two ways: derivation or encapsulation. Both cases have unique advantages and disadvantages.

The derivation solution

While neat and relatively simple, the derivation solution only works when you are wrapping a class or struct, not native types like int or std::uint64_t. Here we’d have:

// A class that provides an unsigned integer with 128 bits of precision
class uint128_t
{ 
private:
    ...

protected:
    ...

public:
    ...
};

// A class that represents a UUID
class uuid_t
    : public uint128_t
{
    ... typically empty (or just boilerplate)
};

// A class that represents an MD5 digest
class md5digest_t
    : public uint128_t
{
    ... unless empty, full of the same boilerplate
};

With this solution, the risk of slicing rears its ugly head unless the derived classes are empty.
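To make the slicing risk concrete, here is a small sketch. The two-word layout of uint128_t and the version member are assumptions of mine, added only so that there is something to slice off; the namespace is there just to keep the names from clashing with system headers:

```cpp
#include <cstdint>

namespace sketch {

// Placeholder for the 128-bit integer class; the layout is assumed.
struct uint128_t
{
    std::uint64_t hi = 0;
    std::uint64_t lo = 0;
};

// Hypothetical derived "strong alias" that adds a data member.
struct uuid_t : public uint128_t
{
    int version = 4;
};

// Takes the base by value, so only the uint128_t subobject is copied.
inline std::uint64_t low_bits(uint128_t v)
{
    return v.lo;
}

inline bool demo_slicing()
{
    uuid_t id;
    id.lo = 42;
    id.version = 1;

    uint128_t sliced = id;   // compiles silently; `version` is dropped
    return sliced.lo == 42
        && low_bits(id) == 42
        && sizeof(sliced) < sizeof(id);
}

} // namespace sketch
```

Neither the copy nor the by-value call produces a diagnostic, and the derived type still converts implicitly to its base, so the “strong” alias is only strong in one direction.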

The encapsulation solution

Encapsulation works with any type, but it has a significant disadvantage: the wrapper type must implement a forwarding function for every interface of the wrapped type that needs to be exposed. This can be a maintenance nightmare: a single change in the common type may have to be replicated across every wrapper.

class uint128_t
{ 
private:
    ...

protected:
    ...

public:
    void foo()
    {
        ...
    }
    
    void bar()
    {
        ...
    }
};

class uuid_t
{
private:
    uint128_t v_;
    
public:
    ... boilerplate
    
    void foo()
    {
        v_.foo();
    }

    // oops, forgot bar(). Is that an accident? Or was it on purpose?
};

class md5digest_t
{
private:
    uint128_t v_;
    
public:
    ... boilerplate
    
    void foo()
    {
        v_.foo();
    }
    
    void bar()
    {
        v_.bar();
    }
}; 

The tag-based solution

The second solution, turning the common type into a template, can alleviate some of these issues. It is relatively simple to implement and doesn’t require that one be a template metaprogramming expert (but it doesn’t hurt!). But a tag-based solution isn’t without faults. The types have no common base, so they are truly non-interoperable without custom logic in the template. And even with special logic, copying may be unavoidable. Again, not ideal.

Code that can operate on an instance regardless of the tag must itself be templated, potentially leading to code bloat and increased compile times as the compiler has to process, generate, compile and link more code. By way of example, imagine uint128_t<T>, where T is the tag, with something like using uuid_t = uint128_t<uuid_tag> and using md5digest_t = uint128_t<md5digest_tag>. A to_string function for uint128_t could look like this:

template <typename Tag>
std::string to_string(uint128_t<Tag> const& t)
{
    ... code
}
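Filling in the blanks, a compilable version of the tag-based approach might look like the sketch below. The two-word layout and the hexadecimal formatting are my own assumptions; only the tag and alias names come from the text above:

```cpp
#include <cstdint>
#include <initializer_list>
#include <string>
#include <type_traits>

namespace sketch {

// Placeholder 128-bit value parameterized on a tag; the layout is assumed.
template <typename Tag>
struct uint128_t
{
    std::uint64_t hi = 0;
    std::uint64_t lo = 0;
};

// Empty tag types exist only to make the instantiations distinct.
struct uuid_tag {};
struct md5digest_tag {};

using uuid_t      = uint128_t<uuid_tag>;
using md5digest_t = uint128_t<md5digest_tag>;

static_assert(!std::is_same<uuid_t, md5digest_t>::value, "distinct types");
static_assert(!std::is_convertible<md5digest_t, uuid_t>::value,
              "no implicit conversion between tags");

// One template serves every tag, but it is instantiated once per tag.
template <typename Tag>
std::string to_string(uint128_t<Tag> const& t)
{
    static char const digits[] = "0123456789abcdef";
    std::string out;
    for (std::uint64_t word : {t.hi, t.lo})
        for (int shift = 60; shift >= 0; shift -= 4)
            out += digits[(word >> shift) & 0xF];
    return out;
}

} // namespace sketch
```

The static_asserts show the upside (the two types really are distinct), while to_string shows the cost: anything that wants to treat both as “just a 128-bit number” has to be a template too.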

What we need

It’s simple: we need strongly-typed aliases. There’s precedent for adding such a thing: C++11 added enum class and elevated enum types into first-class citizens in the type system allowing the compiler to prevent accidental intermixing of types.
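As a quick reminder of what enum class bought us, here is a small sketch; the enumeration names are made up for illustration:

```cpp
#include <type_traits>

// Unscoped enumerations convert implicitly to int, so values from
// unrelated enumerations can be mixed by accident.
enum old_color { old_red, old_green };

// Scoped enumerations (C++11) do not convert implicitly.
enum class color { red, green };
enum class fruit { apple, orange };

static_assert(std::is_convertible<old_color, int>::value,
              "an unscoped enum converts to int");
static_assert(!std::is_convertible<color, int>::value,
              "a scoped enum does not");
static_assert(!std::is_convertible<color, fruit>::value,
              "scoped enums do not intermix");

// An explicit conversion is still available when it is really wanted.
constexpr int as_int(color c)
{
    return static_cast<int>(c);
}
static_assert(as_int(color::green) == 1, "explicit cast still works");
```

The same trade is what I’m after for aliases: no implicit mixing, with an explicit escape hatch.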

P0109R0 outlined one possible solution, although I find it overly flexible and/or ambitious. Certainly if we’re going to add a feature to the language, we should try to make it as flexible as possible but I don’t know that we need that much flexibility for this, especially when something much simpler will get us many, perhaps even most, of the benefits.

Extending the alias declaration syntax along the lines of enum class, a strongly typed alias could be simple and intuitive:

using class uuid_t = uint128_t;
using class md5digest_t = uint128_t;

This syntax adds no new keywords to the language (it reuses the class keyword in a context where it was previously not valid) and will not break or change the semantics of any currently valid C++ code.

Toe-may-toe / Toh-mah-toh

Even though uuid_t and md5digest_t are distinct types from the type system’s perspective, underneath their fancy exterior they are, ultimately, just a uint128_t. This means that we can allow the programmer to “bend” the type system when necessary, efficiently avoiding copies when there’s a need to convert a strongly typed alias to its underlying type. Since the types are identical under the hood, the alias can be converted to the underlying type using an existing tool that’s perfectly suited for this purpose: static_cast.

std::string to_string(uint128_t const& t);
md5digest_t x = md5("Hello, World");
std::cout << to_string(static_cast<uint128_t const&>(x));
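The using class syntax doesn’t exist, so the snippet above won’t compile today. The cast behaviour can be roughly approximated with a hand-written wrapper and an explicit conversion operator, though. This is only a sketch: the wrapper, the uint128_t layout and the low_word helper are all assumptions of mine, and it brings back the very boilerplate the proposal is meant to remove:

```cpp
#include <cstdint>
#include <type_traits>

namespace sketch {

// Placeholder for the 128-bit integer class; the layout is assumed.
struct uint128_t
{
    std::uint64_t hi = 0;
    std::uint64_t lo = 0;
};

// Hypothetical hand-written stand-in for
// `using class md5digest_t = uint128_t;`
class md5digest_t
{
public:
    md5digest_t() = default;
    explicit md5digest_t(uint128_t const& v) : v_(v) {}

    // Explicit, so only a cast reaches the underlying value.
    // Returns a reference, so the cast itself makes no copy.
    explicit operator uint128_t const&() const { return v_; }

private:
    uint128_t v_;
};

static_assert(!std::is_convertible<md5digest_t, uint128_t>::value,
              "no implicit conversion to the underlying type");

// Hypothetical function that only knows about the underlying type.
inline std::uint64_t low_word(uint128_t const& t)
{
    return t.lo;
}

inline std::uint64_t demo()
{
    uint128_t raw;
    raw.lo = 7;
    md5digest_t x(raw);
    return low_word(static_cast<uint128_t const&>(x));
}

} // namespace sketch
```

Passing x to low_word without the cast is rejected, while the static_cast goes through the explicit operator and binds to the stored value.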

Additionally, it’s possible for programmers to explicitly “break” the type system and move “sideways” in the type hierarchy, when such a thing is needed, with an appropriately complex (and thus discouraging) syntax:

bool have_uuid(uuid_t const& t);

bool test1(md5digest_t const& x)
{
    // Zero runtime overhead and type safety! Can't
    // cast a `uuid_t const&` to a `uint256_t const&`
    return have_uuid(
        static_cast<uuid_t const&>(
            static_cast<uint128_t const&>(x)));
}

bool test2(md5digest_t const& x)
{
    return have_uuid(reinterpret_cast<uuid_t const&>(x));
}

Making templated strongly typed aliases is easy too and leverages the same syntax. Consider, for example:

template< 
    class CharT, 
    class Traits = std::char_traits<CharT> 
>
using class c_format_string = std::basic_string<CharT, Traits>;

template< 
    class CharT, 
    class Traits = std::char_traits<CharT> 
>
using class sql_format_string = std::basic_string<CharT, Traits>;

The programmer could then define functions that “sanity check” a format string with protocol-specific rules and the compiler’s usual overload resolution would discover the correct call:

template <class CharT, class Traits>
bool check_format_string(sql_format_string<CharT, Traits> const& fmt)
{
    return false; // SQL format strings are always unsafe. Ask Little Bobby Tables.
}

template <class CharT, class Traits>
bool check_format_string(c_format_string<CharT, Traits> const& fmt)
{
    return false; // C format strings are no better!
}

How do we do it?

One of the best features of C++ is type safety. Strong type aliases would go a long way toward helping programmers be more disciplined and communicate semantics to the compiler, and toward allowing compilers to catch type errors at compile time by understanding the intended semantics.

While this is probably not the best way to get a paper before the C++ committee, putting it out there is a good start toward writing that paper. Oh, and it just feels good to blog again, because writer’s block was driving me crazy.