How far can you take C++ metaprogramming

To my great surprise, I recently discovered that C++ template metaprogramming has become insanely powerful: powerful enough that you can fairly easily re-implement most of the functionality of Python's attrs library. That is, you can define types that:

  1. Allow iteration over members;
  2. Allow iteration over values;
  3. Allow getting and setting values by name;
  4. Allow you to attach static metadata to structure members.

In this blog post series I'll show you how to implement a general-purpose library that converts C++ objects to (and from) JSON. I don't claim that the code is fully idiomatic C++ template magic, and it handles only a minimal subset of types (though it could easily be extended to others).

As an example, I'll show the serialization of a simple structure representing orders in a shop.

The C++ implementation relies heavily on Boost libraries. To create iterable structures I will use the Boost.Hana metaprogramming library.

Data objects

Here are the DTO objects:

#include <cstdint>
#include <string>
#include <vector>

#include <boost/hana.hpp>

namespace dto {

struct OrderedItem {
    BOOST_HANA_DEFINE_STRUCT(
            OrderedItem,
            (std::string, item_name),
            (int64_t, quantity),
            (int64_t, price_cents)
    );
};

struct Order {
    BOOST_HANA_DEFINE_STRUCT(
            Order,
            (std::string, order_no),
            (std::vector<OrderedItem>, items)
    );
};
}

The BOOST_HANA_DEFINE_STRUCT macro defines the members of a structure and also enables compile-time iteration over its members.

For example, OrderedItem is a structure that looks like this:

struct OrderedItem{
    std::string item_name;
    int64_t quantity;
    int64_t price_cents;

    // + extra hana magic that enables introspection;
};

Iterating over hana structs

To iterate over Hana structs you can use the hana::for_each function template, which takes a callback that is executed for each property. Let's start with something simple: printing the name and value of each property in a structure.

First, let's define the callback functor object:

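#include <ostream>

namespace hana = boost::hana;

struct PrintProperty{
    std::ostream& out;

    // HanaString is a distinct compile-time string type for every member name.
    template<typename HanaString, typename T>
    void operator()(HanaString name, const T& t) const {
        out << hana::to<const char *>(name) << " " << t << "\n";
    }
};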

I decided to use a functor object here rather than a lambda to underline a peculiar fact about the Hana library: each property name has its own type.

Now let's call it:

template<typename T>
void PrintHanaStruct(const T &t, std::ostream& out){
    // for_each visits (name, value) pairs; fuse unpacks each pair into two arguments.
    hana::for_each(t, hana::fuse(PrintProperty{out}));
}

Now let's define a function that converts a Hana struct to JSON.

Formatting code

To format JSON I will use the Boost.JSON library, so the generic code shown here just converts the DTO objects defined above into boost::json::value objects, which can then be easily serialized.

Let's walk through the code. The main (and only public) function is FormatObject:

template<typename T>
boost::json::value FormatObject(const T &t) {
    if constexpr (boost::hana::Struct<T>::value) {
        return internal::FormatStructure(t);
    } else {
        return internal::FormatValue(t);
    }
}

This function:

  1. checks whether T is a Hana structure and, if so, formats it using FormatStructure;
  2. otherwise uses the generic FormatValue function.

The if constexpr statement (introduced in C++17) is somewhat tricky: its condition must be evaluated at compile time, and inside a template the compiler instantiates only the branch that is actually taken. This matters here because FormatValue won't compile for Hana structs, and FormatStructure won't compile for types that are not Hana structs.
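The effect is easier to see on a small standalone example that uses only the standard library. In this sketch (the DescribeIfConstexpr helper is made up for illustration), each branch would fail to compile for the other kind of type, yet the template works for both because the untaken branch is discarded:

```cpp
#include <string>
#include <type_traits>

// Hypothetical helper used only to demonstrate if constexpr.
template<typename T>
std::string DescribeIfConstexpr(const T &value) {
    if constexpr (std::is_arithmetic<T>::value) {
        // std::to_string has no overload for std::string, so this branch
        // would be ill-formed if it were instantiated for T = std::string.
        return std::to_string(value);
    } else {
        // Arithmetic types have no size() member, so this branch
        // would be ill-formed if it were instantiated for T = int.
        return "text of length " + std::to_string(value.size());
    }
}
```

With a plain if instead of if constexpr, both branches would be instantiated for every T and neither call would compile.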

Note

You can achieve similar behaviour using std::enable_if from C++11; however, the if constexpr approach seems much more readable to me here.
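For comparison, here is a sketch of the std::enable_if style on a toy function that uses only the standard library. The DescribeSfinae name is made up, and std::enable_if_t is the C++14 shorthand for typename std::enable_if<...>::type. Each overload is removed from overload resolution when its condition is false, so exactly one candidate remains for any given T:

```cpp
#include <string>
#include <type_traits>

// Selected only for arithmetic types.
template<typename T, std::enable_if_t<std::is_arithmetic<T>::value, bool> = true>
std::string DescribeSfinae(const T &value) {
    return std::to_string(value);
}

// Selected only for non-arithmetic types that have a size() member.
template<typename T, std::enable_if_t<!std::is_arithmetic<T>::value, bool> = true>
std::string DescribeSfinae(const T &value) {
    return "text of length " + std::to_string(value.size());
}
```

The two conditions have to be kept mutually exclusive by hand, which is the main reason this style gets harder to read as the number of cases grows.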

The FormatStructure function is defined as follows:

template<typename T>
boost::json::value FormatStructure(const T &t) {
    boost::json::object result;
    boost::hana::for_each(t, boost::hana::fuse([&result](auto name, auto member) {
        result.emplace(boost::hana::to<const char *>(name), FormatObject(member));
    }));
    return result;
}

It simply formats each property of the structure and stores the result under the property's name.

FormatValue is overloaded for arithmetic types, strings, and vectors of objects:

template<typename T, typename std::enable_if_t<std::is_arithmetic<T>::value, bool> = true>
boost::json::value FormatValue(const T &t) {
    return t;
}

template<typename T>
boost::json::value FormatValue(const std::vector<T> &t) {
    boost::json::array result;
    for (const T &value : t) {  // iterate by reference to avoid copying each element
        result.emplace_back(FormatObject<T>(value));
    }
    return result;
}

template<typename T>
boost::json::value FormatValue(const std::basic_string<T> &t) {
    return boost::json::string(t);
}

Tests

BOOST_AUTO_TEST_CASE(TestJson)
{
    dto::Order order = {
            .order_no = "2020/12/13/123",
            .items = {
                    {.item_name="Lego Set 2134", .quantity=1, .price_cents = 100 * 100},
                    {.item_name="Lego Set 321", .quantity=3, .price_cents = 15 * 100}
            }
    };

    json::value actual = format::FormatObject(order);
    BOOST_TEST_REQUIRE((actual.kind() == json::kind::object));
    json::object expected, item1, item2;

    item1["item_name"] = "Lego Set 2134";
    item1["quantity"] = 1;
    item1["price_cents"] = 100 * 100;


    item2["item_name"] = "Lego Set 321";
    item2["quantity"] = 3;
    item2["price_cents"] = 15 * 100;

    expected["order_no"] =  "2020/12/13/123";
    expected["items"] = {item1, item2};

    BOOST_TEST(actual.as_object() == expected);
}

JSON parsing

To handle object parsing I need to place some extra requirements on the parsed structures, namely: data objects can be default-constructed using a zero-argument constructor, and such objects are in a well-defined state.
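One way to make this requirement explicit is a compile-time check. The sketch below uses only the standard library, and both the Item struct and the MakeEmpty helper are hypothetical. It also illustrates a detail worth knowing: for a struct without default member initializers, writing Item item; leaves integer members uninitialized, while Item{} zero-initializes them.

```cpp
#include <cstdint>
#include <string>
#include <type_traits>

// Hypothetical plain struct standing in for a DTO.
struct Item {
    std::string item_name;
    std::int64_t quantity;
};

// Hypothetical helper that creates an empty object to parse into.
template<typename T>
T MakeEmpty() {
    static_assert(std::is_default_constructible<T>::value,
                  "parsed types must be default constructible");
    return T{};  // value-initialization: quantity is 0, not indeterminate
}
```

A parser could call such a helper (or simply use T{}) whenever it needs a fresh target object in a well-defined state.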

Note

This code aims to be concise and clear, not necessarily safe to run on untrusted input.

This is not production-ready code.

Note

I decided to make this parser a non-validating one, that is: it parses as much of the object as possible and ignores errors. I think that error handling and validation should be done in another layer of the application.

The code is slightly more complex as well. Let's start with the public API:

template<typename T>
bool ParseObject(const json::value &value, T &result) {
    if constexpr (boost::hana::Struct<T>::value) {
        return internal::ParseStructure(value, result);
    } else {
        return internal::ParseValue(value, result);
    }
}

All functions in the parsing module use the same convention:

  1. They return whether the value was parsed successfully (currently the callers ignore this result).
  2. The value parameter is the JSON value to be parsed.
  3. The result parameter (called output in the helpers) is the object that the value will be parsed into.

As before, the main function just does the introspection: it checks whether the object is a Hana structure and calls the appropriate helper.

ParseStructure looks like this:

template<typename T>
bool ParseStructure(const json::value &value, T &output) {
    if (!value.is_object()) {
        return false;
    }
    const json::object &object = value.as_object();

    boost::hana::for_each(boost::hana::keys(output), [&output, &object](auto key) {
        const auto it = object.find(boost::hana::to<const char *>(key));
        if (it != object.end()) {
            ParseObject(it->value(), boost::hana::at_key(output, key));
        }
    });
    return true;
}

First, check whether the json::value represents a JSON object, and if not, abort early. Then, for each field in the structure:

  1. Find the matching property in the JSON object;
  2. If it is present, recursively call ParseObject with the property's value and the corresponding field of the structure.

Now let's see how ParseValue works for std::vector fields:

template<typename T>
bool ParseValue(const json::value &value, std::vector<T> &output) {
    if (!value.is_array()) {
        return false;
    }
    output.clear();
    for (const json::value &item : value.as_array()) {
        output.emplace_back();
        ParseObject(item, output.back());
    }
    return true;
}

This function:

  1. Aborts early if the JSON value is not an array;
  2. Clears the vector, then for each entry in the array creates a new, default-constructed object at the end of the vector;
  3. Parses the JSON entry into the newly created object.

Here is ParseValue implemented for strings:

template<typename T>
bool ParseValue(const json::value &value, std::basic_string<T> &output) {
    if (!value.is_string()) {
        return false;
    }
    output = json::value_to<std::string>(value);
    return true;
}

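The above template handles strings. Boost.JSON stores string contents as UTF-8, and the bytes are copied into the output unchanged, but I'm not sure whether that is enough to handle Unicode well.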

And last but not least: ParseValue for arithmetic types:

template<typename T, typename std::enable_if_t<std::is_arithmetic<T>::value, bool> = true>
bool ParseValue(const json::value &value, T &output) {
    if (!value.is_number()) {
        return false;
    }
    try {
        // value_to throws if the number cannot be represented exactly as T.
        output = json::value_to<T>(value);
    } catch (const boost::system::system_error &) {
        return false;
    }
    return true;
}