Converting Python attrs types to JSON

In this post I will show you some very simple code that converts arbitrary attrs objects to and from JSON (you can easily adapt it to handle dataclass objects).

Note

My ultimate goal is to show you how to do the same thing in C++, but I need a reference implementation first.

Note

While this code should be reasonably correct, it is nowhere near polished enough to handle untrusted data. This is not production-ready code.

Example dataclasses

Here are the dataclasses I will serialize; the code should work for other attrs types as well.

import typing
import attr


@attr.dataclass(eq=True, hash=True)
class OrderItem:
    item_name: str
    quantity: int
    price_cents: int


@attr.dataclass(eq=True, hash=True)
class Order:
    order_no: str
    items: typing.List[OrderItem]

If you are not familiar with the attrs or dataclasses modules, here is a crash course:

The @attr.dataclass() decorator marks a type as an attrs dataclass. Inside the class you define the fields the generated class will have. For each field you also declare its type (these types are not checked at runtime, but can be checked statically by mypy).

I have also used a typing annotation in the above: typing.List[OrderItem] means that items is a list of OrderItem instances.

By default attrs generates a constructor and a __repr__ method for you; I have also asked it to generate the == operator (eq=True) and a __hash__ method (hash=True).
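To make the crash course concrete, here is a small sketch of what the generated methods give you. It redefines the OrderItem class from above so that it runs on its own; the sample values are made up for illustration.

```python
import attr


@attr.dataclass(eq=True, hash=True)
class OrderItem:
    item_name: str
    quantity: int
    price_cents: int


# Generated constructor: accepts positional or keyword arguments.
first = OrderItem("Lego Set 2134", quantity=1, price_cents=10000)
second = OrderItem("Lego Set 2134", quantity=1, price_cents=10000)

# Generated __repr__ lists every field with its value.
print(first)

# Generated == compares field by field, and equal objects hash equally.
same_value = first == second
same_hash = hash(first) == hash(second)

# Annotations are not enforced at runtime: this constructs without error,
# although a static checker such as mypy would flag it.
unchecked = OrderItem("Lego Set 321", quantity="three", price_cents=1500)
```

Here same_value and same_hash both end up True, and unchecked.quantity simply holds the string it was given.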

Formatting code

Now let's format the DTO classes defined above to JSON:

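import attr


def to_json(item) -> dict:
    """Convert an attrs instance to a dict of JSON-compatible values."""
    result = {}
    for field in attr.fields(type(item)):
        result[field.name] = _convert(getattr(item, field.name, None))
    return result


def _convert(value):
    # Lists and tuples: convert every element, whatever its type.
    if isinstance(value, (list, tuple)):
        return [_convert(elem) for elem in value]
    # Dicts: keep the keys, convert the values (items(), not values()).
    if isinstance(value, dict):
        return {key: _convert(elem) for key, elem in value.items()}
    # Nested attrs instances: convert field by field.
    if attr.has(type(value)):
        return to_json(value)
    # Everything else (str, int, None, ...) passes through unchanged.
    return value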

The code works as follows:

  1. Iterates over the item's fields;
  2. If a field value is a list or tuple, converts each element recursively;
  3. If a field value is a dict, converts each of its values recursively;
  4. If a field value is an attrs instance, converts it recursively;
  5. All other values pass through unchanged.
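The same steps carry over to standard-library dataclasses almost unchanged. The sketch below is an assumption about how such a port could look, not code from the attrs version: dataclass_to_json is a hypothetical name, and dataclasses.is_dataclass and dataclasses.fields stand in for attr.has and attr.fields.

```python
import dataclasses
import typing


@dataclasses.dataclass
class OrderItem:
    item_name: str
    quantity: int
    price_cents: int


@dataclasses.dataclass
class Order:
    order_no: str
    items: typing.List[OrderItem]


def dataclass_to_json(value):
    """Hypothetical stdlib counterpart of to_json (illustration only)."""
    # Dataclass instance (not the class itself): convert field by field.
    if dataclasses.is_dataclass(value) and not isinstance(value, type):
        return {
            field.name: dataclass_to_json(getattr(value, field.name))
            for field in dataclasses.fields(value)
        }
    # Lists and tuples: convert each element recursively.
    if isinstance(value, (list, tuple)):
        return [dataclass_to_json(elem) for elem in value]
    # Dicts: keep the keys, convert the values recursively.
    if isinstance(value, dict):
        return {key: dataclass_to_json(elem) for key, elem in value.items()}
    # Everything else passes through unchanged.
    return value


order = Order("2020/12/13/123", [OrderItem("Lego Set 321", 3, 1500)])
serialized = dataclass_to_json(order)
print(serialized)
```

The standard library also ships dataclasses.asdict, which performs a similar recursive conversion; the hand-written version is only useful when you want to control each step yourself.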

Tests

def test_serialization():
    order = Order(
        order_no="2020/12/13/123",
        items=[
            OrderItem("Lego Set 2134", quantity=1, price_cents=100 * 100),
            OrderItem("Lego Set 321", quantity=3, price_cents=15 * 100),
        ],
    )
    assert to_json(order) == {
        "items": [
            {"item_name": "Lego Set 2134", "price_cents": 10000, "quantity": 1},
            {"item_name": "Lego Set 321", "price_cents": 1500, "quantity": 3},
        ],
        "order_no": "2020/12/13/123",
    }

Parsing code

Note

This code aims to be concise and clear, not necessarily safe to run on untrusted input.

This is not production-ready code.

This code is somewhat more complex than the formatting code, so let's walk through it. First, the helper functions:

class JsonParseException(Exception):
    pass

def is_list(annotation):
    """
    Returns True if passed annotation is of typing.List
    """
    try:
        return annotation.__origin__ == list
    except AttributeError:
        return False


def get_list_type_param(annotation):
    """
    Extract typing parameter from typing annotation.

    >>> get_list_type_param(typing.List[str]) == str
    True
    """
    return annotation.__args__[0]

And now the conversion function:

def from_json(data: typing.Any, typing_annotation: typing.Any):
    if typing_annotation == int:
        return int(data)
    elif typing_annotation == float:
        return float(data)
    elif typing_annotation == str:
        return str(data)
    elif is_list(typing_annotation):
        return tuple(
            from_json(item, get_list_type_param(typing_annotation)) for item in data
        )
    elif attr.has(typing_annotation):
        converted = {}
        for field in attr.fields(typing_annotation):
            if field.name in data:
                converted[field.name] = from_json(data[field.name], field.type)
        return typing_annotation(**converted)
    else:
        raise JsonParseException(f"Can't handle {typing_annotation}")

This time we need to know up front what type we are parsing, so this function accepts two arguments: data, the data to be parsed, and typing_annotation, which represents the type to parse it into.

The function contains a large if statement that checks the type and then converts the parsed JSON data to the appropriate type.
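For comparison, here is a rough sketch of the same type-driven dispatch for standard-library dataclasses. It is an illustration under several assumptions: dataclass_from_json is a hypothetical name, it relies on typing.get_origin and typing.get_args (Python 3.8+), it assumes annotations are real types rather than strings, and unlike the attrs version it returns lists instead of tuples and raises a plain TypeError.

```python
import dataclasses
import typing


@dataclasses.dataclass
class OrderItem:
    item_name: str
    quantity: int
    price_cents: int


@dataclasses.dataclass
class Order:
    order_no: str
    items: typing.List[OrderItem]


def dataclass_from_json(data, annotation):
    """Hypothetical stdlib counterpart of from_json (illustration only)."""
    # Primitive types: call the type itself to convert the value.
    if annotation in (int, float, str):
        return annotation(data)
    # typing.List[X]: parse every element as X.
    if typing.get_origin(annotation) is list:
        (elem_type,) = typing.get_args(annotation)
        return [dataclass_from_json(elem, elem_type) for elem in data]
    # Dataclass type: parse each field that is present, then construct it.
    if isinstance(annotation, type) and dataclasses.is_dataclass(annotation):
        kwargs = {
            field.name: dataclass_from_json(data[field.name], field.type)
            for field in dataclasses.fields(annotation)
            if field.name in data
        }
        return annotation(**kwargs)
    raise TypeError(f"Can't handle {annotation}")


payload = {
    "order_no": "2020/12/13/123",
    # quantity arrives as a string here to show the int() conversion.
    "items": [{"item_name": "Lego Set 321", "quantity": "3", "price_cents": 1500}],
}
order = dataclass_from_json(payload, Order)
print(order)
```

Because the annotation drives every branch, the string "3" in the payload ends up as the integer 3 in the parsed OrderItem.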

The tricky part for me was interacting with the typing module, namely:

  • Checking if a typing annotation is a list;
  • Extracting arguments from annotations;

To do the above I have used a private API that works for Python 3.8 (I wouldn't be surprised if this is the only version my helpers work for).
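Since Python 3.8 the typing module also exposes typing.get_origin and typing.get_args as public functions, and they cover both tricky points. A minimal sketch of the two helpers built on them, keeping the same names, might look like this; treat it as a sketch rather than a tested drop-in replacement.

```python
import typing


def is_list(annotation) -> bool:
    """Return True for typing.List[...] (and list[...] on Python 3.9+)."""
    # get_origin returns the unsubscripted runtime type, or None for
    # annotations that are not generic aliases (e.g. plain int).
    return typing.get_origin(annotation) is list


def get_list_type_param(annotation):
    """Return the element type of a parametrized list annotation."""
    args = typing.get_args(annotation)
    if len(args) != 1:
        raise TypeError(f"{annotation} is not a parametrized list annotation")
    return args[0]


list_check = is_list(typing.List[int])
int_check = is_list(int)
elem_type = get_list_type_param(typing.List[str])
```

With these definitions list_check is True, int_check is False, and elem_type is str.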