Repeatable builds in Python

Repeatable builds in Python are ... challenging. "Repeatable build" can mean many things; usually it means that the output of your build does not change over time. For example, if you don't pin all dependencies, you might end up shipping a new version of a library as soon as it is released.

If you don't have repeatable builds, you might wake up one day to find that every branch in your repository fails, and that you need to manually apply the same patch to each of them.

Repeatable builds are a hard problem in general. The gold standard is Java with Maven: you can usually expect to come back to a Maven project after a year or two and have it build.

In Python it's not that easy, because:

  • Dependencies usually don't pin the versions of their own dependencies;
  • Your build script, setup.py, is written in Python and might become incompatible with a future version of Python and/or pip.

Techniques for repeatable builds

  • When you want a repeatable build, use frozen requirements.
  • Keep your setup.py super simple, without any fancy nonstandard code.
    • Don't import modules from pip.
    • It's better to write install_requires by hand than to generate it from requirements.

Why you need frozen requirements

Frozen requirements contain fully specified versions for all packages in your virtualenv: not only your dependencies, but also the dependencies of your dependencies.
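To make "fully specified" concrete, here is a minimal sketch of a check that flags any requirement line not pinned with ==. The function name unpinned_lines and the sample file contents are made up for illustration, and the regular expression is a simplification: it ignores extras, environment markers, URLs and pip options such as -r or -e.

```python
import re

# "name==version", optionally followed by a trailing comment such as "# via flask".
# Wildcards like "==1.2.*" are deliberately rejected: they are not exact pins.
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*==[^\s#;*]+\s*(#.*)?$")


def unpinned_lines(requirements_text):
    """Return the requirement lines that are not pinned to one exact version."""
    problems = []
    for raw in requirements_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment-only lines
        if not PINNED.match(line):
            problems.append(line)
    return problems


# Hypothetical file contents, for illustration only.
frozen = """
click==6.7                # via flask
flask==0.12.2
"""
loose = """
Flask
requests>=2.0
werkzeug==0.12.2
"""

print(unpinned_lines(frozen))
print(unpinned_lines(loose))
```

A file counts as frozen in the sense used here only if the check reports nothing, and only if it also lists every transitive dependency, which this sketch cannot verify on its own.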

To get repeatable builds you'll need frozen requirements. Let's consider the following scenario:

  • You have pinned versions of all your immediate dependencies.
  • Your dependency A==1.2.12 depends on package B>1.2.
  • The publisher of B publishes a new version that breaks compatibility.

Your next build installs the new B, even though none of your own pins changed, and it breaks.
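The scenario above can be played out in a few lines. This is a toy model, not how pip actually resolves dependencies: the helper names parse and resolve are made up, the version numbers of B are invented, and the comparison is plain tuple ordering rather than real PEP 440 version handling.

```python
def parse(version):
    """Turn '1.2.12' into (1, 2, 12) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def resolve(available, minimum_exclusive=None, pinned=None):
    """Pick a version like a naive installer: the pin if given, else the newest allowed."""
    if pinned is not None:
        return pinned
    candidates = [v for v in available if parse(v) > parse(minimum_exclusive)]
    return max(candidates, key=parse)


# Hypothetical releases of B, before and after the breaking 2.0 release.
b_last_year = ["1.2", "1.3"]
b_today = ["1.2", "1.3", "2.0"]

# A only says "B>1.2", so the result depends on the day you build.
print(resolve(b_last_year, minimum_exclusive="1.2"))
print(resolve(b_today, minimum_exclusive="1.2"))

# With B frozen in your requirements, the result no longer depends on the date.
print(resolve(b_today, pinned="1.3"))
```

The only input that changed between the first two calls is the list of published versions, which is exactly the thing you don't control.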

How to create frozen requirements with pip-tools

For a long time I used pip-tools, which introduces the following workflow:

  • You create requirements.in files that, for the most part, contain just dependency names.
  • Then pip-tools generates requirements.txt files that contain fully pinned versions for all packages.

For example, your requirements.in may contain:

Flask

Then you run pip-compile requirements.in, which generates the following requirements.txt:

#
# This file is autogenerated by pip-compile
# To update, run:
#
#    pip-compile --output-file requirements.txt requirements.in
#
click==6.7                # via flask
flask==0.12.2
itsdangerous==0.24        # via flask
jinja2==2.9.6             # via flask
markupsafe==1.0           # via jinja2
werkzeug==0.12.2          # via flask

Why choose pip-tools:

  • It does not need to be installed at build time (especially nice in Docker)
  • I have used it for about two years and have not run into any problems with it.

How to create frozen requirements with pipenv

Pipenv uses a fancier format and lets you simply call pipenv install Flask, which creates both a Pipfile (which contains unpinned dependencies, like requirements.in) and a Pipfile.lock (which specifies versions of everything).
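Pipfile.lock is plain JSON, so it is easy to inspect what got locked. Below is a rough sketch that reads the pinned versions out of the "default" section. SAMPLE_LOCK is a heavily trimmed, hand-written stand-in (a real lock file also carries hashes and more metadata), and locked_versions is a name I made up for this example.

```python
import json

# Trimmed, hand-written stand-in for a real Pipfile.lock (assumption: real files
# also contain hashes and a larger "_meta" section).
SAMPLE_LOCK = """
{
  "_meta": {"pipfile-spec": 6},
  "default": {
    "flask": {"version": "==0.12.2"},
    "werkzeug": {"version": "==0.12.2"}
  },
  "develop": {}
}
"""


def locked_versions(lock_text, section="default"):
    """Map package name to its locked version for one section of a Pipfile.lock."""
    lock = json.loads(lock_text)
    return {
        name: entry["version"].lstrip("=")
        for name, entry in lock.get(section, {}).items()
        if "version" in entry  # entries without a version key are skipped
    }


print(locked_versions(SAMPLE_LOCK))
```

In a real project you would pass the contents of the Pipfile.lock that pipenv wrote instead of the sample string.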

Why choose pipenv:

  • Very nice UX
  • Manages virtualenvs (if you want it)

How to create setup.py script for repeatable builds

  • Keep it simple. What is not there can't break;
  • Don't import things from the pip package (despite what various SO posts tell you).

To quote from the pip manual:

> As noted previously, pip is a command line program.
> While it is implemented in Python, and so is available from your Python
> code via import pip, you must not use pip's internal APIs in this way.
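The "don't import pip" rule is easy to check mechanically. The sketch below only parses a setup.py with the standard ast module and reports the lines that import pip; it never executes the file. The helper name pip_imports and both sample scripts are illustrative: SIMPLE_SETUP shows the hand-written install_requires style recommended here, and FRAGILE_SETUP mimics the kind of snippet that gets copied around and that relied on pip internals from older pip releases.

```python
import ast


def pip_imports(source):
    """Return the line numbers of statements that import pip or its submodules."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0:
            names = [node.module or ""]  # absolute "from x import y" only
        else:
            continue
        if any(name == "pip" or name.startswith("pip.") for name in names):
            hits.append(node.lineno)
    return sorted(hits)


# Illustrative setup.py: static metadata, install_requires written by hand.
SIMPLE_SETUP = '''
from setuptools import setup

setup(
    name="example-app",
    version="1.0.0",
    py_modules=["example_app"],
    install_requires=["Flask"],
)
'''

# Illustrative anti-pattern: generating install_requires through pip internals.
FRAGILE_SETUP = '''
from pip.req import parse_requirements
from setuptools import setup

reqs = [str(r.req) for r in parse_requirements("requirements.txt", session=False)]
setup(name="example-app", version="1.0.0", install_requires=reqs)
'''

print(pip_imports(SIMPLE_SETUP))
print(pip_imports(FRAGILE_SETUP))
```

A check like this could run in CI, but it is only a sketch: it will not catch pip being imported dynamically, for example through importlib.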