Python shelve

Takeaways

Python has a super-simple, reasonably performant, on-disk key-value store, available in the shelve module.

To use it, just:

import shelve

shelve_dict = shelve.open("/myapplication/db-file.db")

After this call, shelve_dict is a Python mutable mapping (think of it as a dictionary).

Please remember to call shelve_dict.close() after use!
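
Alternatively, a shelf works as a context manager (the benchmark code below relies on this), so you can skip the explicit close. A minimal sketch, reusing the path from above:

import shelve

with shelve.open("/myapplication/db-file.db") as shelve_dict:
    shelve_dict["some-key"] = ["any", "picklable", "value"]
# The shelf is closed (and data flushed to disk) when the block exits.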

My benchmark

I needed to add an intermediate result cache to a long-running process. Each object I needed to cache has a unique identifier, so a key-value store was a perfect fit. I needed to cache about 50,000 items, each about 1 MB in size.

Note

This is a strictly single-threaded process running on a local machine, so using Memcached and/or Redis would be overkill --- not to mention that using 50 GB of RAM for this would not be cost-effective.
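
In other words, the shape of the problem is roughly this (a sketch with made-up names and paths, not the actual application code):

import os
import shelve

def expensive_computation(object_id):
    # Stand-in for the real long-running step; yields ~1 MB of data.
    return os.urandom(1024 * 1024)

def get_result(cache, object_id):
    # The object's unique identifier doubles as the cache key.
    if object_id not in cache:
        cache[object_id] = expensive_computation(object_id)
    return cache[object_id]

with shelve.open("/tmp/intermediate-cache") as cache:
    result = get_result(cache, "object-42")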

There is a nice shelve module in the Python standard library which implements a key-value store on top of either gdbm or ndbm (two on-disk hash table implementations).
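
Which backend you actually get depends on what is installed on the system; dbm.whichdb can tell you after the fact (the path here assumes the benchmark file created below):

import dbm

# Prints e.g. 'dbm.gnu' (gdbm), 'dbm.ndbm' or 'dbm.dumb', depending on
# which implementation created the file.
print(dbm.whichdb("/tmp/test-data"))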

I had a vague recollection of having performance problems with shelve, and I didn't find any recent benchmarks of this module (or benchmarks that fit my use case), so I decided to benchmark it myself. Moreover, while 50 GB of total database size is not much, it is not a trivial amount either, so I wanted to make sure I wouldn't run into performance problems in production.

Note

The aforementioned performance problems happened years ago, in a totally different application (memoization for some high-energy physics computation), and I might have misused the library.

Long story short: it is totally fast enough, with plenty to spare.

Benchmark

Machine and OS:

  • Recent Debian
  • Ryzen CPU
  • PCIe M.2 SSD
  • Encrypted (dm-crypt) btrfs
  • 32 GB of RAM

Machine setup:

  1. Disabled COW on the test data file (btrfs defaults to copy-on-write, which is not very fast for big files); see the sketch after this list;

  2. Set the dirty page writeback interval to 1 second (one second after the first dirty page appears, all dirty pages are written to disk):

    echo 100 > /proc/sys/vm/dirty_writeback_centisecs
    echo 100 > /proc/sys/vm/dirty_expire_centisecs
    
  3. After the "write part" I dropped the disk caches:

    echo 3 > /proc/sys/vm/drop_caches
    
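For step 1, something along these lines should work on btrfs (paths are illustrative; the No_COW attribute has to be set before a file contains any data, so it is easiest to set it on a fresh directory and let new files inherit it):

mkdir /tmp/shelve-bench
chattr +C /tmp/shelve-bench   # new files created inside inherit No_COW
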

Python:

  • Python 3.7.1
  • Run in an IPython notebook

Setup code:

import os, base64, random, pickle
import shelve

# Create 100 1 MB items to save (generating 50k of these takes a lot
# of time and would blur the results); the keys will still be unique.
items = [os.urandom(1024 * 1024) for ii in range(100)]

Write part of benchmark:

ENTRIES = 50_000
keys = []

with shelve.open(
        "/tmp/test-data", protocol=pickle.HIGHEST_PROTOCOL, writeback=False
) as test_data:

    for ii in range(ENTRIES):
        if ii % 250 == 0:
            print(ii)
            # Sync data so we don't measure RAM performance.
            test_data.sync()
        key = base64.b64encode(os.urandom(16)).decode('ascii')
        keys.append(key)
        test_data[key] = items[ii % len(items)]

Read part of benchmark:

random.shuffle(keys)

with shelve.open("/tmp/test-data") as test_data:

    for key in keys:
        value = test_data[key]
        # decode step is just to make 100% sure that data is actually read
        # from disk.
        value.decode('ascii', errors='ignore')
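
For completeness, a minimal sketch of how either loop can be timed (this is not part of the original measurement code; it assumes ENTRIES from the write part is in scope):

import time

start = time.perf_counter()
# ... run the write or read loop from above here ...
elapsed = time.perf_counter() - start
print(f"{elapsed:.0f} s total, {ENTRIES / elapsed:.0f} entries/s, "
      f"~{ENTRIES / elapsed:.0f} MB/s at 1 MB per entry")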

Results:

  1. Write part of the benchmark took about 5 minutes, which is about 150 entries per second, i.e. about 150 MB per second --- way faster than I needed.

    This is also consistent with iotop, which reported about 200-300 MB/s of disk writes during the test.

  2. Read part took about 15 minutes (which is still way faster than I need).

The Linux kernel used about 50% of available RAM as a disk buffer (which is more than will be available to this process on the production system), so these results are not totally representative --- however, since in my case an entry will be added to the cache only every couple of seconds, the caching overhead will nevertheless be negligible.