reStructuredText Viewer

This week's project was implementing my own viewer and editor for reStructuredText documents. In the past I have used the Online reStructuredText editor to quickly preview and edit README files before publishing them on the Python Package Index. While I have been happy with the online editor, it recently had some stability issues. When I tried to deploy my own instance, I was surprised to find a dependency on Redis.

Not wanting to deploy an instance of Redis alongside a web application, I decided it was a good opportunity to learn something about docutils. It's the basis for the Sphinx documentation generator, a tool I use regularly. My viewer is a web application built using the Flask web framework. The application acts as a docutils publisher, accepting reStructuredText documents and returning rendered HTML. As with other little web applications I've written, the viewer is deployed on Heroku. The source code is available from GitHub.
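To give a feel for how small such a publisher can be, here is a minimal sketch. It is not the viewer's actual source: the `/render` route, the `render_rst` helper, and the choice to disable file inclusion and raw HTML are all assumptions made for illustration. It relies only on `docutils.core.publish_parts` and a plain Flask view.

```python
from docutils.core import publish_parts
from flask import Flask, request

app = Flask(__name__)


def render_rst(source):
    """Render a reStructuredText string to an HTML fragment."""
    parts = publish_parts(
        source=source,
        writer_name="html",
        # Assumption: input is untrusted, so refuse to include local
        # files or pass raw HTML through.
        settings_overrides={
            "file_insertion_enabled": False,
            "raw_enabled": False,
        },
    )
    # "html_body" is the rendered document without the <html>/<head> wrapper.
    return parts["html_body"]


@app.route("/render", methods=["POST"])  # hypothetical endpoint name
def render():
    # The request body is the raw reStructuredText document.
    return render_rst(request.get_data(as_text=True))
```

Posting a README to `/render` returns the HTML fragment, which a page can drop into a preview pane.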

So feel free to give my reStructuredText Viewer a try. But please be kind about the aesthetics; web design is not my strong suit.


What not to do with ZeroMQ

ZeroMQ is a great networking library, and the PyZMQ package makes that greatness accessible from Python. This week, however, I encountered an implementation pattern that is incompatible with ZeroMQ.

For "reasons", I had wanted to use ZeroMQ inside a process in a way that was blind to process forks. Unfortunately, if a child interacts in any way with a ZeroMQ context inherited from its parent, including attempting to close it, ZeroMQ will likely terminate with an assertion failure. Compounding this, not being able to close the context means leaking file descriptors. The worst-case scenario is a child that does some work and then forks, with the parent exiting while the new child repeats the sequence.
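A common way to at least notice a fork is to remember which PID created a resource. The sketch below is a hypothetical helper, not part of PyZMQ: `PerProcess` and its `leaked` list are names invented here. It builds a fresh resource the first time it is asked in a new process and parks the inherited one without calling anything on it. It avoids touching the parent's context, but it does nothing about the leaked file descriptors.

```python
import os


class PerProcess:
    """Hand out one resource per process, detecting forks by PID.

    Hypothetical helper for illustration; not part of PyZMQ.
    """

    def __init__(self, factory):
        self._factory = factory   # e.g. zmq.Context (assumption)
        self._owner_pid = None    # PID that created the current resource
        self._resource = None
        self.leaked = []          # inherited resources we refuse to touch

    def get(self):
        pid = os.getpid()
        if self._owner_pid != pid:
            if self._resource is not None:
                # Inherited from the parent: never call a method on it,
                # not even close. Keeping a reference means it is never
                # finalized here, so its file descriptors stay open.
                self.leaked.append(self._resource)
            self._resource = self._factory()
            self._owner_pid = pid
        return self._resource
```

With PyZMQ this might be used as `contexts = PerProcess(zmq.Context)` followed by `contexts.get().socket(zmq.PUSH)` wherever a socket is needed.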

Here is a silly example that exercises a worst case.

import os
import sys
import zmq

ADDRESS = 'tcp://127.0.0.1:5555'
MAX_FORKS = 4096

# the original parent creates a zeromq context and socket
ctx = zmq.Context()
sock = ctx.socket(zmq.PULL)
sock.bind(ADDRESS)

forked = 0
if os.fork() != 0:
    # this parent listens for messages from its children
    while forked < MAX_FORKS:
        forked = sock.recv_json()
        sys.stdout.write('.')
        sys.stdout.flush()
    sys.exit(0)
else:
    # the children discard the parent's context,
    # open a zeromq socket and send a message
    # to the first parent
    while forked < MAX_FORKS:
        forked += 1
        del ctx, sock
        ctx = zmq.Context()
        sock = ctx.socket(zmq.PUSH)
        sock.connect(ADDRESS)
        sock.send_json(forked)
        # after sending a message, this child forks and exits,
        # leaving a new child to send the next message
        if os.fork() != 0:
            sys.exit(0)
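Descriptor exhaustion is easy to watch for. As a rough sketch, assuming a Linux or macOS system where `/dev/fd` is available, the hypothetical helper below compares the number of open descriptors with the soft limit reported by the standard `resource` module. Calling it in each generation of a fork chain is one way to check whether the count creeps toward the limit.

```python
import os
import resource


def fd_usage():
    """Return (soft_limit, open_count) for the current process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # /dev/fd lists this process's descriptors on Linux and macOS
    # (assumption: it is mounted). The listing itself briefly opens
    # one descriptor, so the count can be one too high.
    open_count = len(os.listdir("/dev/fd"))
    return soft, open_count


soft, used = fd_usage()
# A soft limit equal to resource.RLIM_INFINITY means "no limit".
print("using %d of %d file descriptors" % (used, soft))
```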

Honestly, this is a terrible design when using ZeroMQ, as evidenced by its inevitable failure from running out of file descriptors. Sadly, I also have a valid reason for wanting to be robust against this design. So my choices include:

  1. Do not support networking in children. (A valid but regrettable limitation)
  2. Apply a different networking solution. (Yak shaving)

Even if an atfork module that mirrored the existing atexit were added to Python, it might not resolve this issue for me. The proposed atfork implementation is Python only. Under CPython it would be oblivious to any forks made from inside C extensions.
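For a concrete picture of such a hook: CPython 3.7 and later expose `os.register_at_fork`, which runs callbacks around forks that return control to the interpreter. The sketch below is only an illustration. The `state` dictionary and the placeholder string standing in for a ZeroMQ context are assumptions. The caveat about forks made inside C extensions still applies unless the extension calls the interpreter's fork hooks itself.

```python
import os

# Hypothetical module-level state; the string is a stand-in for a
# real zmq.Context created in the parent.
state = {"ctx": "context-created-in-parent", "inherited": []}


def _after_fork_in_child():
    # Runs in the child right after os.fork() returns there.
    # Park the inherited object without calling anything on it.
    state["inherited"].append(state["ctx"])
    # None signals that this process must create its own context.
    state["ctx"] = None


os.register_at_fork(after_in_child=_after_fork_in_child)
```

The parent's `state` is untouched; only the child sees `state["ctx"]` reset.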

ZeroMQ is still great. It's just not intended to work in this situation.