Saturday, April 11, 2015

Python, the web, and snake oil - part 2

While my previous post was cathartic to write, it was not useful. In the hours that followed I became aware of others who share those same feelings. Through some online conversation I found a few very good solutions, further distilled my thoughts, and found some great resources that deserve to be shared.

First, I would strongly urge everyone working in, around or near a Python application to watch this talk. All of it. It just keeps getting better and more specific the deeper Glyph gets into it. Although it was given as a DjangoCon keynote, it applies broadly.

Watching this and speaking briefly with Glyph helped me distill my thoughts.

Your Python application should be a Python application, not a plugin for a web server.


Your web server should be something you can import. Your application should not be something imported by a web server. This is an important distinction, and a familiar one: it is the distinction between a library and a framework. Having a web server import your application turns it into a peg that must be properly shaped to fit into its corresponding hole. Over time, the effects of various third-party libraries (e.g. something importing lxml) on the peg's shape become harder to control and predict. Flip this over and force the web server to be a properly behaved unit of Python which may be used like any other unit of Python: imported, tested, etc.

Developers, this will demystify deployment. The magic that happens in production will suddenly be attainable inside your development virtual environment. There will be fewer (or no) surprises. Rather than fighting with some strange piece of software written in C, you will be doing what you've always been doing: installing a dependency and using it.

SREs, this will help you get out the door at 5PM and maybe sleep through a few more nights. What developers do locally in development will work in production. Re-read that a few times. That this even needs saying reflects the sad current state of affairs in so many deployment scenarios, and we've all sat back and accepted it! How many times have you issued a rollback because production and development behave completely differently? It won't solve every single one of these issues, but it will help enough that it warrants attention. By allowing the development environment to closely parallel the production environment, developers will be solving production problems for you before those problems make it into production and wreak havoc.

There are several WSGI containers to choose from that are well-behaved Python modules.


There are several, including cherrypy and twisted.web. I am currently swooning over twisted.web's WSGI container. Now sure, I said above that your web server should be something you can import, and the docs for these show examples of running a WSGI application in a manner that is slightly different. However, these WSGI containers (and some others) are well-behaved Python applications backed by well-behaved (and directly usable) Python packages. You can write your own script that imports the WSGI container and starts serving your application. When push comes to shove, you can treat the container like any other library, like real Python. There's no mystical loader machinery to work around. Want to know what twistd is doing when you tell it to run your app? It's right here, in Python.
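To make "a script that imports the WSGI container" concrete, here is a minimal sketch using twisted.web, assuming Twisted is installed. The `application` callable is a hypothetical stand-in for your real WSGI app, and the `serve` helper and port 8080 are my own illustrative choices, not anything the Twisted docs prescribe.

```python
def application(environ, start_response):
    """Hypothetical stand-in WSGI app; a real project would import its own."""
    body = b"Hello from a plain Python process\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def serve(app, port=8080):
    """Serve a WSGI callable with twisted.web until the reactor stops."""
    # Imported inside the function so that importing this module (for
    # example in a test suite) does not require Twisted to be installed.
    from twisted.internet import reactor
    from twisted.web.server import Site
    from twisted.web.wsgi import WSGIResource

    # WSGI calls block, so Twisted runs them on the reactor's thread pool.
    resource = WSGIResource(reactor, reactor.getThreadPool(), app)
    reactor.listenTCP(port, Site(resource))
    reactor.run()  # blocks until the process is told to stop
```

Calling `serve(application)` from a two-line script starts the server. Everything involved is an ordinary import, so the same code path can be exercised from a test, a debugger, or a REPL.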

You will have to do a little bit of work, and you will have to understand what the web server is doing.


And that's a really good thing. You should know what your web server is doing. Application developers may have to look at some documentation or code for a few minutes before properly initializing a WSGI container and serving it. Someone will have to take the time to write something slightly more sophisticated than app.run() in your Flask app, but it will only take a few lines and a few minutes to do so, and then you are developing on production infrastructure.

On the SRE side, there may be slightly more work as well. You might have to use monit or supervisord to run a process for each core. But this means you are explicitly in control of the process model of the web server. Rather than let declarative configuration options rigidly choose between a handful of ways to manage processes, you use a battle-tested tool you are comfortable with to precisely control the process model of the web application.

The entry point into the application can be exactly the same whether I am running a development server on my laptop or behind a load balancer in production. This will eliminate a whole class of unknowns.
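One way to picture a shared entry point is the sketch below: a single `main()` whose only environment-specific input is configuration. The `APP_HOST` and `APP_PORT` variable names are hypothetical, and the standard library's wsgiref server is used purely as a dependency-free stand-in for whichever importable container you actually run; it is not meant for production.

```python
import argparse
import os
from wsgiref.simple_server import make_server


def application(environ, start_response):
    """Hypothetical stand-in WSGI app."""
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"same entry point everywhere\n"]


def parse_args(argv=None, env=os.environ):
    """Read host and port from flags, falling back to environment variables."""
    parser = argparse.ArgumentParser(description="Serve the WSGI application.")
    parser.add_argument("--host", default=env.get("APP_HOST", "127.0.0.1"))
    parser.add_argument("--port", type=int,
                        default=int(env.get("APP_PORT", "8080")))
    return parser.parse_args(argv)


def main(argv=None):
    """The one entry point used on a laptop and behind the load balancer."""
    args = parse_args(argv)
    # Stand-in server: swap in your real container here, in one place.
    with make_server(args.host, args.port, application) as httpd:
        httpd.serve_forever()
```

On a laptop, `main([])` listens on 127.0.0.1:8080. In production, a process supervisor could start one copy per core and give each a different `APP_PORT`, with no change to the code being run.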

This little bit of work is up-front and one-time only. As the saying goes, an ounce of prevention is worth a pound of cure.
