Sunday, March 13, 2011

Mysteriously beautiful: using unicorns to manage Python server processes

When it comes to deploying Python servers (specifically web servers), there is no shortage of choices. Almost everyone agrees that WSGI is the right way to go, but the choice of a good container is less clear. For many projects, Apache + mod_wsgi is a solid choice that covers most common web application deployment scenarios.

The situation becomes more complicated the moment real-time web applications and asynchronous frameworks come into play. The best way to utilize the multiple cores available on a modern server with Python is to run multiple processes. Some frameworks, such as the excellent Tornado web framework, recommend running one process per core and using a reverse proxy such as nginx or haproxy in front of the pool. This deployment scenario has in fact turned out to be very practical, with one exception: managing the process pool is not the most pleasant task. One possible approach is to use a process monitor such as supervisord, while others would just use the pre-forking capabilities built into the framework.
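To make that concrete, here's a minimal sketch of the one-process-per-core Tornado pattern. Each instance binds its own port and the reverse proxy balances across them; the file name and port numbers are assumptions for illustration, not anything Tornado prescribes.

    # tornado_app.py -- hypothetical example; start one instance per core,
    # e.g. on ports 8001..8004, and point nginx/haproxy at all of them.
    import tornado.ioloop
    import tornado.web
    from tornado.options import define, options, parse_command_line

    define("port", default=8001, type=int, help="port to listen on")

    class MainHandler(tornado.web.RequestHandler):
        def get(self):
            self.write("Hello from the worker on port %d\n" % options.port)

    if __name__ == "__main__":
        parse_command_line()
        app = tornado.web.Application([(r"/", MainHandler)])
        app.listen(options.port)  # one port per process
        tornado.ioloop.IOLoop.instance().start()

With this layout, adding or removing a worker means starting or killing a process by hand, which is exactly the chore a tool like gunicorn takes off your plate.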

Regardless of the nature of your app, pre-forking is probably your best choice; however, not all pre-forking applications are equal. If you're working on an application that is constantly being hammered at around 1000 requests per second, you'll want a more reliable process management tool. gunicorn (green unicorn) fits the bill. It's extremely simple to set up and configure: most of the configuration is handled in a simple Python file, and its support for various containers, including async frameworks such as gevent and Tornado, is excellent.
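For the simplest case, there's very little to it. Here's a sketch, assuming a module called myapp containing a plain WSGI callable; the module name and worker counts below are illustrative, not recommendations:

    # myapp.py -- a minimal WSGI app for gunicorn to serve
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"Hello from a unicorn-managed worker\n"]

    # Launch with one of:
    #   gunicorn --workers 4 myapp:app                        # sync workers
    #   gunicorn --workers 4 --worker-class gevent myapp:app  # async via gevent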

One of the most intriguing and useful options it offers is the ability to specify the number of requests a worker process may serve before being "retired" in favour of a fresh one. In the gunicorn model, a master process is launched whose sole purpose is to manage the child worker processes. Just as humans need to take a break after a long period of intense work, so do Python processes. Although it seems a bit "hacky", it's not an idea without merit: recycling workers keeps slow memory leaks and accumulated cruft from ever getting out of hand. The world is not perfect, and neither is Python, nor its memory management, and, let's be honest, nor is your code!

You can enable this auto-restart behaviour by setting the max_requests configuration parameter. Furthermore, if a worker process happens to die for any other reason, the unicorns come to the rescue and make sure n workers are always running, as specified by the workers configuration parameter.
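Putting it together, a gunicorn config file is just Python. A minimal sketch (the file name and values here are assumptions, not tuned recommendations):

    # gunicorn.conf.py -- loaded with: gunicorn -c gunicorn.conf.py myapp:app
    bind = "127.0.0.1:8000"
    workers = 4             # the master keeps exactly this many workers alive
    worker_class = "gevent"
    max_requests = 1000     # retire each worker after serving this many requests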
It's worth giving these mythical creatures a chance!
