## Ghosts of Frameworks Past

By anders pearson 25 Nov 2017

Recently, Github added the ability to archive repositories. That prompted me to dig up some code that I wrote long ago. Stuff that really shouldn’t be used anymore but that I’m still proud of. In general, this got me reminiscing about old projects and I thought I’d take a moment to talk about a couple of them here, which are now archived on Github. Both are web frameworks that I wrote in Python and represent some key points in my personal programming history.

I started writing Python in about 2003, after spending the previous five years or so working mostly in Perl. Perl was an important language for me and served me well, but around that point in time, it felt like every interesting new project I saw was written in Python. I started using Python on some non web-based applications and immediately liked it.

Back then, many smaller dynamic sites on the web were still using CGI. Big sites used Java but it was heavy-weight and slow to develop in. MS shops used ASP. PHP was starting to gain some popularity and there were a ton of other options like Cold Fusion that had their own niches. Perl CGI scripts running on shared Linux hosts were still super popular. With Perl, if you needed a little more performance than CGI scripts offered, if you ran your own Apache instance, you could install mod_perl and see a pretty nice boost (along with some other benefits). Once you outgrew simple guestbooks and form submission CGI scripts, you needed a bit more structure to your app. Perl had a number of templating libraries, rudimentary ORMs, and routing libraries. Everyone generally picked their favorites and put together a basic framework. Personally, I used CGI::Application, Class::DBI, and HTML::Template. It was nowhere near as nice as frameworks that would come later like Ruby on Rails or Django, but at the time, it felt pretty slick. I could develop quickly and keep the code pretty well structured with cleanly separated models, views, and templates.

Python wasn’t really big in the web world yet. There was Zope, which had actually been around for quite a while and had proven itself to be quite capable. Zope was… different though. Like, alien technology different. It included an object database and basically took a completely different approach to solving pretty much every problem. Later, I would spend quite a bit of time with Zope and Plone, and I have a great deal of respect for it, but as a newcomer to Python, it was about ten steps too far.

So, in the meantime, I did the really predictable programmer thing and just ported the stuff I missed from Perl to Python so I could write applications the same way, but in a language that I had grown to prefer.

Someone else had already made a port of HTML::Template, htmltmpl. It worked, but I did have to make some fixes to get it working the way I liked.

Similarly, Ian Bicking had written SQLObject, which was a very capable ORM for the time.

My major undertaking was cgi_app, which was a port of CGI::Application, routing HTTP requests to methods on a Python object, handling the details of rendering templates, and providing a reasonable interface to HTTP parameters and headers.

It looks pretty basic compared to anything modern, and there are some clear PEP8 violations, but I remember it actually being pretty straightforward and productive to work with. I could write something like

#!/usr/bin/python

from cgi_app import CGI_Application

class Example_App(CGI_Application):
def setup(self):
self.start_mode = 'welcome'

def welcome(self):
return self.template("welcome.tmpl",{})

def param_example_form(self):
return self.template("param_example_form.tmpl",{})

def param_example(self):
name = self.param('name')
return self.template("param_example.tmpl",{'name' : name})

if __name__ == "__main__":
e = Example_App()
e.run()

drop that file in a cgi-app directory along with some templates and have a working app. It was good enough that I could use it for work and I built more than a few sites with it (including at least one version of this site).

I quickly added support for other template libraries and mod_python, the Python equivalent to mod_perl.

I don’t remember ever really publicising it beyond an announcement here and I didn’t think anyone else was really using it. Shockingly recently though, I got an email offering me some consulting work for a company that had built their product on it, were still using it, and wanted to make some changes. That was a weird combination of pride and horror…

By around 2006 or 2007, the landscape had changed dramatically. Ruby on Rails 1.0 was released in 2005 and turned everything on its head. Other Python developers like me had their own little frameworks and were writing code but there wasn’t really any consensus. Django, TurboGears, web.py, and others all appeared and had their fans, but none were immediately superior to the others and developer resources and mindshare were split between them. In the meantime, RoR was getting all the press and developer attention. It was a period of frustration for the Python community.

Out of that confusion and frustration came something pretty neat though. A few years earlier, PJE had come up with WSGI, the Web Server Gateway Interface, but it hadn’t really taken off. WSGI was a very simple standard for allowing different web servers and applications to all communicate through a common interface. At some point the Python community figured out that standardizing on WSGI would allow for better integration, more portability, simpler deployment, and allowed for chaining “middleware” components together in clever ways. Most of the competing frameworks quickly added WSGI support. Django took a little longer but eventually came around. The idea was so good that it eventually was adopted by other languages (eg, Rack for Ruby).

JSON had mostly pushed out XML for data serialization and web developers were beginning to appreciate RESTful architecture. I began thinking about my own applications in terms of small, extremely focused REST components that could be stitched together into a full application. I called this approach “microapps” and was fairly evangelical about it, even buying ‘microapps.org’ and setting it up as a site full of resources (I eventually let that domain lapse and it was bought by spammers so don’t try going there now). Other smart folks like Ian Bicking were thinking in similar ways it seemed like we were really onto something. Maybe we were. I don’t really want to take any credit for the whole “microservices” thing that’s popular now, but I do feel like I see echoes of the same conversations happening again (and we were just rehashing SOA and its predecessors; everything comes back around).

Eventually, after writing quite a few of these (mostly in TurboGears), almost always backed by a fairly simple model in a relational database, I began to tire of implementing the same sort of glue logic, mapping HTTP verbs to basic CRUD operations in the database. So I wrote a little framework. This was Bourbon (WSGI can be pronounced “whiskey”).

Bourbon let me essentially define a single database table (using SQLALchemy, which was the new cool Python ORM) and expose a REST+JSON endpoint for it (GET to SELECT, POST or PUT to INSERT, PUT to UPDATE, and DELETE to DELETE) with absolutely minimal boilerplate. There were hooks where you could add additional functionality, but you got all of that pretty much out of the box by defining your schema and URL patterns.

For a number of reasons, I mostly switched to Django shortly after that, so I never went too far with it, but it was simple and clear and super useful for prototyping.

In recent years, I’ve built a few apps using Django Rest Framework and it has a similar feel (just way more complete).

I don’t have a real point here. I just felt like reminiscing about some old code. I’ve managed to resist writing a web framework for the last decade, so at least I’m improving.

## Late to the Party

By anders pearson 30 Sep 2017

A little behind the curve on this one (hey, I’ve been busy), but as of today, every Python app that I run is now on Python3. I’m sure there’s still Python2 code running here and there which I will replace when I notice it, but basically everything is upgraded now.

## Samlare: expvar to Graphite

By anders pearson 24 Sep 2017

One of my favorite bits of the Go standard library is expvar. If you’re writing services in Go you are probably already familiar with it. If not, you should fix that.

Expvar just makes it dead simple to expose variables in your program via an easily scraped HTTP and JSON endpoint. By default, it exposes some basic info about the runtime memory usage (allocations, frees, GC stats, etc) but also allows you to easily expose anything else you like in a standardized way.

This makes it easy to pull that data into a system like Prometheus (via the expvarCollector) or watch it in realtime with expvarmon.

I still use Graphite for a lot of my metrics collection and monitoring though and noticed a lack of simple tools for geting expvar metrics into Graphite. Peter Bourgon has a Get to Graphite library which is pretty nice but it requires that you add the code to your applications, which isn’t always ideal.

I just wanted a simple service that would poll a number of expvar endpoints and submit the results to Graphite.

So I made samlare to do just that.

You just run:

$samlare -config=/path/to/config.toml With a straightforward TOML config file that would look something like: CarbonHost = "graphite.example.com" CarbonPort = 2003 CheckInterval = 60000 Timeout = 3000 [[endpoint]] URL = "http://localhost:14001/debug/vars" Prefix = "apps.app1" [[endpoint]] URL = "http://localhost:14002/debug/vars" Prefix = "apps.app2" FailureMetric = "apps.app2.failure" That just tells samlare where your carbon server (the part of Graphite that accepts metrics) lives, how often to poll your endpoints (in ms), and how long to wait on them before timing out (in ms, again). Then you specify as many endpoints as you want. Each is just the URL to hit and what prefix to give the scraped metrics in Graphite. If you specify a FailureMetric, samlare will submit a 1 for that metric if polling that endpoint fails or times out. There are more options as well for renaming metrics, ignoring metrics, etc, that are described in the README, but that’s the gist of it. Anyone else who is using Graphite and expvar has probably cobbled something similar together for their purposes, but this has been working quite well for me, so I thought I’d share. As a bonus, I also have a package, django-expvar that makes it easy to expose an expvar compatible endpoint in a Django app (which samlare will happily poll). ## 2016 Music By anders pearson 01 Jan 2017 Once again, I can’t be bothered to list my top albums of the year, but here’s a massive list of all the music that I liked this year (that I could find on Bandcamp). No attempt at ranking, no commentary, just a firehose of weird, dark music that I think is worth checking out. ## Using Make with Django By anders pearson 03 Mar 2016 I would like to share how I use a venerable old technology, GNU Make, to manage the common tasks associated with a modern Django project. As you’ll see, I actually use Make for much more than just Django and that’s one of the big advantages that keeps pulling me back to it. Having the same tools available on Go, node.js, Erlang, or C projects is handy for someone like me who frequently switches between them. My examples here will be Django related, but shouldn’t be hard to adapt to other kinds of codebases. I’m going to assume basic unix commandline familiarity, but nothing too fancy. Make has kind of a reputation for being complex and obscure. In fairness, it certainly can be. If you start down the rabbit hole of writing complex Makefiles, there really doesn’t seem to be a bottom. Its reputation isn’t helped by its association with autoconf, which does deserve its reputation for complexity, but we’ll just avoid going there. But Make can be simple and incredibly useful. There are plenty of tutorials online and I don’t want to repeat them here. To understand nearly everything I do with Make, this sample stanza is pretty much all you need: ruleA: ruleB ruleC command1 command2 A Makefile mainly consists of a set of rules, each of which has a list of zero or more prerequisites, and then zero or more commands. At a very high level, when a rule is invoked, Make first invokes any prerequisite rules (if needed), which may have their own stanzas defined elsewhere) and then runs the designated commands. There’s some nonsense having to do with requiring actual TAB characters in there instead of spaces for indentation and more syntax for comments (lines starting with #) and defining and using variables, but that’s the gist of it right there. So you can put something extremely simple (and pointless) like say_hello: echo make says hello in a file called Makefile, then go into the same directory and do: $ make say_hello
echo make says hello
make says hello

Not exciting, but if you have some complex shell commands that you have to run on a regular basis, it can be convenient to just package them up in a Makefile so you don’t have to remember them or type them out every time.

Things get interesting when you add in the fact that rules are also interpreted as filenames and Make is smart about looking at timestamps and keeping track of dependencies to avoid doing more work than necessary. I won’t give trivial examples to explain that (again, there are other tutorials out there), but the general interpretation to keep in mind is that the first stanza above specifies how to generate or update a file called ruleA. Specifically, other files, ruleB and ruleC need to have first been generated (or are already up to date), then command1 and command2 are run. If ruleA has been updated more recently than ruleB and ruleC, we know it’s up to date and nothing needs to be done. On the other hand, if the ruleB or ruleC files have newer timestamps than ruleA, ruleA needs to be regenerated. There will be meatier examples later, which I think will clarify this.

The original use-case for Make was handling complex dependencies in languages with compilers and linkers. You wanted to avoid re-compiling the entire project when only one source file changed. (If you’ve worked with a large C++ project, you probably understand why that would suck.) Make (with a well-written Makefile) is very good at avoiding unnecessary recompilation while you’re developing.

So back to Django, which is written in Python, which is not a compiled language and shouldn’t suffer from problems related to unnecessary and slow recompilation. Why would Make be helpful here?

While Python is not compiled, a real world Django project has enough complexity involved in day to day tasks that Make turns out to be incredibly useful.

A Django project requires that you’ve installed at least the Django python library. If you’re actually going to access a database, you’ll also need a database driver library like psycopg2. Then Django has a whole ecosystem of pluggable applications that you can use for common functionality, each of which often require a few other miscellaneous Python libraries. On many projects that I work on, the list of python libraries that get pulled in quickly runs up into the 50 to 100 range and I don’t believe that’s uncommon.

I’m a believer in repeatable builds to keep deployment sane and reduce the “but it works on my machine” factor. So each Django project has a requirements.txt that specifies exact version numbers for every python library that the project uses. Running pip install -r requirements.txt should produce a very predictable environment (modulo entirely different OSes, etc.).

Working on multiple projects (between work and personal projects, I help maintain at least 50 different Django projects), it’s not wise to have those libraries all installed globally. The Python community’s standard solution is virtualenv, which does a nice job of letting you keep everything separate. But now you’ve got a bunch of virtualenvs that you need to manage.

I have a seperate little rant here about how the approach of “activating” virtualenvs in a particular shell environment (even via virtualenvwrapper or similar tools) is an antipattern and should be avoided. I’ll spare you most of that except to say that what I recommend instead is sticking with a standard location for your project’s virtualenv and then just calling the python, pip, etc commands in the virtualenv via their path. Eg, on my projects, I always do

$virtualenv ve at the top level of the project. Then I can just do: $ ./ve/bin/python ...

and know that I’m using the correct virtualenv for the project without thinking about whether I’ve activated it yet in this terminal (I’m a person who tends to have lots and lots of terminals open at a given time and often run commands from an ephemeral shell within emacs). For Django, where most of what you do is done via manage.py commands, I actually just change the shebang line in manage.py to #!ve/bin/python so I can always just run ./manage.py do_whatever.

Let’s get back to Make though. If my project has a virtualenv that was created by running something along the lines of:

$virtualenv ve$ ./ve/bin/pip install -r requirements.txt

And requirements.txt gets updated, the virtualenv needs to be updated as well. This is starting to get into Make’s territory. Personally, I’ve encountered enough issues with pip messing up in-place updates on libraries that I prefer to just nuke the whole virtualenv from orbit and do a clean install anytime something changes.

Let’s add a rule to a Makefile that looks like this:

ve/bin/python: requirements.txt
rm -rf ve
virtualenv ve
./ve/bin/pip install -r requirements.txt

If that’s the only rule in the Makefile (or just the first), typing

$make At the terminal will ensure that the virtualenv is all set up and ready to go. On a fresh checkout, ve/bin/python won’t exist, so it will run the three commands, setting everything up. If it’s run at any point after that, it will see that ve/bin/python is more recently updated than requirements.txt and nothing needs to be done. If requirements.txt changes at some point, running make will trigger a wipe and reinstall of the virtualenv. Already, that’s actually getting useful. It’s better when you consider that in a real project, the commands involved quickly get more complicated with specifying a custom --index-url, setting up some things so pip installs from wheels, and I even like to specify exact versions of virtualenv and setuptools in projects I don’t have to think about what might happen on systems with different versions of those installed. The actual commands are complicated enough that I’m quite happy to have them written down in the Makefile so I only need to remember how to type make. It all gets even better again when you realize that you can use ve/bin/python as a prerequisite for other rules. Remember that if a target rule doesn’t match a filename, Make will just always run the commands associated with it. Eg, on a Django project, to run a development server, I might run: $ ./manage.py runserver 0.0.0.0:8001

Instead, I can add a stanza like this to my Makefile:

runserver: ve/bin/python
./manage.py runserver 0.0.0.0:8001

Then I can just type make runserver and it will run that command for me. Even better, since ve/bin/python is a prerequisite for runserver, if the virtualenv for the project hasn’t been created yet (eg, if I just cloned the repo and forgot that I need to install libraries and stuff), it just does that automatically. And if I’ve done a git pull that updated my requirements.txt without noticing, it will automatically update my virtualenv for me. This sort of thing has been incredibly useful when working with designers who don’t necessarily know the ins and outs of pip and virtualenv or want to pay close attention to the requirements.txt file. They just know they can run make runserver and it works (though sometimes it spends a few minutes downloading and installing stuff first).

I typically have a bunch more rules for common tasks set up in a similar fashion:

check: ve/bin/python
./manage.py check

migrate: check
./manage.py migrate

flake8: ve/bin/python
./ve/bin/flake8 project

test: check flake8
./manage.py test

That demonstrates a bit of how rules chain together. If I run make test, check and flake8 are both prerequisites, so they each get run first. They, in turn, both depend on the virtualenv being created so that will happen before anything.

Perhaps you’ve noticed that there’s also a little bug in the ve/bin/python stanza up above. ve/bin/python is created by the virtualenv ve step, but it’s used as the target for the stanza. If the pip install step fails though (because of a temporary issue with PyPI or just a typo in requirements.txt or something), it will still have “succeeded” in that ve/bin/python has a fresher timestamp than requirements.txt. So the virtualenv won’t really have the complete set of libraries installed there but subsequent runs of Make will consider everything fine (based on timestamp comparisons) and not do anything. Other rules that depend on the virtualenv being set up are going to have problems when they run.

I get around that by introducing the concept of a sentinal file. So my stanza actually becomes something like:

ve/sentinal: requirements.txt
rm -rf ve
virtualenv ve
./ve/bin/pip install -r requirements.txt
touch ve/sentinal

Ie, now there’s a zero byte file named ve/sentinal that exists just to signal to Make that the rule was completed successfully. If the pip install step fails for some reason, it never gets created and Make won’t try to keep going until that gets fixed.

My actual Makefile setup on real projects has grown more flexible and more complex, but if you’ve followed most of what’s here, it should be relatively straightforward. In particular, I’ve taken to splitting functionality out into individual, reusable .mk files that are heavily parameterized with variables, which then just get includeed into the main Makefile where the project specific variables are set.

Eg, here is a typical one. It sets a few variables specific to the project, then just does include *.mk.

The actual Django related rules are in django.mk, which is a file that I use across all my django projects and should look similar to what’s been covered here (just with a lot more rules and variables). Other .mk files in the project handle tasks for things like javascript (jshint and jscs checks, plus npm and webpack operations) or docker. They all take default values from config.mk and are set up so those can be overridden in the main Makefile or from the commandline. The rules in these files are a bit more sophisticated than the examples here, but not by much. [I should also point out here that I am by no means a Make expert, so you may not want to view my stuff as best practices; merely a glimpse of what’s possible.]

I typically arrange things so the default rule in the Makefile runs the full suite of tests and linters. Running the tests after every change is the most common task for me, so having that available with a plain make command is nice. As an emacs user, it’s even more convenient since emacs’ default setting for the compile command is to just run make. So it’s always a quick keyboard shortcut away. (I use projectile, which is smart enough to run the compile command in the project root).

Make was originally created in 1976, but I hope you can see that it remains relevant and useful forty years later.

## 2015 Music

By anders pearson 24 Dec 2015

It’s the time of the year when everyone puts out their top albums of 2015 lists. 2015 has been a very good year for the kind of music that I like, so I thought about writing my own up, but decided that I am too lazy to pare it down to just 10 albums or so. Instead, here’s a massive list of 60 albums (and 1 single) that came out in 2015 that I enjoyed. If you like dark, weird music (atmospheric black metal, doom, sludge, noise, and similar), perhaps you might too. I’ve only included albums that are up on Bandcamp (they’re generous with previews and you can buy the album if you like it). There are way too many for me to write about each (and really, what’s the point when you can listen to each of them for yourselves?). I also make no attempt at ranking them, thus they are presented in alphabetical order by artist name.

## A Curated List of London Tube Station Names That Amuse Me

By anders pearson 09 Dec 2015

• Barking
• Chigwell
• Chorleywood
• Cockfosters
• East Ham
• Elephant & Castle
• Goodge Street
• Ickenham
• Upminster
• Uxbridge
• West Ham

Plus Honorable Mention to the Overground station “Wapping”.

## Continuously Deploying Django with Docker

By anders pearson 06 Dec 2015

I run about a dozen personal Django applications (including this site) on some small servers that I admin. I also run a half dozen or so applications written in other languages and other frameworks.

Since it’s a heterogeneous setup and I have a limited amount of free time for side projects, container technology like Docker that lets me standardize my production deployment is quite appealing.

I run a continuous deployment pipeline for all of these applications so every git commit I make to master goes through a test pipeline and ends up deployed to the production servers (assuming all the tests pass).

Getting Django to work smoothly in a setup like this is non-trivial. This post attempts to explain how I have it all working.

### Background

First, some background on my setup. I run Ubuntu 14.04 servers on Digital Ocean. Ubuntu 14.04 still uses upstart as the default init, so that’s what I use to manage my application processes. I back the apps with Postgres and I run an Nginx proxy in front of them. I serve static assets via S3 and Cloudfront. I also use Salt for config management and provisioning so if some of the config files here look a bit tedious or tricky to maintain and keep in sync, keep in mind that I’m probably actually using Salt to template and automate them. I also have a fairly extensive monitoring setup that I won’t go into here, but will generally let me know as soon as anything goes wrong.

I currently have three “application” servers where the django applications themselves run. Typically I run each application on two servers which Nginx load balances between. A few of the applications also use Celery for background jobs and Celery Beat for periodic tasks. For those, the celery and celery beat processes run on the third application server.

My goal for my setup was to be able to deploy new versions of my Django apps automatically and safely just by doing git push origin master (which typically pushes to a github repo). That means that the code needs to be tested, a new Docker image needs to be built, distributed to the application servers, database migrations run, static assets compiled and pushed to S3, and the new version of the application started in place of the old. Preferably without any downtime for the users.

I’ll walk through the setup for my web-based feedreader app, antisocial, since it is one of the ones with Celery processes. Other apps are all basically the same except they might not need the Celery parts.

I should also point out that I am perpetually tweaking stuff. This is what I’m doing at the moment, but it will probably outdated soon after I publish this as I find other things to improve.

### Dockerfile

Let’s start with the Dockerfile:

Dockerfile:

FROM ccnmtl/django.base
RUN apt-get update && apt-get install -y libxml2-dev libxslt-dev
RUN /ve/bin/pip install --no-index -f /wheelhouse -r /wheelhouse/requirements.txt
WORKDIR /app
COPY . /app/
RUN /ve/bin/python manage.py test
EXPOSE 8000
ENV APP antisocial
ENTRYPOINT ["/run.sh"]
CMD ["run"]

Like most, I started using Docker by doing FROM ubuntu:trusty or something similar at the beginning of all my Dockerfiles. That’s not really ideal though and results in large docker images that are slow to work with so I’ve been trying to get my docker images as slim and minimal as possible lately.

Roughly following Glyph’s approach, I split the docker image build process into a base image and a “builder” image so the final image can be constructed without the whole compiler toolchain included. The base and builder images I have published as ccnmtl/django.base and ccnmtl/django.build respectively and you can see exactly how they are made here.

Essentially, they both are built on top of Debian Jessie (quite a bit smaller than Ubuntu images and similar enough). The base image contains the bare minimum while the build image contains a whole toolchain for building wheels out of python libraries. I have a Makefile with some bits like this:

ROOT_DIR:=$(shell dirname$(realpath $(lastword$(MAKEFILE_LIST))))
APP=antisocial
REPO=thraxil
WHEELHOUSE=wheelhouse

$(WHEELHOUSE)/requirements.txt:$(REQUIREMENTS)
mkdir -p $(WHEELHOUSE) docker run --rm \ -v$(ROOT_DIR):/app \
-v $(ROOT_DIR)/$(WHEELHOUSE):/wheelhouse \
ccnmtl/django.build
cp $(REQUIREMENTS)$(WHEELHOUSE)/requirements.txt
touch $(WHEELHOUSE)/requirements.txt build:$(WHEELHOUSE)/requirements.txt
docker build -t $(IMAGE) . So when I do make build, if the requirements.txt has changed since the last time, it uses the build image to generate a directory with wheels for every library specified in requirements.txt, then runs docker build, which can do a very simple (and fast) pip install of those wheels. Once the requirements are installed, it runs the application’s unit tests. I expose port 8000 and copy in a custom script to use as an entry point. ### docker-run.sh That script makes the container a bit easier to work with. It looks like this: #!/bin/bash cd /app/ if [[ "$SETTINGS" ]]; then
export DJANGO_SETTINGS_MODULE="$APP.$SETTINGS"
else
export DJANGO_SETTINGS_MODULE="$APP.settings_docker" fi if [ "$1" == "migrate" ]; then
exec /ve/bin/python manage.py migrate --noinput
fi

if [ "$1" == "collectstatic" ]; then exec /ve/bin/python manage.py collectstatic --noinput fi if [ "$1" == "compress" ]; then
exec /ve/bin/python manage.py compress
fi

if [ "$1" == "shell" ]; then exec /ve/bin/python manage.py shell fi if [ "$1" == "worker" ]; then
exec /ve/bin/python manage.py celery worker
fi

if [ "$1" == "beat" ]; then exec /ve/bin/python manage.py celery beat fi # run arbitrary commands if [ "$1" == "manage" ]; then
shift
exec /ve/bin/python manage.py "$@" fi if [ "$1" == "run" ]; then
exec /ve/bin/gunicorn --env \
DJANGO_SETTINGS_MODULE=$DJANGO_SETTINGS_MODULE \$APP.wsgi:application -b 0.0.0.0:8000 -w 3 \
--access-logfile=- --error-logfile=-
fi

With the ENTRYPOINT and CMD set up that way in the Dockerfile, I can just run

$docker run thraxil/antisocial and it will run the gunicorn process, serving the app on port 8000. Or, I can do: $ docker run thraxil/antisocial migrate

and it will run the database migration task. Similar for collectstatic, compress, celery, etc. Or, I can do:

$docker run thraxil/antisocial manage some_other_command --with-flags to run any other Django manage.py command (this is really handy for dealing with migrations that need to be faked out, etc.) ### docker-runner Of course, all of those exact commands would run into problems with needing various environment variables passed in, etc. The settings_docker settings module that the script defaults to for the container is a fairly standard Django settings file, except that it pulls all the required settings out of environment variables. The bulk of it comes from a common library that you can see here. This gives us a nice twelve-factor style setup and keeps the docker containers very generic and reusable. If someone else wants to run one of these applications, they can pretty easily just run the same container and just give it their own environment variables. The downside though is that it gets a bit painful to actually run things from the commandline, particularly for one-off tasks like database migrations because you actually need to specify a dozen or so -e flags on every command. I cooked up a little bit of shell script with a dash of convention over configuration to ease that pain. All the servers get a simple docker-runner script that looks like: #!/bin/bash APP=$1
shift

IMAGE=
OPTS=
if [ -f /etc/default/$APP ]; then . /etc/default/$APP
fi
TAG="latest"
if [ -f /var/www/$APP/TAG ]; then . /var/www/$APP/TAG
fi

exec /usr/bin/docker run $OPTS$EXTRA $IMAGE:$TAG $@ That expects that every app has a file in /etc/default that defines an $IMAGE and $OPTS variable. Eg, antisocial’s looks something like: /etc/default/antisocial: export IMAGE="thraxil/antisocial" export OPTS="--link postfix:postfix \ --rm \ -e SECRET_KEY=some_secret_key \ -e AWS_S3_CUSTOM_DOMAIN=d115djs1mf98us.cloudfront.net \ -e AWS_STORAGE_BUCKET_NAME=s3-bucket-name \ -e AWS_ACCESS_KEY=... \ -e AWS_SECRET_KEY=... \ ... more settings ... \ -e ALLOWED_HOSTS=.thraxil.org \ -e BROKER_URL=amqp://user:pass@host:5672//antisocial" With that in place, I can just do: $ docker-runner antisocial migrate

And it fills everything in. So I can keep the common options in one place and not have to type them in every time.

(I’ll get to the TAG file that it mentions in a bit)

### upstart

With those in place, the upstart config for the application can be fairly simple:

/etc/init/antisocial.conf:

description "start/stop antisocial docker"

start on filesystem and started docker-postfix and started registrator
stop on runlevel [!2345]

respawn

script
export EXTRA="-e SERVICE_NAME=antisocial -p 192.81.1.1::8000"
exec /usr/local/bin/docker-runner antisocial
end script

The Celery and Celery Beat services have very similar ones except they run celery and beat tasks instead and they don’t need to have a SERVICE_NAME set or ports configured.

### Consul

Next, I use consul, consul-template, and registrator to rig everything up so Nginx automatically proxies to the appropriate ports on the appropriate application servers.

Each app is registered as a service (hence the SERVICE_NAME parameter in the upstart config). Registrator sees the containers starting and stopping and registers and deregisters them with consul as appropriate, inspecting them to get the IP and port info.

consul-template runs on the Nginx server and has a template defined for each app that looks something like:

{{if service "antisocial"}}
upstream antisocial {
{{end}}
}
{{end}}

server {
listen   80;
server_name  feeds.thraxil.org;

client_max_body_size 40M;
{{if service "antisocial"}}
location / {
proxy_pass http://antisocial;
proxy_connect_timeout   2;
proxy_set_header        Host            $host; proxy_set_header X-Real-IP$remote_addr;
}
error_page 502 /502.html;
location = /502.html {
root   /var/www/down/;
}
{{else}}
root /var/www/down/;
try_files $uri /502.html; {{end}} } That just dynamically creates an endpoint for each running app instance pointing to the right IP and port. Nginx then round-robins between them. If none are running, it changes it out to serve a “sorry, the site is down” kind of page instead. Consul-template updates the nginx config and reloads nginx as soon as any changes are seen to the service. It’s really nice. If I need more instances of a particular app running, I can just spin one up on another server and it instantly gets added to the pool. If one crashes or is shut down, it’s removed just as quickly. As long as one there’s at least one instance running at any given time, visitors to the site should never be any the wiser (as long as it can handle the current traffic). That really covers the server and application setup. What’s left is the deployment part. Ie, how it gets from a new commit on master to running on the application servers. ### Jenkins Jenkins is kind of a no-brainer for CI/CD stuff. I could probably rig something similar up with TravisCI or Wercker or another hosted CI, but I’m more comfortable keeping my credentials on my own servers for now. So I have a Jenkins server running and I have a job set up there for each application. It gets triggered by a webhook from github whenever there’s a commit to master. Jenkins checks out the code and runs: export TAG=build-$BUILD_NUMBER
make build
docker push thraxil/antisocial:$TAG $BUILD_NUMBER is a built-in environment variable that Jenkins sets on each build. So it’s just building a new docker image (which runs the test suite as part of the build process) and pushes it to the Docker Hub with a unique tag corresponding to this build.

When that completes successfully, it triggers a downstream Jenkins job called django-deploy which is a parameterized build. It passes it the following parameters:

APP=antisocial
TAG=build-$BUILD_NUMBER HOSTS=appserver1 appserver2 CELERY_HOSTS=appserver3 BEAT_HOSTS=appserver3 These are fairly simple apps that I run mostly for my own amusement so I don’t have extensive integration tests. If I did, instead of triggering django-deploy directly here, I would trigger other jobs to run those tests against the newly created and tagged image first. The django-deploy job runs the following script: #!/bin/bash hosts=(${HOSTS})
chosts=(${CELERY_HOSTS}) bhosts=(${BEAT_HOSTS})

for h in "${hosts[@]}" do ssh$h docker pull ${REPOSITORY}thraxil/$APP:$TAG ssh$h cp /var/www/$APP/TAG /var/www/$APP/REVERT || true
ssh $h "echo export TAG=$TAG > /var/www/$APP/TAG" done for h in "${chosts[@]}"
do
ssh $h docker pull${REPOSITORY}thraxil/$APP:$TAG
ssh $h cp /var/www/$APP/TAG /var/www/$APP/REVERT || true ssh$h "echo export TAG=$TAG > /var/www/$APP/TAG"
done

for h in "${bhosts[@]}" do ssh$h docker pull ${REPOSITORY}thraxil/$APP:$TAG ssh$h cp /var/www/$APP/TAG /var/www/$APP/REVERT || true
ssh $h "echo export TAG=$TAG > /var/www/$APP/TAG" done h=${hosts[0]}

ssh $h /usr/local/bin/docker-runner$APP migrate
ssh $h /usr/local/bin/docker-runner$APP collectstatic
ssh $h /usr/local/bin/docker-runner$APP compress

for h in "${hosts[@]}" do ssh$h sudo stop $APP || true ssh$h sudo start $APP done for h in "${chosts[@]}"
do
ssh $h sudo stop$APP-worker || true
ssh $h sudo start$APP-worker
done

for h in "${bhosts[@]}" do ssh$h sudo stop $APP-beat || true ssh$h sudo start $APP-beat done It’s a bit long, but straightforward. First, it just pulls the new docker image down onto each server. This is done first because the docker pull is usually the slowest part of the process. Might as well get it out of the way first. On each host, it also writes to the /var/www/$APP/TAG file that we saw mentioned back in docker-runner. The contents are just a variable assignment specifying the tag that we just built and pulled and are about to cut over to. The docker-runner script knows to use the specific tag that’s set in that file. Of course, it first backs up the old one to a REVERT file that can then be used to easily roll-back the whole deploy if something goes wrong.

Next, the database migrations and static asset tasks have to run. They only need to run on a single host though, so it just pulls the first one off the list and runs the migrate, collectstatic, and compress on that one.

Finally, it goes host by host and stops and starts the service on each in turn. Remember that with the whole consul setup, as long as they aren’t all shut off at the same time, overall availability should be preserved.

Then, of course, it does the same thing for the celery and celery beat services.

If that all completes successfully, it’s done. If it fails somewhere along the way, another Jenkins job is triggered that basically restores the TAG file from REVERT and restarts the services, putting everything back to the previous version.

### Conclusion and Future Directions

That’s a lot to digest. Sorry. In practice, it really doesn’t feel that complicated. Mostly stuff just works and I don’t have to think about it. I write code, commit, and git push. A few minutes later I get an email from Jenkins telling me that it’s been deployed. Occasionally, Jenkins tells me that I broke something and I go investigate and fix it (while the site stays up). If I need more capacity, I provision a new server and it joins the consul cluster. Then I can add it to the list to deploy to, kick off a Jenkins job and it’s running. I’ve spent almost as much time writing this blog post explaining everything in detail as it took to actually build the system.

Provisioning servers is fast and easy because they barely need anything installed on them besides docker and a couple config files and scripts. If a machine crashes, the rest are unaffected and service is uninterrupted. Overall, I’m pretty happy with this setup. It’s better than the statically managed approach I had before (no more manually editing nginx configs and hoping I don’t introduce a syntax error that takes all the sites down until I fix it).

Nevertheless, what I’ve put together is basically a low rent, probably buggy version of a PaaS. I knew this going in. I did it anyway because I wanted to get a handle on all of this myself. (I’m also weird and enjoy this kind of thing). Now that I feel like I really understand the challenges here, when I get time, I’ll probably convert it all to run on Kubernetes or Mesos or something similar.

## Docker and Upstart

By anders pearson 03 Nov 2015

Docker has some basic process management functionality built in. You can set restart policies and the Docker daemon will do its best to keep containers running and restart them if the host is rebooted.

This is handy and can work well if you live in an all Docker world. Many of us need to work with Docker based services alongside regular non-Docker services on the same host, at least for the near future. Our non-Docker services are probably managed with Systemd, Upstart, or something similar and we’d like to be able to use those process managers with our Docker services so dependencies can be properly resolved, etc.

I haven’t used Systemd enough to have an opinion on it (according to the internet, it’s either the greatest thing since sliced bread or the arrival of the antichrist, depending on who you ask). Most of the machines I work on are still running Ubuntu 14.04 and Upstart is the path of least resistence there and the tool that I know the best.

Getting Docker and Upstart to play nicely together is not quite as simple as it appears at first.

Docker’s documentation contains a sample upstart config:

description "Redis container"
author "Me"
start on filesystem and started docker
stop on runlevel [!2345]
respawn
script
/usr/bin/docker start -a redis_server
end script

That works, but it assumes that the container named redis_server already exists. Ie, that someone has manually, or via some mechanism outside upstart run the docker run --name=redis_server ... command (or a docker create), specifying all the parameters. If you need to change one of those parameters, you would need to stop the upstart job, do a docker stop redis_server, delete the container with docker rm redis_server, run a docker create --name=redis_server ... to make the new container with the updated parameters, then start the upstart job.

That’s a lot of steps and would be no fun to automate via your configuration management or as part of a deployment. What I expect to be able to do with upstart is deploy the necessary dependencies and configuration to the host, drop an upstart config file in /etc/init/myservice.conf and do start myservice, stop myservice, etc. I expect to be able to drop in a new config file and just restart the service to have it go into effect. Letting Docker get involved seems to introduce a bunch of additional steps to that process that just get in the way.

Really, to get Docker and Upstart to work together properly, it’s easier to just let upstart do everything and configure Docker to not try to do any process management.

First, make sure you aren’t starting the Docker daemon with --restart=always. The default is --restart=no and is what we want.

Next, instead of building the container and then using docker start and docker stop even via upstart, we instead want to just use docker run so parameters can be specified right there in the upstart config (I’m going to leave out the description/author stuff from here on out):

start on filesystem and started docker
stop on runlevel [!2345]
respawn
script
/usr/bin/docker run \
-v /docker/host/dir:/data \
redis
end script

This will work nicely. You can stop and start via upstart like a regular system service.

Of course, we would probably like other services to be able to link to it and for that it will need to be named:

start on filesystem and started docker
stop on runlevel [!2345]
respawn
script
/usr/bin/docker run \
--name=redis_server \
-v /docker/host/dir:/data \
redis
end script

That will work.

Once.

Then we run into the issue that anyone who’s used Docker and tried to run named containers has undoubtedly come across. If we stop that and try to start it again, it will fail and the logs will be full of complaints about:

Error response from daemon: Conflict. The name "redis_server" is
already in use by container 9ccc57bfbc3c. You have to delete (or
rename) that container to be able to reuse that name.

Then you have to do the whole dance of removing the container and restarting stuff. So we put a --rm in there:

start on filesystem and started docker
stop on runlevel [!2345]
respawn
script
/usr/bin/docker run \
--rm \
--name=redis_server \
-v /docker/host/dir:/data \
redis
end script

This is much better and will mostly work.

Sometimes, though, the container will get killed without a proper SIGTERM signal getting through and Docker won’t clean up the container. Eg, it gets OOM-killed, or the server is abruptly power-cycled (it seems like sometimes even a normal stop just doesn’t work right). The old container is left there and the next time it tries to start, you run into the same old conflict and have to manually clean it up.

There are numerous Stack Overflow questions and similar out there with suggestions for pre-stop stanzas, etc. to deal with this problem. However, in my experimentation, they all failed to reliably work across some of those trickier situations like OOM-killer rampages and power cycling.

My current solution is simple and has worked well for me. I just add couple more lines added to the script section like so:

script
/usr/bin/docker stop redis_server || true
/usr/bin/docker rm redis_server || true
/usr/bin/docker run \
--rm \
--name=redis_server \
-v /docker/host/dir:/data \
redis
end script

In the spirit of the World’s Funniest Joke, before trying to revive the container, we first make sure it’s dead. The || true on each of those lines just ensures that it will keep going even if it didn’t actually have to stop or remove anything. (I suppose that the --rm is now superfluous, but it doesn’t hurt).

So this is how I run things now. I tend to have two levels of Docker containers: these lower level named services that get linked to other containers, and higher level “application” containers (eg, Django apps), that don’t need to be named, but probably link in one of these named containers. This setup is easy to manage (my CM setup can push out new configs and restart services with wild abandon) and robust.

## Circuit Breaker TCP Proxy

By anders pearson 23 Jul 2015

When you’re building systems that combine other systems, one of the big lessons you learn is that failures can cause chain reactions. If you’re not careful, one small piece of the overall system failing can cause catastrophic global failure.

Even worse, one of the main mechanisms for these chain reactions is a well-intentioned attempt by one system to cope with the failure of another.

Imagine Systems A and B, where A relies on B. System A needs to handle 100 requests per second, which come in at random, but normal intervals. Each of those implies a request to System B. So an average of one request every 10ms for each of them.

System A may decide that if a request to System B fails, it will just repeatedly retry the request until it succeeds. This might work fine if System B has a very brief outage. But then it goes down for longer and needs to be completely restarted, or is just becomes completely unresponsive until someone steps in and manually fixes it. Let’s say that it’s down for a full second. When it finally comes back up, it now has to deal with 100 requests all coming in at the same time, since System A has been piling them up. That usually doesn’t end well, making System B run extremely slowly or even crash again, starting the whole cycle over and over.

Meanwhile, since System A is having to wait unusually long for a successful response from B, it’s chewing up resources as well and is more likely to crash or start swapping. And of course, anything that relies on A is then affected and the failure propagates.

Going all the way the other way, with A never retrying and instead immediately noting the failure and passing it back is a little better, but still not ideal. It’s still hitting B every time while B is having trouble, which probably isn’t helping B’s situation. Somewhere back up the chain, a component that calls system A might be retrying, and effectively hammers B the same as in the previous scenario, just using A as the proxy.

A common pattern for dealing effectively with this kind of problem is the “circuit breaker”. I first read about it in Michael Nygaard’s book Release It! and Martin Fowler has a nice in-depth description of it.

As the name of the pattern should make clear, a circuit breaker is designed to detect a fault and immediately halt all traffic to the faulty component for a period of time, to give it a little breathing room to recover and prevent cascading failures. Typically, you set a threshold of errors and, when that threshold is crossed, the circuit breaker “trips” and no more requests are sent for a short period of time. Any clients get an immediate error response from the circuit breaker, but the ailing component is left alone. After that short period of time, it tentatively allows requests again, but will trip again if it sees any failures, this time blocking requests for a longer period. Ideally, these time periods of blocked requests will follow an exponential backoff, to minimize downtime while still easing the load as much as possible on the failed component.

Implementing a circuit breaker around whatever external service you’re using usually isn’t terribly hard and the programming language you are using probably already has a library or two available to help. Eg, in Go, there’s this nice one from Scott Barron at Github and this one which is inspired by Hystrix from Netflix, which includes circuit breaker functionality. Recently, Heroku released a nice Ruby circuit breaker implementation.

That’s great if you’re writing everything yourself, but sometimes you have to deal with components that you either don’t have the source code to or just can’t really modify easily. To help with those situations, I wrote cbp, which is a basic TCP proxy with circuit breaker functionality built in.

You tell it what port to listen on, where to proxy to, and what threshold to trip at. Eg:

\$ cbp -l localhost:8000 -r api.example.com:80 -t .05

sets up a proxy to port 80 of api.example.com on the local port 8000. You can then configure a component to make API requests to http://localhost:8000/. If more than 5% of those requests fail, the circuit breaker trips and it stops proxying for 1 second, 2 seconds, 4 seconds, etc. until the remote service recovers.

This is a raw TCP proxy, so even HTTPS requests or any other kind of (TCP) network traffic can be sent through it. The downside is that that means it can only detect TCP level errors and remains ignorant of whatever protocol is on top of it. That means that eg, if you proxy to an HTTP/HTTPS service and that service responds with a 500 or 502 error response, cbp doesn’t know that that is considered an error, sees a perfectly valid TCP request go by, and assumes everything is fine; it would only trip if the remote service failed to establish the connection at all. (I may make an HTTP specific version later or add optional support for HTTP and/or other common protocols, but for now it’s plain TCP).