Updated: October 21, 2009
So, some of the best ideas come from thoughts like, "Why can't this be any faster?"
Our first project, which handles peak traffic of 650,000 page views per hour on a $38 slice, runs vanilla apache2 (with some configuration tweaks) with mod_python and memcached. No reverse proxy. No database cluster. Suffice it to say, we've got room to scale vertically if we need to.
By our second project, we wanted to try something new. So, we took our existing apache2 configuration and replaced mod_python with mod_wsgi. We picked up a nice performance boost and cut down dramatically on the amount of RAM each apache process used.
Third project? Needed to be faster still. So, we stuck with mod_wsgi, but ran it in daemon mode so that apache wasn't controlling the number of processes that get spun up under load. Again, some performance gains.
However, none of these approaches gave us the freakish response time and scalability that we wanted for future projects. Enter Tornado. Lightweight, fast, and non-blocking (well, it would be if we weren't using it to serve WSGI), Tornado sounded great.
Here's what I wanted:
The prerequisites: You need to install Tornado, for which there are comprehensive instructions available. You'll need to have an already-working Django project, and you'll need to make sure Apache (or any other Web server/service that might be bigfooting your ports) has been securely knocked out for the proceedings.
First, we need to write a small executable script for each Tornado instance we want. Since Tornado is single-threaded, you'll want to run roughly one instance per core. Bret Taylor mentioned during a tech talk that they've got some machines running more than one instance per core, but YMMV. Here's the script I used, which borrows from this particular snippet.
#!/usr/bin/env python
import os
import tornado.httpserver
import tornado.ioloop
import tornado.wsgi
import sys
import django.core.handlers.wsgi

# Make the folder that contains the Django project importable
sys.path.append('/path/to/your/projects/folder/')

def main():
    # Point Django at the project's settings before building the handler
    os.environ['DJANGO_SETTINGS_MODULE'] = 'foo.settings'
    application = django.core.handlers.wsgi.WSGIHandler()
    # Wrap the WSGI application so Tornado's HTTP server can serve it
    container = tornado.wsgi.WSGIContainer(application)
    http_server = tornado.httpserver.HTTPServer(container)
    http_server.listen(8000)
    tornado.ioloop.IOLoop.instance().start()

if __name__ == "__main__":
    main()
Note that each copy of the script needs its own port; I chose 8000-8003 on our test server. You'll also want to make the scripts executable using chmod +x.
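If keeping four near-identical copies of the script sounds tedious, one option is to read the port from the command line instead. This is only a rough sketch, assuming the same foo.settings module as the script above; port_from_argv and serve are hypothetical names I made up for illustration, not anything Tornado or Django provides.

```python
import os
import sys

DEFAULT_PORT = 8000  # assumption: same default port as the script above


def port_from_argv(argv, default=DEFAULT_PORT):
    """Return the port given as the first command-line argument, or the default."""
    if len(argv) < 2:
        return default
    port = int(argv[1])  # raises ValueError on non-numeric input
    if not 0 < port < 65536:
        raise ValueError("port out of range: %d" % port)
    return port


def serve(port):
    """Start one Tornado instance wrapping the Django WSGI handler on `port`."""
    # Imported here so port_from_argv stays usable without Tornado/Django installed.
    import django.core.handlers.wsgi
    import tornado.httpserver
    import tornado.ioloop
    import tornado.wsgi

    os.environ['DJANGO_SETTINGS_MODULE'] = 'foo.settings'
    application = django.core.handlers.wsgi.WSGIHandler()
    container = tornado.wsgi.WSGIContainer(application)
    tornado.httpserver.HTTPServer(container).listen(port)
    tornado.ioloop.IOLoop.instance().start()
```

With that in place, main() would shrink to serve(port_from_argv(sys.argv)), and each instance could be started with its port as the only argument.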
Next, you'll need to configure your proxy. I chose Nginx because a.) we were running low on Soviet-produced lightweight Web server processes and b.) it has the coolest name. On Ubuntu 8.04 LTS (our distro-of-choice), Nginx is available via apt-get. The only difficult part is the configuration file. This borrows heavily from the Nginx configuration available on the Tornado web site.
user www-data;
worker_processes 1;

error_log /var/log/nginx/error.log;
pid /var/run/nginx.pid;

events {
    worker_connections 1024;
    use epoll;
}

http {
    # Enumerate all the Tornado servers here
    upstream frontends {
        server 127.0.0.1:8000;
        server 127.0.0.1:8001;
        server 127.0.0.1:8002;
        server 127.0.0.1:8003;
    }

    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    access_log /var/log/nginx/access.log;

    keepalive_timeout 65;
    proxy_read_timeout 200;
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    gzip on;
    gzip_min_length 1000;
    gzip_proxied any;
    gzip_types text/plain text/html text/css text/xml
               application/x-javascript application/xml
               application/atom+xml text/javascript;

    # Only retry if there was a communication error, not a timeout
    proxy_next_upstream error;

    server {
        listen 80;

        # Allow file uploads
        client_max_body_size 50M;

        location ^~ /static/ {
            root /var/www;
            if ($query_string) {
                expires max;
            }
        }
        location = /favicon.ico {
            rewrite (.*) /static/favicon.ico;
        }
        location = /robots.txt {
            rewrite (.*) /static/robots.txt;
        }

        location / {
            proxy_pass_header Server;
            proxy_set_header Host $http_host;
            proxy_redirect off;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Scheme $scheme;
            proxy_pass http://frontends;
        }
    }
}
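The upstream block has to list exactly the ports your Tornado scripts listen on, so if you change the number of instances it can be handy to generate that block rather than edit it by hand. Here's a small sketch of the idea; upstream_block is a hypothetical helper of my own, and the frontends name and 127.0.0.1 host are simply taken from the config above.

```python
def upstream_block(ports, name="frontends", host="127.0.0.1"):
    """Render an Nginx upstream block with one server line per port."""
    lines = ["upstream %s {" % name]
    for port in ports:
        lines.append("    server %s:%d;" % (host, port))
    lines.append("}")
    return "\n".join(lines)


# Ports 8000-8003, matching the four Tornado instances used in this post
print(upstream_block(range(8000, 8004)))
```

The printed text can be pasted into the http section in place of the hand-written upstream block.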
Finally, you'll need to start up Nginx through one of the normal methods (e.g., init.d) and then fire up your Tornado scripts. Those can just be executed like this.
python /path/to/your_script_port_number1.py &
python /path/to/your_script_port_number2.py &
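If you'd rather not type one command per instance, a tiny launcher can start them all. This is a minimal sketch using only the standard library; launch and wait_all are hypothetical helper names, and it assumes each script path is one of the per-port scripts described earlier. A proper process supervisor would likely be the sturdier choice once you're past testing.

```python
import subprocess
import sys


def launch(script_paths):
    """Start each script as its own child process; return the Popen handles."""
    return [subprocess.Popen([sys.executable, path]) for path in script_paths]


def wait_all(processes):
    """Block until every child exits; return their exit codes in order."""
    return [proc.wait() for proc in processes]
```

Calling wait_all(launch([...])) with the list of your script paths keeps the launcher in the foreground until every Tornado instance has exited.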
Next post: How the benchmarks fared.