XOR Media

Coding, Operations, Etc.

High Performance Web: Asynchronous HTTP

Posted on Mon 02 September 2013 by Ross McFarland


The secret to building high performance sites that depend on external web services is asynchronous HTTP. The trick to asynchronous HTTP (or anything asynchronous, with the exception of UI) is to avoid callback hell. Enter futures, also known as promise objects. Used correctly, they make substantial asynchronous IO relatively straightforward.

It took me only a week or two of my tenure at Amazon to run across the mechanisms that allow its home page to depend on dozens of external (web) service calls without falling over and devolving into utter chaos. Simplified for explanation, page generation consists of two major phases. The first allows the components of the page to fire off their individual requests for data. Then comes page rendering. During this phase the pieces of the page are rendered individually, waiting if necessary, though ideally each section's data is ready and waiting when its turn comes.
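
To make the two phases concrete, here is a minimal sketch using only Python's concurrent.futures. It is not Amazon's framework: fetch_section, build_page, the section names, and the delays are all made up for illustration, with time.sleep standing in for real service calls.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_section(name, delay):
    """Hypothetical stand-in for one page component's web service call."""
    time.sleep(delay)  # simulate network latency
    return '<div>{0}</div>'.format(name)


def build_page(executor):
    # Phase one: every component fires off its request without waiting.
    pending = [
        executor.submit(fetch_section, 'header', 0.05),
        executor.submit(fetch_section, 'recommendations', 0.2),
        executor.submit(fetch_section, 'footer', 0.1),
    ]
    # Phase two: render in page order, blocking only when a section's
    # data has not arrived yet.
    return '\n'.join(future.result() for future in pending)


start = time.time()
with ThreadPoolExecutor(max_workers=3) as executor:
    page = build_page(executor)
elapsed = time.time() - start

print(page)
print('rendered in {0:.2f}s'.format(elapsed))
```

The sections come back in page order regardless of which call finishes first, and with one worker per call the total wait is roughly that of the slowest call rather than the sum of all three.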

Reality Check

I'll let you in on a secret. You're not building Amazon.com, in either complexity or scale. If you get there you'll have a whole team to build the sorts of frameworks they have to make what they do possible. What you can do is leverage a simplified version of the same concept to improve the performance and stability of your projects.


Python's requests is a well-done library, and if you're using Python and need to make an HTTP request it's the way to go. If you're using requests and would like to explore an asynchronous model, requests-futures (note: I'm the author) makes the process as straightforward as possible.

The idea for requests-futures came to me while sitting in a talk at PyCon 2013 that covered concurrent.futures. I started on it then and there and within a few hours of hacking had pushed it up to GitHub as a public repo. Not a whole lot has changed since that first push: a few bug fixes and documentation improvements. It's so simple thanks to concurrent.futures that it "just works."

It's probably time for an example. We'll take the following set of serial requests and convert them to happen in parallel.

from requests import Session

session = Session()
# first request starts and blocks until finished
response_one = session.get('http://httpbin.org/get')
# second request starts once first is finished
response_two = session.get('http://httpbin.org/get?foo=bar')
# both requests are complete
print('response one status: {0}'.format(response_one.status_code))
print('response two status: {0}'.format(response_two.status_code))

To make the same two requests in parallel we'll switch to a FuturesSession. Instead of returning a response object, the get call will return a Future object. The result method can be called on that Future to retrieve the response, blocking until it's available. That's it: the only noticeable API difference is that a future is returned in place of the response.

from requests_futures.sessions import FuturesSession

session = FuturesSession()
# first request is started in the background
future_one = session.get('http://httpbin.org/get')
# second request is started immediately as well
future_two = session.get('http://httpbin.org/get?foo=bar')
# wait for the first request to complete, if it hasn't already
response_one = future_one.result()
print('response one status: {0}'.format(response_one.status_code))
# wait for the second request to complete, if it hasn't already
response_two = future_two.result()
print('response two status: {0}'.format(response_two.status_code))
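
For a sense of why concurrent.futures keeps this so simple, here is a rough sketch of the idea: take a blocking get and hand it to a thread pool, returning the Future instead of the response. This is an illustration of the pattern, not the actual requests-futures source. BlockingClient and BackgroundClient are hypothetical names, and time.sleep stands in for the network so the example runs without making any real requests.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class BlockingClient:
    """Hypothetical stand-in for a blocking HTTP session."""

    def get(self, url):
        time.sleep(0.2)  # pretend the round trip takes 200ms
        return 'response for {0}'.format(url)


class BackgroundClient(BlockingClient):
    """Same get() call, but it returns a Future instead of a response."""

    def __init__(self, max_workers=2):
        self.executor = ThreadPoolExecutor(max_workers=max_workers)

    def get(self, url):
        # Hand the blocking call to a worker thread and return immediately.
        return self.executor.submit(super().get, url)


client = BackgroundClient()
start = time.time()
future_one = client.get('http://httpbin.org/get')
future_two = client.get('http://httpbin.org/get?foo=bar')
# Both simulated requests are already running; now collect the results.
results = [future_one.result(), future_two.result()]
elapsed = time.time() - start
client.executor.shutdown()

print(results)
print('both finished in {0:.2f}s'.format(elapsed))
```

With two worker threads the simulated calls overlap, so the total wait should be close to one round trip rather than two.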

Applying it to the Web

So that's pretty simple, but how would we go about applying it in a real-world context, preferably one involved in rendering a web page? The following example shows how this can be applied to a Django request.


from django.shortcuts import render_to_response
from requests_futures.sessions import FuturesSession


def simple_view(request):
    # NOTE: I often use middleware to install a persistent session on to
    # the request object, but that's been omitted here
    session = FuturesSession()

    # these requests are being made to grab data that we'll pass in as
    # context to the template, since httpbin.org is over the internet, the
    # requests can take a non-trivial amount of time, good thing they're
    # both happening in parallel rather than one and then the other
    future_one = session.get('http://httpbin.org/get')
    future_two = session.get('http://httpbin.org/get?foo=bar')
    # we could have any number of requests happening here

    # wait for each response, if it hasn't already arrived
    response_one = future_one.result()
    response_two = future_two.result()

    data = {'response_one': response_one.content,
            'response_two': response_two.content}
    # render_to_response was removed in Django 3.0; on newer versions use
    # django.shortcuts.render(request, 'app/simple_view.html', data)
    return render_to_response('app/simple_view.html', data)
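
A view like this is only as fast and as healthy as the calls it waits on, so it helps to know that result() accepts a timeout and re-raises any exception from the background call. Below is a minimal sketch of that behavior using only concurrent.futures. The names slow_service, broken_service, and result_or_default are hypothetical helpers invented for this example, and the timings are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def slow_service():
    """Hypothetical call that takes longer than we are willing to wait."""
    time.sleep(0.5)
    return 'slow data'


def broken_service():
    """Hypothetical call that fails outright."""
    raise ValueError('service returned garbage')


def result_or_default(future, timeout, default):
    """Wait up to `timeout` seconds for the future, else use `default`."""
    try:
        return future.result(timeout=timeout)
    except TimeoutError:
        return default  # still running; carry on without it
    except Exception:
        return default  # the call raised; result() re-raises it here


with ThreadPoolExecutor(max_workers=2) as executor:
    slow = executor.submit(slow_service)
    broken = executor.submit(broken_service)
    data = {
        'slow': result_or_default(slow, 0.1, 'n/a'),
        'broken': result_or_default(broken, 0.1, 'n/a'),
    }
    # Leaving the with block still waits for slow_service to finish.

print(data)
```

Whether a fallback value is acceptable depends on the page; for some sections it's better to let the exception propagate and fail loudly.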


While simple to use, futures are powerful and the above examples just scratch the surface. You can see a few more examples in my post from a few months ago about statsd. There's a lot more that we could go into on the subject, including how to handle dependencies between requests, tracking performance/request-blocking, and general tuning of complex request flows. We'll leave things here for now, but feel free to tell me what you'd like to hear more about in the comments below.

About the Author

Ross McFarland

Ross is a 17-year veteran of the software industry with experience spanning low-level signal processing, web and mobile user interfaces, high-scale distributed web services, infrastructure, and networking. He has made extensive contributions to open source, highlighted by his time as a primary maintainer of Gtk2-Perl and as author of the requests-futures and python-asynchttp libraries.