Task Queues

Task queues manage background work that must be executed outside the usual HTTP request-response cycle.

Why are task queues necessary?

Tasks are handled asynchronously either because they are not initiated by an HTTP request or because they are long-running jobs that would dramatically reduce the performance of an HTTP response.
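As a rough illustration of the idea, the sketch below uses only Python's standard library rather than a real task queue such as Celery or RQ. The names `handle_request`, `send_report` and `task_queue` are hypothetical. The request handler enqueues the slow job and returns right away, while a worker thread runs the job in the background.

```python
import queue
import threading
import time

task_queue = queue.Queue()
results = []


def send_report(user_id):
    """Stand-in for a slow job, such as generating a report."""
    time.sleep(0.1)  # simulate slow work
    results.append(f"report sent to user {user_id}")


def worker():
    """Pull tasks off the queue and run them until a None sentinel arrives."""
    while True:
        task = task_queue.get()
        if task is None:
            task_queue.task_done()
            break
        func, args = task
        try:
            func(*args)
        finally:
            task_queue.task_done()


def handle_request(user_id):
    """Hypothetical HTTP handler: enqueue the slow job and respond immediately."""
    task_queue.put((send_report, (user_id,)))
    return "202 Accepted"


worker_thread = threading.Thread(target=worker, daemon=True)
worker_thread.start()

response = handle_request(42)  # returns without waiting for send_report
task_queue.join()              # only for this demo: wait for the background work
task_queue.put(None)           # tell the worker to stop
worker_thread.join()
```

A production task queue adds what this sketch leaves out: a separate worker process, a message broker between the web application and the workers, retries and result storage.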

For example, a web application could poll the GitHub API every 10 minutes to collect the names of the top 100 starred repositories. A task queue would handle invoking code to call the GitHub API, process the results and store them in a persistent database for later use.

Another example is when a database query would take too long during the HTTP request-response cycle. The query could be performed in the background on a fixed interval with the results stored in the database. When an HTTP request comes in that needs those results a query would simply fetch the precalculated result instead of re-executing the longer query. This precalculation scenario is a form of caching enabled by task queues.
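The precalculation pattern might look something like the following sketch, which uses an in-memory SQLite database. The `orders` and `precomputed` tables and both function names are assumptions made up for this example. In a real deployment a task queue would call `refresh_total_revenue` on a fixed interval and the web application would only ever call `get_total_revenue`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.execute("CREATE TABLE precomputed (name TEXT PRIMARY KEY, value REAL)")
conn.executemany(
    "INSERT INTO orders (amount) VALUES (?)", [(10.0,), (15.5,), (4.5,)]
)


def refresh_total_revenue(db):
    """Background job: run the expensive query and store its result."""
    (total,) = db.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM orders"
    ).fetchone()
    db.execute(
        "INSERT OR REPLACE INTO precomputed (name, value) VALUES (?, ?)",
        ("total_revenue", total),
    )
    db.commit()


def get_total_revenue(db):
    """Request-time lookup: read the stored value instead of re-running the query."""
    row = db.execute(
        "SELECT value FROM precomputed WHERE name = ?", ("total_revenue",)
    ).fetchone()
    return row[0] if row else None


refresh_total_revenue(conn)      # a task queue would run this on a schedule
total = get_total_revenue(conn)  # what the HTTP request handler would call
```

The trade-off is staleness: the stored value is only as fresh as the last background run, so the interval should match how out of date the result is allowed to be.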

Other types of jobs for task queues include:

spreading out large numbers of independent database inserts over time instead of inserting everything at once
aggregating collected data values on a fixed interval, such as every 15 minutes
scheduling periodic jobs such as batch processes
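The first item in the list above, spreading inserts over time, could be sketched as follows. The `events` table, the batch size and the pause length are all assumptions chosen for illustration. With a real task queue each batch would more likely be enqueued as its own task instead of being processed in one loop.

```python
import sqlite3
import time


def chunked(rows, size):
    """Yield successive slices of rows, each with at most size items."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def insert_in_batches(db, rows, batch_size=100, pause_seconds=0.0):
    """Insert rows one batch at a time, pausing between batches."""
    batches = 0
    for batch in chunked(rows, batch_size):
        db.executemany("INSERT INTO events (payload) VALUES (?)", batch)
        db.commit()
        batches += 1
        time.sleep(pause_seconds)  # give other database clients room to work
    return batches


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")

rows = [(f"event-{i}",) for i in range(250)]
batch_count = insert_in_batches(conn, rows, batch_size=100, pause_seconds=0.01)
(row_count,) = conn.execute("SELECT COUNT(*) FROM events").fetchone()
```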

Task queue projects

The de facto standard Python task queue is Celery. The other task queue projects that arise tend to come from the perspective that Celery is overly complicated for simple use cases. My recommendation is to put the effort into Celery's reasonable learning curve as it is worth the time it takes to understand how to use the project.

The Celery distributed task queue is the most commonly used Python library for handling asynchronous tasks and scheduling.
The RQ (Redis Queue) is a simple Python library for queueing jobs and processing them in the background with workers. RQ is backed by Redis and is designed to have a low barrier to entry.
Taskmaster is a lightweight, simple distributed queue for handling large volumes of one-off tasks.
Huey is a Redis-based task queue that aims to provide a simple, yet flexible framework for executing tasks. Huey supports task scheduling, crontab-like repeating tasks, result storage and automatic retry in the event of failure.
Kuyruk is a simple and easy-to-use task queue system built on top of RabbitMQ. Although its feature set is small, new features can be added through extensions.
Dramatiq is a fast and reliable alternative to Celery. It supports RabbitMQ and Redis as message brokers.
django-carrot is a simple task queue specifically for Django that can serve when Celery is overkill.
tasq is a brokerless task queue for simple use cases. It is not recommended for production unless further testing and development is done.

Hosted message and task queue services

Third-party task queue services aim to solve the complexity issues that arise when scaling out a large deployment of distributed task queues.

Iron.io is a distributed messaging service platform that works with many types of task queues such as Celery. It also is built to work with other IaaS and PaaS environments such as Amazon Web Services and Heroku.
Amazon Simple Queue Service (SQS) is a set of five APIs for creating, sending, receiving, modifying and deleting messages.
CloudAMQP is at its core managed servers with RabbitMQ installed and configured. This service is an option if you are using RabbitMQ and do not want to maintain RabbitMQ installations on your own servers.

Open source examples that use task queues

flask-celery-example is a simple Flask application with Celery as a task queue and Redis as the broker.
django_dramatiq_example and flask_dramatiq_example are simple apps that demo how you can use Dramatiq with Django and Flask, respectively.

Task queue resources

International Space Station notifications with Python and Redis Queue (RQ) shows how to combine the RQ task queue library with Flask to send text message notifications every time a condition is met - in this blog post's case that the ISS is currently flying over your location on Earth.
Evaluating persistent, replicated message queues is a detailed comparison of Amazon SQS, MongoDB, RabbitMQ, HornetQ and Kafka's designs and performance.
Why Task Queues is a presentation on what task queues are and why they are needed.
Asynchronous Processing in Web Applications Part One and Part Two are great reads for understanding what a task queue is and why you shouldn't use your database as one.
Flask by Example Implementing a Redis Task Queue provides a detailed walkthrough of setting up workers to use RQ with Redis.
Heroku has a clear walkthrough for using RQ for background tasks.
How to use Celery with RabbitMQ is a detailed walkthrough for using these tools on an Ubuntu VPS.
Celery - Best Practices explains things you should not do with Celery and shows some underused features for making task queues easier to work with.
Celery in Production on the Caktus Group blog contains good practices from their experience using Celery with RabbitMQ, monitoring tools and other aspects not often discussed in existing documentation.
A 4 Minute Intro to Celery is a short introductory task queue screencast.
This Celery tasks checklist has some nice tips and resources for using Celery in your applications.
Heroku wrote about how to secure Celery when tasks are otherwise sent over unencrypted networks.
Miguel Grinberg wrote a nice post on using the task queue Celery with Flask. He gives an overview of Celery followed by specific code to set up the task queue and integrate it with Flask.
Ditching the Task Queue for Gevent explains how in some cases you can replace the complexity of a task queue with concurrency. For example, you can remove Celery in favor of gevent.
3 Gotchas for Working with Celery are things to keep in mind when you're new to the Celery task queue implementation.
Setting up an asynchronous task queue for Django using Celery and Redis is a straightforward tutorial for setting up the Celery task queue for Django web applications using the Redis broker on the back end.
Asynchronous Tasks with Flask and Redis Queue looks at how to configure Redis Queue to handle long-running tasks in a Flask app.
Developing an Asynchronous Task Queue in Python looks at how to implement several asynchronous task queues using Python's multiprocessing library and Redis.

Task queue learning checklist

Pick a slow function in your project that is called during an HTTP request.
Determine if you can precompute the results on a fixed interval instead of during the HTTP request. If so, create a separate function you can call from elsewhere then store the precomputed value in the database.
Read the Celery documentation and the links in the resources section above to understand how the project works.
Install a message broker such as RabbitMQ or Redis and then add Celery to your project. Configure Celery to work with the installed message broker.
Use Celery to invoke the function from step one on a regular basis.
Have the HTTP request function use the precomputed value instead of the slow running code it originally relied upon.
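The checklist above can be sketched end to end with the standard library alone. This is a stand-in for the pattern, not Celery's API: `run_periodically` plays the role that a periodic Celery task would fill, and every name here (`slow_function`, `precompute`, `handle_request`, `cache`) is hypothetical.

```python
import threading
import time

cache = {}
cache_lock = threading.Lock()


def slow_function():
    """Step one: the slow code that used to run during the HTTP request."""
    time.sleep(0.05)  # simulate expensive work
    return sum(i * i for i in range(1000))


def precompute():
    """Step two: a separate function that stores the precomputed value."""
    value = slow_function()
    with cache_lock:
        cache["slow_result"] = value


def run_periodically(func, interval_seconds, stop_event):
    """Call func, then wait interval_seconds, until stop_event is set."""
    while not stop_event.is_set():
        func()
        stop_event.wait(interval_seconds)


def handle_request():
    """Final step: the request handler reads the stored value."""
    with cache_lock:
        return cache.get("slow_result")


stop = threading.Event()
scheduler = threading.Thread(
    target=run_periodically, args=(precompute, 0.1, stop), daemon=True
)
scheduler.start()

time.sleep(0.3)           # let the background job run at least once
value = handle_request()  # fast: no slow work happens here
stop.set()
scheduler.join()
```

With Celery the cache would typically live in the database or another shared store, because the worker and the web application run in separate processes.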

What's next to learn after task queues?

How do I log errors that occur in my application?

I want to learn more about app users via web analytics.

What tools exist for monitoring a deployed web app?


Full Stack Python

Full Stack Python is an open book that explains concepts in plain language and provides helpful resources for those topics.