
CELERY ‐ DISTRIBUTED TASK QUEUE

  • Celery is a simple, flexible, and reliable distributed system to process vast amounts of messages, while providing operations with the tools required to maintain such a system.
  • It is a distributed task queue system that allows you to execute tasks asynchronously across multiple workers. It's particularly useful for handling time-consuming operations without blocking your web application.
  • It's a task queue with a focus on real-time processing, while also supporting task scheduling.

WHY USE CELERY?

  1. Asynchronous Processing: Handle long-running tasks without blocking HTTP requests.
  2. Scalability: Work can be distributed across multiple machines.
  3. Reliability: Celery has built-in retry and error handling mechanisms.
  4. Flexibility: Celery can support various message brokers and result backends.

Architecture Overview

Web Application → Message Broker (RabbitMQ) → Celery Workers → Result Backend
  • A Celery system can consist of multiple workers and brokers, giving way to high availability and horizontal scaling.
  • Celery is written in Python, but the protocol can be implemented in any language.
  • Language interoperability can also be achieved by exposing an HTTP endpoint and having a task that requests it (webhooks).
  • Celery requires a message transport to send and receive messages. The RabbitMQ and Redis broker transports are feature complete.
  • Celery can run on a single machine, on multiple machines, or even across data centers.
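
To make the overview above concrete, here is a minimal sketch that wires the pieces together. The module name, broker URL, and backend URL are illustrative assumptions (a local RabbitMQ broker and a local Redis result backend), not a required setup.

# Minimal architecture sketch: app (producer), broker, worker, result backend
from celery import Celery

app = Celery(
    'proj',                                        # assumed module/app name
    broker='amqp://guest:guest@localhost:5672//',  # message broker (RabbitMQ)
    backend='redis://localhost:6379/0',            # result backend (Redis)
)

@app.task
def add(x, y):
    return x + y

# The web application enqueues work with add.delay(2, 3); a worker process
# started separately picks it up, runs it, and writes the result to the backend.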

Common Use Cases

  • Email Processing: Sending bulk emails without blocking web requests
  • Image Processing: Resizing, thumbnails, format conversion
  • Data Analysis: Long-running calculations and reports
  • API Integrations: Third-party API calls with retry logic (see the retry sketch after this list)
  • Periodic Tasks: Scheduled maintenance, backups, cleanup
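
As a hedged example of the retry-logic use case above, the sketch below calls a third-party API and retries on transient failures; the task name, the url parameter, and the retry settings are illustrative assumptions.

from celery import shared_task
import requests

@shared_task(bind=True, max_retries=3, default_retry_delay=30)
def fetch_exchange_rates(self, url):
    # Call a third-party API and retry up to 3 times on transient errors,
    # waiting 30 seconds between attempts.
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        return response.json()
    except requests.RequestException as exc:
        raise self.retry(exc=exc)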

Installing Celery

Celery is on the Python Package Index (PyPI), so it can be installed with standard Python tools like pip:

pip install celery

WHAT IS A TASK QUEUE?

  • A task queue is a data structure that holds tasks waiting to be processed, combined with a mechanism to distribute these tasks to available workers for execution.
  • It is the central coordination system that enables asynchronous, distributed task processing.

Key Components

  1. Task
  • A unit of work to be executed, defined as a Python function with the @shared_task decorator containing the actual code logic and parameters.
  2. Queue
  • This is a named buffer that holds tasks waiting for execution. Tasks are added to the queue and removed by workers following the First In, First Out (FIFO) principle by default.
  3. Message Broker
  • The middleware that manages queues and message delivery, e.g. RabbitMQ, Redis, Amazon SQS.
  • Handles task persistence, routing, and delivery guarantees.
  4. Worker
  • A process that consumes tasks from queues and executes the task code (see the example worker command after this list).
  • It can run on the same machine or be distributed across multiple machines.
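
For reference, a worker is started from the command line; the project name proj and the queue names below are assumptions matching the other examples on this page.

# Start a worker that consumes the default queue
celery -A proj worker --loglevel=info

# Start a worker that only consumes the 'email' and 'media' queues
celery -A proj worker --loglevel=info -Q email,media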

Working Example

# Example of task queue flow
from celery import shared_task
from django.core.mail import send_mail  # assumes a Django project with email configured

@shared_task
def process_email(email_data):
    # This function becomes a task; here it sends a simple email via Django
    send_mail(
        subject=email_data["subject"],
        message=email_data.get("body", ""),
        from_email=None,  # falls back to DEFAULT_FROM_EMAIL
        recipient_list=[email_data["to"]],
    )
    return "Email sent"

# When called with .delay(), the task is serialized and placed on the queue
result = process_email.delay({"to": "user@example.com", "subject": "Hello"})

Flow

  • Task is created and serialized
  • Message broker receives the task and places it in an appropriate queue
  • Available worker picks up the task from the queue
  • Worker deserializes and executes the task
  • Result is optionally stored in the result backend
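
If a result backend is configured, the AsyncResult returned by .delay() can be inspected later. A small sketch using the result object from the working example above:

# result is the AsyncResult returned by process_email.delay(...)
print(result.id)       # unique task id
print(result.status)   # e.g. PENDING, STARTED, SUCCESS, FAILURE
if result.ready():     # True once the task has finished
    print(result.get(timeout=10))  # "Email sent" (raises if the task failed)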

Types of Queues in Celery

   # 1. Default Queue: all tasks go here unless specified otherwise
   CELERY_TASK_DEFAULT_QUEUE = 'default'

   # 2. Named Queues: route specific tasks to specific queues
   CELERY_TASK_ROUTES = {
       'myapp.tasks.send_email': {'queue': 'email'},
       'myapp.tasks.process_image': {'queue': 'media'},
       'myapp.tasks.heavy_computation': {'queue': 'compute'},
   }

   # 3. Priority Queues: tasks with different priorities
   CELERY_TASK_ROUTES = {
       'myapp.tasks.urgent_task': {'queue': 'high_priority', 'priority': 9},
       'myapp.tasks.normal_task': {'queue': 'default', 'priority': 5},
       'myapp.tasks.batch_task': {'queue': 'low_priority', 'priority': 1},
   }
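
Routing can also be decided per call with apply_async. The sketch below reuses the process_email task from the working example and the 'email' queue from the configuration above; note that message priority is broker-dependent.

# Override the route (and priority) at call time instead of in settings
process_email.apply_async(
    args=[{"to": "user@example.com", "subject": "Hello"}],
    queue='email',   # send this call to the named 'email' queue
    priority=9,      # broker-dependent message priority
)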

MESSAGE BROKER

  • A message broker is the middleware that facilitates communication between your application (producer) and Celery workers (consumers).
  • It is responsible for receiving, storing, routing, and delivering messages (tasks) reliably.
  • It acts as an intermediary that:
    • Receives messages from producers (your Django app).
    • Stores messages in queues until workers are ready.
    • Routes messages to appropriate queues based on routing rules.
    • Delivers messages to consumers (Celery workers).
    • Ensures reliability through persistence and acknowledgements.
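
The acknowledgement and persistence behaviour mentioned above is configurable. The sketch below shows a few standard Celery settings; the values and broker URL are illustrative assumptions, not recommended defaults.

from celery import Celery

app = Celery('proj', broker='amqp://guest:guest@localhost:5672//')

app.conf.task_acks_late = True               # acknowledge only after the task finishes
app.conf.worker_prefetch_multiplier = 1      # hand each worker one message at a time
app.conf.task_reject_on_worker_lost = True   # re-deliver tasks if a worker dies mid-run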

Supported Brokers in Celery

  1. RabbitMQ
  • A robust, feature-rich message broker built on AMQP (Advanced Message Queuing Protocol).
  • Choose RabbitMQ When:
    • Building production applications
    • Need guaranteed message delivery
    • Require advanced routing features
    • Have complex queue topologies
    • Need built-in monitoring and management
    • Reliability is critical

Installation

# Ubuntu/Debian
sudo apt-get install rabbitmq-server

# macOS
brew install rabbitmq

# Python client
pip install celery[librabbitmq]
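
After installation, pointing Celery at RabbitMQ is a one-line setting. The URL below (default guest credentials on localhost) is an illustrative assumption, shown in the Django-style CELERY_ naming used elsewhere on this page.

# settings.py (Django-style naming, as in the queue examples above)
CELERY_BROKER_URL = 'amqp://guest:guest@localhost:5672//'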

Strengths:

  • AMQP Protocol: Industry standard with rich feature set
  • Reliability: Excellent message durability and delivery guarantees
  • Routing: Advanced routing capabilities (topic, direct, fanout, headers)
  • Management: Built-in web UI for monitoring and management
  • Clustering: Built-in clustering for high availability
  • Plugin System: Extensive plugin ecosystem

Weaknesses:

  • Resource Usage: Higher memory consumption
  • Complexity: More complex setup and configuration
  • Learning Curve: Steeper learning curve for advanced features

  2. Redis
  • An in-memory data structure store that can act as a message broker.
  • Redis is also feature-complete, but is more susceptible to data loss in the event of abrupt termination or power failures.
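
Celery's Redis support can be installed with the celery[redis] extra (pip install celery[redis]). The configuration sketch below then uses a local Redis instance as both broker and result backend; the URLs and database numbers are illustrative assumptions.

# settings.py (Django-style naming, as in the queue examples above)
CELERY_BROKER_URL = 'redis://localhost:6379/0'
CELERY_RESULT_BACKEND = 'redis://localhost:6379/1'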

Strengths:

  • Performance: Extremely fast, in-memory operations
  • Simplicity: Easy to set up and configure
  • Versatility: Can serve as both broker and result backend
  • Memory Efficiency: Lower memory footprint than RabbitMQ
  • Clustering: Redis Cluster for horizontal scaling

Weaknesses:

  • Persistence: Less durable than disk-based brokers
  • Memory Limits: Limited by available RAM
  • Single Point of Failure: Without clustering/replication

Best For:

  • Development environments
  • High-performance applications
  • Simple use cases
  • Applications with limited infrastructure
