CELERY ‐ DISTRIBUTED TASK QUEUE
- Celery is a simple, flexible, and reliable distributed system to process vast amounts of messages, while providing operations with tools required to maintain such a system.
- It's a distributed task queue system that allows you to execute tasks asynchronously across multiple workers. It's particularly useful for handling time-consuming operations without blocking your web application.
- It's a task queue with focus on real-time processing, while supporting task scheduling.
WHY USE CELERY?
- Asynchronous Processing: Handle long-running tasks without blocking HTTP requests.
- Scalability: Work can be distributed across multiple machines.
- Reliability: Celery has built-in retry and error handling mechanisms.
- Flexibility: Celery can support various message brokers and result backends.
Architecture Overview
Web Application → Message Broker (RabbitMQ) → Celery Workers → Result Backend
- A celery system can consist of multiple workers and brokers, giving way to high availability and horizontal scaling.
- Celery is written in Python, but the protocol can be implemented in any language.
- Language interoperability can also be achieved by exposing an HTTP endpoint and having a task that requests it (webhooks)
- Celery requires a message transport to send and receive messages. The RabbitMQ and Redis broker transports are feature complete.
- Celery can run on a single machine, on multiple machines, or even across data centers.
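A minimal application sketch of this architecture, assuming a local RabbitMQ broker and a Redis result backend (the project name, URLs, and ports are placeholders):
# celery_app.py - wiring the components from the diagram above (sketch)
from celery import Celery

app = Celery(
    'myproject',                                   # placeholder project name
    broker='amqp://guest:guest@localhost:5672//',  # message broker (RabbitMQ)
    backend='redis://localhost:6379/0',            # result backend (Redis)
)

@app.task
def add(x, y):
    return x + y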
Common Use Cases
- Email Processing: Sending bulk emails without blocking web requests
- Image Processing: Resizing, thumbnails, format conversion
- Data Analysis: Long-running calculations and reports
- API Integrations: Third-party API calls with retry logic
- Periodic Tasks: Scheduled maintenance, backups, cleanup
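For the periodic-task use case, a schedule can be declared for Celery beat. A minimal sketch using the Django-style settings shown later on this page (the cleanup task path is hypothetical):
# settings.py - periodic task schedule for Celery beat (sketch)
from celery.schedules import crontab

CELERY_BEAT_SCHEDULE = {
    'nightly-cleanup': {
        'task': 'myapp.tasks.cleanup_expired_sessions',  # hypothetical task
        'schedule': crontab(hour=3, minute=0),           # every day at 03:00
    },
}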
Installing Celery
Celery is on the Python Package Index (PyPI), so it can be installed with standard Python tools like pip: pip install celery
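Broker-specific dependencies can be installed as bundles, for example pip install "celery[redis]" to pull in the Redis transport.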
WHAT IS A TASK QUEUE?
- A task queue is a data structure that holds tasks waiting to be processed, combined with a mechanism to distribute these tasks to available workers for execution.
- It is the central coordination system that enables asynchronous, distributed task processing.
Key Components
- Task
- A unit of work to be executed, defined as a Python function with the @shared_task decorator, containing the actual code logic and parameters.
- Queue
- This is a named buffer that holds tasks waiting for execution. Tasks are added to the queue and removed by workers following the First In First Out (FIFO) principle by default.
- Message Broker
- The middleware that manages queues and message delivery, e.g. RabbitMQ, Redis, or Amazon SQS.
- Handles task persistence, routing, and delivery guarantees.
- Worker
- A process that consumes tasks from queues and executes the task code.
- It can run on the same machine as the application or be distributed across multiple machines.
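- Workers are started from the command line, e.g. celery -A myproject worker --loglevel=info (where myproject is a placeholder for the module that defines your Celery app).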
Working Example
# Example of task queue flow
from celery import shared_task

@shared_task
def process_email(email_data):
    # This function becomes a task; send_email stands in for your own email helper
    send_email(email_data)
    return "Email sent"

# When called with .delay(), the task message goes to the queue
result = process_email.delay({"to": "user@example.com", "subject": "Hello"})

Flow
- Task is created and serialized
- Message broker receives the task and places it in an appropriate queue
- Available worker picks up the task from the queue
- Worker deserializes and executes the task
- Result is optionally stored in the result backend
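As a sketch of that last step, the result object returned by process_email.delay() in the working example above can be inspected once a result backend is configured:
# Inspecting the AsyncResult returned by process_email.delay() above (sketch);
# this assumes a result backend has been configured.
print(result.id)               # unique task id assigned when the message was queued
print(result.ready())          # False until a worker has finished the task
print(result.get(timeout=10))  # blocks until the return value ("Email sent") arrives
Tasks can also be sent with apply_async(), which accepts extra options such as an explicit queue name used by the routing examples below.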
Types of Queues in Celery
#1. Default Queue: All tasks go here unless specified otherwise
CELERY_TASK_DEFAULT_QUEUE = 'default'

#2. Named Queues: Route specific tasks to specific queues
CELERY_TASK_ROUTES = {
    'myapp.tasks.send_email': {'queue': 'email'},
    'myapp.tasks.process_image': {'queue': 'media'},
    'myapp.tasks.heavy_computation': {'queue': 'compute'},
}
#3. Priority Queues: Tasks with different priorities
CELERY_TASK_ROUTES = {
    'myapp.tasks.urgent_task': {'queue': 'high_priority', 'priority': 9},
    'myapp.tasks.normal_task': {'queue': 'default', 'priority': 5},
    'myapp.tasks.batch_task': {'queue': 'low_priority', 'priority': 1},
}

MESSAGE BROKER
- A message broker is the middleware that facilitates communication between your application (producer) and Celery workers (consumers).
- It is responsible for receiving, storing, routing, and delivering messages (tasks) reliably.
- It acts as an intermediary that:
- Receives messages from producers (your Django app).
- Stores messages in queues until workers are ready.
- Routes messages to appropriate queues based on routing rules.
- Delivers messages to consumers (Celery workers).
- Ensures reliability through persistence and acknowledgements.
Supported Brokers in Celery
- RabbitMQ
- A robust, feature-rich message broker built on AMQP (Advanced Message Queuing Protocol).
- Choose RabbitMQ When:
- Building production applications
- Need guaranteed message delivery
- Require advanced routing features
- Have complex queue topologies
- Need built-in monitoring and management
- Reliability is critical
Installation
# Ubuntu/Debian
sudo apt-get install rabbitmq-server
# macOS
brew install rabbitmq
# Python client
pip install celery[librabbitmq]

Strengths:
- AMQP Protocol: Industry standard with rich feature set
- Reliability: Excellent message durability and delivery guarantees
- Routing: Advanced routing capabilities (topic, direct, fanout, headers)
- Management: Built-in web UI for monitoring and management
- Clustering: Built-in clustering for high availability
- Plugin System: Extensive plugin ecosystem
Weaknesses:
- Resource Usage: Higher memory consumption
- Complexity: More complex setup and configuration
- Learning Curve: Steeper learning curve for advanced features
- Redis
- An in-memory data structure store that can act as a message broker.
- Redis is also feature-complete, but is more susceptible to data loss in the event of abrupt termination or power failures.
Strengths:
- Performance: Extremely fast, in-memory operations
- Simplicity: Easy to set up and configure
- Versatility: Can serve as both broker and result backend
- Memory Efficiency: Lower memory footprint than RabbitMQ
- Clustering: Redis Cluster for horizontal scaling
Weaknesses:
- Persistence: Less durable than disk-based brokers
- Memory Limits: Limited by available RAM
- Single Point of Failure: Without clustering/replication
Best For:
- Development environments
- High-performance applications
- Simple use cases
- Applications with limited infrastructure
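As noted under its strengths, Redis can serve as both the broker and the result backend. A minimal sketch using the Django-style settings from earlier (host, port, and database numbers are placeholders):
# settings.py - Redis as both broker and result backend (sketch)
CELERY_BROKER_URL = 'redis://localhost:6379/0'      # broker: where task messages go
CELERY_RESULT_BACKEND = 'redis://localhost:6379/1'  # backend: where results are stored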