Running Multiple Celery Beat Instances in One Python Project

Feb. 1, 2021

In the Python world, Celery is a popular tool for running background tasks. It includes Celery beat, a scheduler that runs periodic tasks. At my current job I took part in developing a system for running multiple periodic tasks. Since our technology stack is based on Python, Celery beat was a natural choice. One of the requirements was granular control over individual tasks, that is, the ability to start, stop, and restart each task individually. The straightforward solution was to run a separate Celery beat/worker pair for each task, but after some googling it seemed that running multiple Celery beat instances was impossible. At least everybody said so; see, for example, this Stack Overflow discussion. If you simply start several beat instances against the same broker, you get duplicate tasks in your workers, because by default every beat publishes to the same queue and every worker consumes from it.

I wrote "seemed" because it is, in fact, quite possible, and the solution is fairly simple. The key is to give each Celery beat instance its own queue so that different beat/worker pairs do not interfere with each other. Below is a deliberately simplified Python example that runs two periodic tasks with different schedules using two Celery beat/worker pairs:

import os

from celery import Celery

TASK = os.getenv('TASK')
if not TASK:
    raise RuntimeError('Task name is not defined')


BEAT_CONFIG = {
    'first_task_group': {
        'run-every-5-seconds': {
            'task': 'celery_tasks.first_task',
            'schedule': 5.0,
            'args': (),
            'options': {'queue': 'first_task_queue'}
        },
    },
    'second_task_group': {
        'run-every-10-seconds': {
            'task': 'celery_tasks.second_task',
            'schedule': 10.0,
            'args': (),
            'options': {'queue': 'second_task_queue'}
        },
    },
}


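app = Celery('tasks', broker='pyamqp://guest@localhost//')
# Load only the schedule selected by TASK, so the beat in this process
# publishes only to that task's own queue.
app.conf.beat_schedule = BEAT_CONFIG[f'{TASK}_group']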


@app.task
def first_task():
    print('I am the first task')


@app.task
def second_task():
    print('I am the second task')
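
One rough edge in the module above is that an unknown TASK value fails with a bare KeyError at the BEAT_CONFIG lookup. A minimal sketch of a more explicit lookup follows. The helper name select_schedule and the trimmed SAMPLE_CONFIG are hypothetical and exist only for illustration; they are not part of the original module.

```python
SUFFIX = '_group'


def select_schedule(task, beat_config):
    """Return the beat schedule for `task`, or raise a readable error.

    Hypothetical helper: it could stand in for the direct
    BEAT_CONFIG[f'{TASK}_group'] lookup.
    """
    key = f'{task}{SUFFIX}'
    if key not in beat_config:
        # Strip the '_group' suffix so the message lists valid TASK values.
        known = ', '.join(sorted(k[:-len(SUFFIX)] for k in beat_config))
        raise RuntimeError(f'Unknown task {task!r}; expected one of: {known}')
    return beat_config[key]


# Trimmed-down stand-in for BEAT_CONFIG, used only for this illustration.
SAMPLE_CONFIG = {
    'first_task_group': {'run-every-5-seconds': {'schedule': 5.0}},
    'second_task_group': {'run-every-10-seconds': {'schedule': 10.0}},
}

if __name__ == '__main__':
    print(select_schedule('first_task', SAMPLE_CONFIG))
```

With something like this in place, the schedule assignment would become app.conf.beat_schedule = select_schedule(TASK, BEAT_CONFIG), and a mistyped TASK would be reported together with the list of valid names.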

Let's assume that our module with tasks is called celery_tasks.py. Then you can use the following commands to run a separate Celery beat instance for each task:

TASK=first_task celery -A celery_tasks worker -l INFO -Q first_task_queue -B -s first-task.schedule
TASK=second_task celery -A celery_tasks worker -l INFO -Q second_task_queue -B -s second-task.schedule

Execute these commands in two terminals and you'll see that the two Celery beat instances run independently of each other without duplicating tasks in either worker. Note that the official Celery documentation does not recommend running beat embedded in a worker with the -B option in production, but it's good enough for demonstration purposes. In a real project you'd want to use something like a Supervisor process group to run a beat and its corresponding worker as separate processes while still controlling them as a single entity, so that you can start, stop, and restart each periodic task individually.
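
To make the "separate processes" idea more concrete, here is a small sketch that derives the two command lines for one periodic task: one for a standalone beat and one for its worker. The helper name build_commands is hypothetical, and the queue and schedule file names simply follow the naming used in the commands above. Both processes still need the TASK environment variable, because the module refuses to load without it.

```python
import shlex


def build_commands(task, app_module='celery_tasks'):
    """Return (beat_cmd, worker_cmd) argument lists for one periodic task.

    Hypothetical helper: assumes the '<task>_queue' and '<task>.schedule'
    naming conventions used earlier in this post.
    """
    schedule_file = f"{task.replace('_', '-')}.schedule"
    # Standalone beat: only schedules tasks, executes nothing itself.
    beat_cmd = ['celery', '-A', app_module, 'beat',
                '-l', 'INFO', '-s', schedule_file]
    # Worker: consumes only from this task's dedicated queue.
    worker_cmd = ['celery', '-A', app_module, 'worker',
                  '-l', 'INFO', '-Q', f'{task}_queue']
    return beat_cmd, worker_cmd


if __name__ == '__main__':
    for task in ('first_task', 'second_task'):
        for cmd in build_commands(task):
            # Prefix with TASK=... so each process loads the right schedule.
            print(f'TASK={task} {shlex.join(cmd)}')
```

Each printed line could serve as the command for one program in a process manager, with the beat and worker of the same task grouped together so they can be started and stopped as a unit.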

I hope you'll find this example useful for your Python project based on Celery/Celery beat.
