celeryev Queue in RabbitMQ Becomes Very Large

You can use the CELERY_EVENT_QUEUE_TTL celery option (only working with amqp), that will set the message expiry time, after which it will be deleted from the queue.


For anyone else who is running into problems with a celeryev queue becoming very large and threatening the disk space on your rabbitmq server, beware the accepted answer! Here's my suggestion. Just issue this command on your rabbitmq instance:

rabbitmqctl set_policy limit_celeryev_queues "^celeryev\." '{"max-length":1000000}' --apply-to queues

This will limit any queue beginning with "celeryev" to 1 Million entries. I did some experimenting with a stuck flower instance causing a runaway celeryev queue, and setting CELERY_EVENT_QUEUE_TTL / CELERY_EVENT_QUEUE_EXPIRES did not help control the queue size.

In my testing, I started a flower process, then SIGSTOP'ed it, and watched its celeryev queue start running away. Neither of these two settings helped at all. I confirmed SIGCONT'ing the flower process would bring the queue back to 0 rapidly. I am not certain why these two knobs didn't help, but it may have something to do with how RabbitMQ implements these two settings.

First, the Per-Message TTL corresponding to CELERY_EVENT_QUEUE_TTL only establishes an expiration time on each queue entry -- AIUI it will not automatically delete the message out of the queue to save space upon expiration. Second, the Queue TTL corresponding to CELERY_EVENT_QUEUE_EXPIRES says that it "... guarantees that the queue will be deleted, if unused for at least the expiration period". However, I believe that their definition of "unused" may be too strict to kick in for e.g. an overburdened, stuck, or killed flower process.

EDIT: Unfortunately, one problem with this suggestion is that the set_policy ... apply-to queues will only impact existing queues, and flower can and will create new queues which may overflow.