Because of the memory usage of the RabbitMQ version we were running, we upgraded our cluster to the latest version. Memory usage patterns stabilised, but the upgrade did not come without problems.
One of the queues that we place tasks on was not being consumed at all. Trying different versions of the client libraries and workers did not help either: the tasks stayed in the queue, which kept growing without any work being picked up.
We decided to roll back to an older version of RabbitMQ. The rollback itself was not without problems, but thanks to our internal documentation, recovery was quick. During this downgrade, the following systems were affected:
We are still investigating why that particular queue kept growing without being consumed, and we will share any further findings with you in this space.
Thank you for your support of Transifex!