I once had a very similar problem while using Kafka for the database at work.
Of course, I had to try quite a lot of different things before I even found the problem. It turned out to be our servers, which just weren't up to the task of handling the message logs. To fix this, I decided to move our whole system to the cloud.
At first this seemed like too much work, but thanks to a clever piece of software I managed to do it just in time for the annual review. So I really recommend using a system like this to save yourself time: https://aiven.io/kafka
Hope this helps…