Search code examples
pythonapache-kafkafaust

Debugging Faust Stream Processing - Restart App from Beginning of Topic


I am debugging a simple application:

import faust

app = faust.App('app08')

# want to start from the beginning of the 
# topic every time the application restarts
@app.agent(topic) 
async def process(stream):
    async for event in stream:
        print(event)

And would like on restart of this application for the agent to read from the earliest offset. Right now, it's smart and knows the position of the last message read and starts from that position on restart. Despite scouring the docs for a while, I could not find an example of how to do this. The only way I know how to do this is to change the application name, for example: app08 to app09.


Solution

  • Keeping in mind that the offsets are controlled by the Kafka server using a consumer group with the same name as your faust app, I've been using the kafaka-consumer-groups CLI ( part of your kafka install ) to do this.

    kafka-consumer-groups --bootstrap-server kafka_bootstrap --reset-offsets --to-earliest --group faust_appname --execute --all-topics
    

    You can also replace --to-earliest with --to-datetime and provide a timestamp in the format 2020-09-20T00:00:00.00 if you have a relatively recent version of Kafka running.

    If you wanted to automate this, I'm there are Python APIs to automate control of the consumer group as well.