Tags: apache-kafka, aws-lambda, amazon-sqs

Can a Kafka consumer scale down to zero?


I am more familiar with SNS/SQS and Lambda than with Kafka.

Lambda scales down after some idle time when there are no requests. If a message is dropped onto SQS, it triggers the Lambda to "wake up". This way a messaging-driven, pay-per-use micro-service is achieved.

With Kafka, there are consumers & producers. The consumer is always listening, so it is always up & running, either as a container in Kubernetes or on other platforms like Fargate, ECS, App Runner, etc.

How to achieve the following scaling behavior with a Kafka consumer:

  • If there are no requests, the micro-service should scale down to zero
  • But if a message is dropped onto the Kafka topic during that period, it should wake the service up.

Do I need an always-running micro-service to listen to Kafka, or is there another way to achieve this?


Solution

  • Do I need an always-running micro-service to listen to Kafka, or is there another way to achieve this?

    Yes. Listening to Kafka is a "pull" mechanism: the only way to know that a new record is available in a topic is to keep at least one member of the consumer group (i.e. one instance of your application) up and running (see the minimal consumer sketch at the end of this answer). This indeed prevents scaling the consumer down to zero while preserving low-latency reaction to new events.

    Kafka is designed for a very large volume of events that can be consumed in parallel by one or several distributed applications => it's expected that the application listening to a topic is able to scale up to many instances of itself.

    If that's the case, then you can approach the outcome you're describing by hooking the scalability of your app to the inbound Kafka traffic (for example by watching consumer lag, as sketched below), letting it scale down to a single instance to which you allocate the least amount of resources possible. That one instance guarantees that you react to traffic as soon as it arrives, and if traffic increases the auto-scaler lets the app scale up again to many instances.

    Note that Kafka really shines at broadcasting a large quantity of events to the rest of the ecosystem, with long-term persistence, specific ordering guarantees and a large choice of data integrations. It's less appropriate for point-to-point message passing, especially if you need to scale to zero. AWS SQS might be more suitable in that case.
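
    To illustrate the pull model, here is a minimal sketch of such an always-running consumer-group member, using the confluent-kafka Python client. The broker address, group id and topic name are placeholders, not values from the question.

    ```python
    # Minimal consumer-group member: one always-running instance polling a topic.
    # "localhost:9092", "orders-service" and "orders" are placeholders.
    from confluent_kafka import Consumer

    consumer = Consumer({
        "bootstrap.servers": "localhost:9092",   # placeholder broker
        "group.id": "orders-service",            # the consumer group = your micro-service
        "auto.offset.reset": "earliest",
    })
    consumer.subscribe(["orders"])               # placeholder topic

    try:
        while True:
            msg = consumer.poll(timeout=1.0)     # pull: wait up to 1s for a new record
            if msg is None:
                continue                         # no traffic right now, keep polling
            if msg.error():
                print(f"Consumer error: {msg.error()}")
                continue
            print(f"Received: {msg.value().decode('utf-8')}")
    finally:
        consumer.close()                         # commit offsets and leave the group cleanly
    ```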
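
    And a hedged sketch of how that inbound traffic can be measured as consumer lag, which is the kind of metric an auto-scaler can key on; again the broker, group and topic names are placeholders.

    ```python
    # Sketch: compute total consumer lag (records produced but not yet consumed),
    # a signal an auto-scaler could use to add or remove instances.
    from confluent_kafka import Consumer

    consumer = Consumer({
        "bootstrap.servers": "localhost:9092",   # placeholder broker
        "group.id": "orders-service",            # placeholder group
    })
    consumer.subscribe(["orders"])               # placeholder topic
    consumer.poll(1.0)                           # join the group so partitions get assigned

    total_lag = 0
    for tp in consumer.assignment():
        committed = consumer.committed([tp], timeout=5.0)[0]
        _low, high = consumer.get_watermark_offsets(tp, timeout=5.0)
        if committed.offset >= 0:
            total_lag += max(high - committed.offset, 0)
        else:
            total_lag += high                    # nothing committed yet: whole partition is lag

    print(f"total consumer lag: {total_lag}")
    consumer.close()
    ```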