So I have an AWS Kinesis stream where I publish events for multiple consumers. It is important for most of them to receive hot data, which means that many of them will likely poll and read the latest records at the same time. According to the AWS documentation, increasing the number of shards increases the level of parallelism, while reads are limited to 5 per second per shard. My question is whether (and how) adding more shards would help in the situation where all my consumers are up to date and attempt to read new incoming data from the same shard. It seems to me that this reads-per-second limit effectively caps the number of consumers you can have (at least when they all need to be up to date at all times), or am I missing something?
Yes, you are right.
In the consumers, I assume you'll use the Amazon Kinesis Client Library (KCL, amazon-kinesis-client) as an API helper; note that the consumer configuration has a parameter called "idleTimeBetweenReadsInMillis". It defines how frequently your application polls the stream (the lower the value, the more frequently your app polls).
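For illustration, here is a minimal sketch of where that parameter is set, assuming the Java KCL 1.x API; the application, stream, and worker names below are placeholders:

    import com.amazonaws.auth.DefaultAWSCredentialsProviderChain;
    import com.amazonaws.services.kinesis.clientlibrary.lib.worker.KinesisClientLibConfiguration;

    public class ConsumerConfig {
        public static KinesisClientLibConfiguration build() {
            // "my-consumer-app", "my-stream" and "worker-1" are made-up names.
            return new KinesisClientLibConfiguration(
                    "my-consumer-app",                        // application name (also the lease table name)
                    "my-stream",                              // Kinesis stream to read from
                    new DefaultAWSCredentialsProviderChain(), // credentials
                    "worker-1")                               // worker id
                    // Poll each shard every 250 ms instead of the 1000 ms default.
                    // Lower values give fresher data but consume more of the shared
                    // 5-GetRecords-per-second-per-shard quota.
                    .withIdleTimeBetweenReadsInMillis(250L);
        }
    }

You would then pass this configuration, together with your IRecordProcessorFactory implementation, to a Worker to start consuming.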
Whether your stream contains 1 shard or 100 shards, you cannot make more than 5 GetRecords requests per second against any single shard, and that quota is shared by every consumer reading from that shard. That is, adding shards raises the stream's total read throughput, but it does not lift the per-shard cap, so it does not help when all of your up-to-date consumers are polling the newest records from the same shard.
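To make the arithmetic concrete: with 10 up-to-date consumers tailing the tip of one shard, each can average at most 5 / 10 = 0.5 GetRecords calls per second, i.e. one poll every 2 seconds, no matter how many other shards the stream has. As an illustration only (a sketch against the plain AWS SDK for Java v1, with made-up stream and shard names), this is roughly what a raw consumer loop hitting that limit looks like:

    import com.amazonaws.services.kinesis.AmazonKinesis;
    import com.amazonaws.services.kinesis.AmazonKinesisClientBuilder;
    import com.amazonaws.services.kinesis.model.*;

    public class RawShardTail {
        public static void main(String[] args) throws InterruptedException {
            AmazonKinesis kinesis = AmazonKinesisClientBuilder.defaultClient();

            // Start reading at the tip of one shard ("my-stream" / shard id are placeholders).
            String iterator = kinesis.getShardIterator(new GetShardIteratorRequest()
                    .withStreamName("my-stream")
                    .withShardId("shardId-000000000000")
                    .withShardIteratorType(ShardIteratorType.LATEST))
                    .getShardIterator();

            while (iterator != null) {
                try {
                    GetRecordsResult result = kinesis.getRecords(
                            new GetRecordsRequest().withShardIterator(iterator).withLimit(1000));
                    // process result.getRecords() ...
                    iterator = result.getNextShardIterator();
                } catch (ProvisionedThroughputExceededException e) {
                    // Thrown when the 5 GetRecords/sec/shard quota (shared by all
                    // consumers of this shard) is exceeded; back off and retry.
                }
                Thread.sleep(1000); // one poll per second keeps a single consumer under the quota
            }
        }
    }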
You can also stand up a Kafka cluster yourself and benchmark the two; Kafka may give you higher throughput.
See this answer for a comparison of Kafka and Kinesis concepts: Kafka like offset on Kinesis Stream?