Tags: amazon-web-services, aws-lambda, amazon-kinesis, aws-serverless, data-stream

What are the use cases of Kinesis Data Streams and Firehose?


I am working on an application that will read and analyze the logs of payment transactions. Based on my requirements I will use Kinesis Analytics, which can take its input from either Kinesis Data Streams or Firehose, but I am having trouble deciding which of the two input methods I should use for my system. My requirements are:

  1. It can tolerate latency, but it must not lose data.
  2. It must record all errors in DynamoDB or an S3 bucket.

Which input stream is suitable for my use case?


Solution

  • Data Streams vs Firehose

    1. Streams: Kinesis Data Streams is highly customizable and best suited to developers building custom applications or streaming data for specialized needs.
      • You write custom producer and consumer code
      • Real time (about 200 ms latency for classic consumers, about 70 ms with enhanced fan-out)
      • You must manage scaling yourself (shard splitting/merging)
      • Data retention of 1 day by default, extendable up to 7 days (newer AWS releases allow longer retention); supports replay and multiple consumers
      • Can be used with Lambda to insert data into Elasticsearch in real time
    2. Firehose: Kinesis Data Firehose loads streaming data directly into AWS services and supported third-party destinations for storage and processing.
      • Fully managed; delivers to S3, Splunk, Redshift, and Elasticsearch
      • Serverless data transformations with Lambda
      • Near real time (data is buffered before delivery; the lowest buffer time is 1 minute)
      • Automatic scaling
      • No data storage, so no replay
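
To make the "custom code" versus "managed transformation" contrast more concrete, here is a minimal sketch of what the Lambda side might look like for each option. The event shapes follow the formats Lambda receives from a Kinesis data stream (`Records[].kinesis.data`) and from a Firehose transformation (`records[].data`, with a `result` returned per `recordId`). The payload fields `status`, `FAILED`, and `is_error` are assumptions made up for illustration and are not part of any AWS API, and writing the collected errors to S3 or DynamoDB is left out.

```python
import base64
import json


def streams_consumer_handler(event, context=None):
    """Lambda consumer for a Kinesis data stream (the custom-code path).

    Returns the decoded payloads plus the records that failed to parse,
    so the caller can persist the failures (for example to S3 or DynamoDB).
    """
    parsed, errors = [], []
    for record in event["Records"]:
        # Kinesis delivers each payload base64-encoded.
        raw = base64.b64decode(record["kinesis"]["data"])
        try:
            parsed.append(json.loads(raw))
        except ValueError as exc:  # covers invalid JSON and invalid UTF-8
            errors.append({
                "sequenceNumber": record["kinesis"]["sequenceNumber"],
                "error": str(exc),
            })
    return {"parsed": parsed, "errors": errors}


def firehose_transform_handler(event, context=None):
    """Lambda transformation for a Firehose delivery stream (the managed path).

    Every incoming recordId is returned with a result of
    "Ok", "Dropped", or "ProcessingFailed".
    """
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            if not isinstance(payload, dict):
                raise ValueError("expected a JSON object")
            # Hypothetical field names: flag failed transactions for later analysis.
            payload["is_error"] = payload.get("status") == "FAILED"
            line = (json.dumps(payload) + "\n").encode("utf-8")
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(line).decode("ascii"),
            })
        except ValueError:
            # Hand the original bytes back and mark the record as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

With Data Streams the handler owns error handling and delivery itself, whereas with Firehose the handler only reshapes records and reports a per-record result, and Firehose takes care of buffering and delivery to the destination.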