amazon-s3 amazon-kinesis amazon-kinesis-firehose amazon-kinesis-agent

Problem writing data to S3 with Kinesis Firehose Delivery Stream from Kinesis Data Stream source


I'm sending JSON files with the Kinesis Agent (using a Docker image) to a Kinesis Data Stream, which acts as the source for a Kinesis Firehose Delivery Stream. The Delivery Stream should then write the files to S3, but nothing is appearing in S3.

The JSON data flows into the Data Stream, and is visible in the monitoring as well as the agent logs:

2019-04-16 19:00:14.036+0000 6ae9843658b1 (Agent.MetricsEmitter RUNNING) com.amazon.kinesis.streaming.agent.Agent [INFO] Agent: Progress: 18947 records parsed (490492 bytes), and 18500 records sent successfully to destinations. Uptime: 900020ms

I have a small shell script that copies the JSON files into the input folder (which the agent is monitoring) at 2-second intervals. Each file is picked up by the Kinesis Agent:

2019-04-16 19:00:15.015+0000 6ae9843658b1 (FileTailer[kinesis:dev-kinesis-stream:/tmp/stream/*.json]) com.amazon.kinesis.streaming.agent.tailing.KinesisParser [INFO] KinesisParser[kinesis:dev-kinesis-stream:/tmp/stream/*.json]: Continuing to parse /tmp/stream/testfile00001.json.

However, nothing arrives in my Firehose Delivery Stream or my S3 bucket.

In my Firehose I've set the buffer conditions to "1 MB or 60 seconds" and have encryption and compression disabled. Since Firehose flushes when either condition is met, the files should reach S3 after about 60 seconds even though each one only contains a small array (file size ~1 KB).

I'm stumped and don't quite understand what else could be the reason.

Any help is appreciated!


Solution

  • I figured this out on my own. The problem was with the IAM policies I had defined: the Firehose IAM role did not have the appropriate policies attached, so Firehose was denied permission to write to S3 and no data was delivered.
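
The answer does not say which permissions were missing, so the following is only a rough sketch of what a Firehose role for this setup might look like, not the policy the author actually used. It builds two IAM documents as plain Python dictionaries: a trust policy that lets the Firehose service assume the role, and a permissions policy that allows reading from the source Kinesis Data Stream and writing to the destination bucket. The bucket name `example-bucket`, the account ID, and the region are hypothetical placeholders; only the stream name `dev-kinesis-stream` comes from the agent logs in the question.

```python
import json


def firehose_trust_policy():
    """Trust policy that lets the Firehose service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "firehose.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def firehose_permissions_policy(bucket, stream_arn):
    """Permissions for a Kinesis Data Stream source and an S3 destination."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Write access to the destination bucket and its objects.
                "Effect": "Allow",
                "Action": [
                    "s3:AbortMultipartUpload",
                    "s3:GetBucketLocation",
                    "s3:GetObject",
                    "s3:ListBucket",
                    "s3:ListBucketMultipartUploads",
                    "s3:PutObject",
                ],
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
            },
            {
                # Read access to the source Kinesis Data Stream.
                "Effect": "Allow",
                "Action": [
                    "kinesis:DescribeStream",
                    "kinesis:GetShardIterator",
                    "kinesis:GetRecords",
                    "kinesis:ListShards",
                ],
                "Resource": stream_arn,
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket, region, and account ID; replace with real values.
    policy = firehose_permissions_policy(
        "example-bucket",
        "arn:aws:kinesis:us-east-1:123456789012:stream/dev-kinesis-stream",
    )
    print(json.dumps(firehose_trust_policy(), indent=2))
    print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the IAM console or attached to the role programmatically, for example with boto3's `put_role_policy`. If the role is missing either statement, the symptom is likely to match the question: records reach the Data Stream, but nothing shows up in S3.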