
AWS Kinesis Firehose data appended together when delivering to AWS Redshift


I'm triggering a Lambda to send data to Redshift through Firehose. When the Lambda is triggered twice within a short period of time, say 1 minute, the two sets of data are concatenated. This breaks the load into Redshift, which fails with the error "Extra column(s) found".

E.g. 1st set of data: 1,2,3,4; 2nd set of data: 5,6,7,8. Data received by Redshift: 1,2,3,45,6,7,8

After this happens, no data is loaded into Redshift, even if the Lambda is triggered only once.

Why is this happening? How can I avoid this?

Thanks


Solution

  • This is likely due to omitting the end-of-record character in the code that sends your data to Firehose. End-of-record is a newline (\n) unless changed, and it indicates that this is all the data for the record. You need to have a \n at the end of each record in your data stream.

    This isn't a problem when the data arrives further apart in time, because Firehose only waits a fixed amount of time before sending the data it has to Redshift. Each delivery then contains a single record, so end-of-file is reached and end-of-record is assumed.
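
As a rough illustration of the fix, here is a minimal Python sketch of a Lambda-side helper that ends every record with a newline before handing it to Firehose. The helper names (to_firehose_record, send_row) and the stream name "my-delivery-stream" are made up for this example, and it assumes comma-delimited rows like the ones in the question; only boto3's Firehose put_record call is a real API.

```python
def to_firehose_record(values, delimiter=","):
    """Serialize one row and terminate it with a newline (end-of-record)."""
    line = delimiter.join(str(v) for v in values)
    return (line + "\n").encode("utf-8")


def send_row(values, stream_name="my-delivery-stream", client=None):
    """Send one row to Firehose. stream_name is a placeholder, not a real stream."""
    if client is None:
        import boto3  # imported lazily so the serializer can be used without boto3
        client = boto3.client("firehose")
    return client.put_record(
        DeliveryStreamName=stream_name,
        Record={"Data": to_firehose_record(values)},
    )


# Firehose concatenates buffered records as-is, so two rows sent close
# together stay on separate lines only because each one ends with "\n".
batch = to_firehose_record([1, 2, 3, 4]) + to_firehose_record([5, 6, 7, 8])
```

Without the trailing "\n", the same two rows would be buffered as 1,2,3,45,6,7,8, which is the single over-long row described in the question.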