I set up a Firehose stream that delivers data to my Redshift cluster. It worked for a short period but then suddenly stopped delivering to Redshift. To investigate, I ran these queries on my cluster:
select * from stl_query order by endtime desc limit 10;
select * from stl_load_errors order by starttime desc;
select * from stl_connection_log where remotehost like '52%' order by recordtime desc;
select * from stl_error where userid!=0 order by recordtime desc;
None of those queries show the most recent connections or COPY commands. For example, I see:
disconnecting session ... 52.70.63.204 ...
initiating session ... 52.70.63.204 ...
... in my connection logs, but the entries stop after a certain time. I've tried recreating the table and the stream, but still nothing new is listed. All my data is being received in S3, however.
The other problem is that there are no error manifests in the S3 directory, which suggests that nothing failed.
How can I debug this?
Found the answer for my case. I had configured the Redshift cluster with a VPC security group. Without whitelisted access, the connection attempts are blocked before they reach the cluster, so they never show up in stl_connection_log. I added an entry for Firehose to the VPC security group for my Redshift cluster:
Custom TCP Rule, TCP, 5493, 52.70.63.192/27
The IP ranges to whitelist are listed at the bottom of: http://docs.aws.amazon.com/firehose/latest/dev/controlling-access.html
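As a quick sanity check, you can confirm that the address seen in the connection log actually falls inside the CIDR block you whitelisted. This is only a minimal sketch using Python's standard ipaddress module; the helper name is_covered is made up for illustration, and the address and block are the ones from this post, so substitute the values for your own region:

```python
import ipaddress

# CIDR block from the security group rule in this post (assumption: yours may differ by region)
FIREHOSE_CIDR = ipaddress.ip_network("52.70.63.192/27")


def is_covered(remote_host: str, network=FIREHOSE_CIDR) -> bool:
    """Return True if remote_host lies inside the whitelisted CIDR block."""
    return ipaddress.ip_address(remote_host) in network


if __name__ == "__main__":
    # Address taken from the stl_connection_log output quoted in the question
    print(is_covered("52.70.63.204"))
```

If this prints False for an address that shows up in your logs, the rule's CIDR block is probably not the one Firehose uses in your region.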