Currently I’m listening to events from AWS Kinesis and writing them to S3. Then I query them using AWS Glue and Athena.
Is there a way to import that data, possibly with some transformation, to an RDS instance?
There are several general approaches to this task:
- Read data from an Athena query into a custom ETL script (using a JDBC connection) and load it into the database
- Mount the S3 bucket holding the data to a file system (perhaps using s3fs-fuse), read the data using a custom ETL script, and push it to the RDS instance(s)
- Download the data to a local filesystem using the AWS CLI or the SDK, process it locally, and then push it to RDS
- As you suggest, use AWS Glue to import the data from Athena to the RDS instance. If you are building an application that is tightly coupled with AWS (and if you are using Kinesis and Athena, you are), then such a solution makes sense.
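To make the custom ETL script options more concrete, here is a minimal sketch of the transform-and-load step. It assumes the events are stored as JSON lines with hypothetical fields `id`, `type`, and `amount`, and it uses an in-memory SQLite database as a stand-in for the RDS instance so that it runs anywhere. Against a real RDS instance you would open the connection with the driver for your engine (for example `psycopg2` or `pymysql`), and both the `?` placeholders and the `INSERT OR REPLACE` upsert, which are SQLite syntax, would need adjusting.

```python
import json
import sqlite3


def transform(line):
    """Turn one JSON-lines event into an (event_id, event_type, amount_cents) row.

    The field names are hypothetical; adapt them to your event schema.
    """
    event = json.loads(line)
    return (
        event["id"],
        event["type"].lower(),                 # normalise the event type
        round(float(event["amount"]) * 100),   # store money as integer cents
    )


def load(conn, lines):
    """Create the target table if needed and upsert the transformed rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "event_id TEXT PRIMARY KEY, event_type TEXT, amount_cents INTEGER)"
    )
    rows = [transform(line) for line in lines if line.strip()]
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT OR REPLACE INTO events VALUES (?, ?, ?)", rows)
    return len(rows)


# Sample lines standing in for the contents of an object read from S3.
sample = [
    '{"id": "e1", "type": "PURCHASE", "amount": "12.50"}',
    '{"id": "e2", "type": "Refund", "amount": "3.00"}',
]

conn = sqlite3.connect(":memory:")  # stand-in for the RDS connection
loaded = load(conn, sample)
```

Keying the table on the event id and upserting means the script can be re-run over the same S3 objects without creating duplicate rows.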
When connecting Glue to RDS, there are a couple of things to keep in mind (mostly on the networking side):
- Ensure that DNS hostnames are enabled in the VPC hosting the target RDS instance
- You'll need to set up a self-referencing rule in the Security Group associated with the target RDS instance
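The two networking settings above can also be applied from a script rather than the console. The following is a rough sketch using `boto3`, assuming it is installed and credentials are configured. The helper names are my own, and the port range (all TCP ports from the group itself) follows what the Glue documentation describes for a self-referencing rule, so check it against your own security requirements.

```python
def dns_hostnames_setting(vpc_id):
    """Arguments for EC2 modify_vpc_attribute that enable DNS hostnames."""
    return {"VpcId": vpc_id, "EnableDnsHostnames": {"Value": True}}


def self_referencing_rule(security_group_id):
    """Arguments for EC2 authorize_security_group_ingress.

    Allows inbound TCP on all ports, but only from members of the same group.
    """
    return {
        "GroupId": security_group_id,
        "IpPermissions": [
            {
                "IpProtocol": "tcp",
                "FromPort": 0,
                "ToPort": 65535,
                "UserIdGroupPairs": [{"GroupId": security_group_id}],
            }
        ],
    }


def apply_network_settings(vpc_id, security_group_id):
    """Apply both settings; the IDs are placeholders you must supply."""
    import boto3  # imported here so the helpers above work without boto3

    ec2 = boto3.client("ec2")
    ec2.modify_vpc_attribute(**dns_hostnames_setting(vpc_id))
    ec2.authorize_security_group_ingress(**self_referencing_rule(security_group_id))
```

Calling `apply_network_settings` with your VPC ID and the ID of the security group attached to the RDS instance would make both changes. Note that the ingress call raises an error if an identical rule already exists.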
For some examples of code targeting a relational database, see the following tutorials