amazon-s3 aerospike aws-glue

Can I use AWS Glue to load data into Aerospike?


I am designing an application that should read a text file from S3 every 15 minutes, parse the pipe-delimited (|) data, and load it into Aerospike clusters in 3 different AWS regions. The file size can range from 0 to 32 GB, and the file may contain between 5 and 130 million records.

I am planning to deploy a custom Java process in every AWS region that downloads the file from S3 and loads it into Aerospike using multiple threads.

I just came across AWS Glue. Can anybody tell me whether I can use AWS Glue to load this much data into Aerospike? Or is there another recommended way to set up an efficient, performant application?

Thanks in advance!


Solution

  • AWS Glue extracts and transforms data, then loads it into Redshift, EMR or Athena. You should take a look at AWS Data Pipeline instead: use a ShellCommandActivity to run your S3 data through extraction and transformation, and write the transformed data to Aerospike.
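
As a rough illustration of the kind of script a ShellCommandActivity could run, here is a minimal Python sketch of the parse-and-load step. It only uses the standard library and makes several assumptions: the first field of each line is treated as the record key, `write_batch` is a hypothetical callable standing in for the real Aerospike client writes (it must return the number of records it wrote), and the names `parse_records`, `batched` and `load` as well as the worker and batch sizes are made up for this example. The lines are consumed lazily, so a 32 GB file would not have to fit in memory, and the number of in-flight batches is capped.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from itertools import islice


def parse_records(lines):
    """Yield one list of fields per non-empty pipe-delimited line."""
    for line in lines:
        line = line.rstrip("\r\n")
        if line:  # skip blank lines
            yield line.split("|")


def batched(records, size):
    """Group an iterable of records into lists of at most `size` items."""
    it = iter(records)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def load(lines, write_batch, workers=8, batch_size=1000, max_pending=32):
    """Parse `lines` and hand batches to `write_batch` on a thread pool.

    `write_batch` is a placeholder for the real database write and must
    return the number of records it wrote. Returns the total written.
    """
    total = 0
    pending = set()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for batch in batched(parse_records(lines), batch_size):
            pending.add(pool.submit(write_batch, batch))
            # Cap the in-flight batches so parsing cannot outrun the writers.
            if len(pending) >= max_pending:
                done, pending = wait(pending, return_when=FIRST_COMPLETED)
                total += sum(f.result() for f in done)
        # Drain whatever is still running; result() re-raises write errors.
        total += sum(f.result() for f in pending)
    return total


if __name__ == "__main__":
    # Stand-in sink: a dict instead of an Aerospike cluster (assumption).
    store = {}

    def write_to_dict(batch):
        for key, *bins in batch:
            store[key] = bins
        return len(batch)

    sample = ["k1|alice|30\n", "k2|bob|41\n", "\n", "k3|carol|27\n"]
    print(load(sample, write_to_dict, workers=2, batch_size=2))
```

In a real deployment `lines` would be the S3 object streamed line by line rather than an in-memory list, and `write_to_dict` would be replaced by a function that writes each record through the Aerospike client. Running one such loader per region keeps the writes local to each cluster.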