Here are my requirements. Every day I receive a CSV file in an S3 bucket. I need to partition that data and store it as Parquet so that I can eventually map a table onto it. I was thinking about using an AWS Lambda function that is triggered whenever a file is uploaded. I'm not sure what the steps are to do that.
There are (as usual in AWS!) several ways to do this; the first two that come to mind are:
.csv file, by going to the function configuration in the Console and, in the Designer section, adding a trigger, then choosing S3 and the settings you want to use, e.g. bucket, event type, prefix, suffix.
In either case, you will need to write the Lambda function to do the work you have described, and it will need IAM access to the bucket to pull the files and process them.
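As a rough illustration of what such a function might look like, here is a minimal sketch in Python. Several things in it are assumptions rather than anything stated above: the OUTPUT_PREFIX value, the helper names extract_csv_objects and partitioned_key, partitioning by the date the file is processed (the question does not say which field to partition on), and the availability of pandas and pyarrow in the function's environment (they are not part of the default Lambda Python runtime, so they would typically have to be supplied through a layer or container image).

```python
import urllib.parse
from datetime import datetime, timezone

# Hypothetical output location; adjust to your own bucket layout.
OUTPUT_PREFIX = "parquet"


def extract_csv_objects(event):
    """Return (bucket, key) pairs for the .csv objects named in an S3 event."""
    objects = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        if key.lower().endswith(".csv"):
            objects.append((bucket, key))
    return objects


def partitioned_key(source_key, when):
    """Build a Hive-style partitioned key (year=/month=/day=) for the output."""
    base = source_key.rsplit("/", 1)[-1].rsplit(".", 1)[0]
    return (
        f"{OUTPUT_PREFIX}/year={when:%Y}/month={when:%m}/day={when:%d}/"
        f"{base}.parquet"
    )


def lambda_handler(event, context):
    # Imported here so the helpers above can be exercised without these packages.
    import io

    import boto3
    import pandas as pd  # assumption: pandas and pyarrow come from a layer

    s3 = boto3.client("s3")
    written = []
    for bucket, key in extract_csv_objects(event):
        # Pull the uploaded CSV and load it into a DataFrame.
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        df = pd.read_csv(io.BytesIO(body))

        # Serialize to Parquet in memory, then upload under the partitioned key.
        buffer = io.BytesIO()
        df.to_parquet(buffer, index=False)
        out_key = partitioned_key(key, datetime.now(timezone.utc))
        s3.put_object(Bucket=bucket, Key=out_key, Body=buffer.getvalue())
        written.append(out_key)
    return {"written": written}
```

Because this sketch writes the Parquet output back to the same bucket, the .csv suffix filter on the trigger matters: the output keys end in .parquet, so they should not re-trigger the function. The function's execution role would need at least s3:GetObject on the incoming keys and s3:PutObject on the output prefix.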