The objective is to transform data (CSV files) from one S3 bucket and write the result to another S3 bucket, using Glue.
What I already tried:
Where I am stuck:
The Glue job output asks for a database target, which I don't have and don't want to use.
Is there any way to achieve this without any other database system, using just S3 and Glue?
Sample of a single CSV file I am trying to merge
Classifier with a delimiter of ";"
Crawler Configuration
Crawler Result (No schema detected)
I'm assuming that all the CSV files you want to merge have the same schema. You can write the same code in Glue that you would write in a local Spark deployment.
Step 1: Get data from Catalog table
datasource0 = glueContext.create_dynamic_frame.from_catalog(database = "database_name", table_name = "table_name", transformation_ctx = "datasource0")
Step 2: Convert the datasource0 DynamicFrame to a Spark DataFrame
df = datasource0.toDF()
Step 3: Write the DataFrame to the target S3 bucket
df.write.format("csv").mode("append").save("s3://target-s3-path/Output")
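If the crawler cannot detect a schema and you would rather skip the Data Catalog entirely, a Glue PySpark job can also read straight from S3. The following is only a minimal sketch under some assumptions: the files are semicolon-delimited with a header row, the S3 paths are placeholders, and the helper names (csv_read_options, csv_write_options, run_job) are made up for illustration. The awsglue import sits inside run_job because that module is only available in the Glue runtime.

```python
# Sketch: merge semicolon-delimited CSV files from one S3 prefix into another
# in a Glue PySpark job, without a Data Catalog table or any database.
# All helper names and S3 paths here are hypothetical.


def csv_read_options(source_path, delimiter=";"):
    """Keyword arguments for glueContext.create_dynamic_frame.from_options."""
    return {
        "connection_type": "s3",
        # recurse=True picks up every file under the prefix
        "connection_options": {"paths": [source_path], "recurse": True},
        "format": "csv",
        # withHeader=True treats the first line of each file as column names
        "format_options": {"separator": delimiter, "withHeader": True},
    }


def csv_write_options(target_path, delimiter=";"):
    """Keyword arguments for glueContext.write_dynamic_frame.from_options."""
    return {
        "connection_type": "s3",
        "connection_options": {"path": target_path},
        "format": "csv",
        "format_options": {"separator": delimiter},
    }


def run_job(source_path, target_path):
    # Imported lazily: awsglue only exists inside the Glue job environment.
    from awsglue.context import GlueContext
    from pyspark.context import SparkContext

    glue_context = GlueContext(SparkContext.getOrCreate())

    # Read all CSV files under the source prefix into one DynamicFrame.
    frame = glue_context.create_dynamic_frame.from_options(
        **csv_read_options(source_path)
    )

    # Any transformations on `frame` would go here.

    # Write the combined data to the target prefix as CSV.
    glue_context.write_dynamic_frame.from_options(
        frame=frame, **csv_write_options(target_path)
    )


# In the Glue job script (placeholder paths):
# run_job("s3://source-s3-path/Input/", "s3://target-s3-path/Output/")
```

Spark may still write several part files under the target prefix. If you need exactly one output file, one option is to convert with toDF() and call coalesce(1) before writing, at the cost of funnelling all data through a single task.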