python, amazon-web-services, aws-glue

How to pass a newly uploaded S3 file as a parameter to a Glue Python shell job


Whenever I upload a file, a Glue job starts running. I have hard-coded the filename in the Python shell script, so the job only works when I upload a file with exactly that name. How can I pass an S3 path or filename as an argument to a Glue Python shell job so that the job picks up the name of the uploaded file at run time? Is specifying job parameters in the job settings the only way, or is there a library that can do this?


Solution

  • Yes, you can pass the filename as an argument to the Glue job.

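    import boto3
    glue = boto3.client('glue')  # glue_pyspark_initial_loader below holds your job's name
    filename_string = {'--filename': 'your file'}  # argument keys start with '--'
    response = glue.start_job_run(JobName=glue_pyspark_initial_loader,
            Arguments=filename_string)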
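
To pick up the filename "on the go", the call above has to run in something that sees the upload event. The following is a minimal sketch, assuming an AWS Lambda function subscribed to the bucket's S3 "object created" notifications. The job name `my-python-shell-job`, the extra `--bucket` argument, and the helper `build_job_arguments` are hypothetical names chosen for illustration, not part of the original answer.

```python
import urllib.parse


def build_job_arguments(event):
    """Turn an S3 'object created' event into Glue job arguments."""
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    # S3 event notifications URL-encode object keys (spaces arrive as '+').
    key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
    # Glue job argument names must start with '--'.
    return {"--bucket": bucket, "--filename": key}


def lambda_handler(event, context):
    # Imported here so the helper above can be exercised without AWS access.
    import boto3

    glue = boto3.client("glue")
    response = glue.start_job_run(
        JobName="my-python-shell-job",  # hypothetical job name
        Arguments=build_job_arguments(event),
    )
    return response["JobRunId"]
```

Inside the job script, the values can then be read with `getResolvedOptions` from `awsglue.utils`, for example `args = getResolvedOptions(sys.argv, ['bucket', 'filename'])`, after which `args['filename']` holds the uploaded object's key instead of a hard-coded name.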