I need to use a newer boto3 package in an AWS Glue Python 3 shell job (Glue version 1.0).
The default version is very old, so many of the newer APIs do not work.
For example, pause_cluster() and resume_cluster() do not work in an AWS Glue Python shell job because of this older version:
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/redshift.html
Similarly for many other product features.
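One way to confirm which calls the bundled boto3 is missing is to probe the client object for the method names before relying on them. The sketch below is only an illustration: `missing_operations` is a hypothetical helper (not part of boto3), and it assumes nothing more than that a client exposes each supported operation as a callable attribute.

```python
def missing_operations(client, operations):
    """Return the names in `operations` that `client` does not expose as callable methods."""
    return [op for op in operations if not callable(getattr(client, op, None))]


# Example use inside a Glue Python shell job (assumes boto3 is importable and
# a region is configured for the job):
#
#   import boto3
#   redshift = boto3.client("redshift")
#   print(boto3.__version__)
#   print(missing_operations(redshift, ["pause_cluster", "resume_cluster"]))
```

An empty list would mean the installed version already supports every operation that was checked.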
Additionally, our Glue jobs have no internet access for security reasons, so we need a solution based on libraries stored in S3.
What is the best way to upgrade boto3 in the Python shell, given that it seems to be the most lightweight part of our architecture?
Basically, we use the Glue Python shell as our core workflow engine to design our pipeline asynchronously through boto3 APIs.
Hi, we got the AWS Glue Python shell working with all dependencies as follows. Glue depends on awscli as well as boto3.
One option is to add the awscli and boto3 whl files to the Python library path during Glue job execution. This option is slow because it has to download and install the dependencies.
Reference: AWS Wrangler Glue dependency build
requirements.txt:
colorama==0.4.3
docutils==0.15.2
rsa==4.5.0
s3transfer==0.3.3
PyYAML==5.3.1
botocore==1.19.23
pyasn1==0.4.8
jmespath==0.10.0
urllib3==1.26.2
python_dateutil==2.8.1
six==1.15.0
pip download -r requirements.txt -d libs
cd libs
zip ../boto3-depends.zip *
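If the zip command-line tool is not available on the build machine, the packaging step can be approximated with the Python standard library. This is a minimal sketch, not part of the original procedure: `zip_wheels` is a hypothetical helper name, and it assumes the downloaded files sit directly inside the libs directory.

```python
import zipfile
from pathlib import Path


def zip_wheels(libs_dir, zip_path):
    """Archive every file directly under libs_dir at the root of zip_path.

    Roughly what `cd libs` followed by `zip ../boto3-depends.zip *` does.
    Returns the sorted list of archived file names.
    """
    files = sorted(p for p in Path(libs_dir).iterdir() if p.is_file())
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in files:
            # arcname drops the directory prefix so files land at the archive root
            archive.write(path, arcname=path.name)
    return [p.name for p in files]
```

Calling `zip_wheels("libs", "boto3-depends.zip")` from the directory that contains libs should produce an archive with the same flat layout as the shell commands.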
Upload boto3-depends.zip to S3 and add its path to the Glue job's "Referenced files path". Note: it is "Referenced files path", not "Python library path".
Placeholder code to install the latest awscli and boto3 and load them into the AWS Glue Python shell:
import os.path
import subprocess
import sys

# borrowed from https://stackoverflow.com/questions/48596627/how-to-import-referenced-files-in-etl-scripts
def get_referenced_filepath(file_name, matchFunc=os.path.isfile):
    # Search every sys.path entry for the referenced file
    for dir_name in sys.path:
        candidate = os.path.join(dir_name, file_name)
        if matchFunc(candidate):
            return candidate
    raise Exception("Can't find file: {}".format(file_name))

# Name of the zip uploaded to the Referenced files path
zip_file = get_referenced_filepath("boto3-depends.zip")

subprocess.run()

# Can't install --user, or without "-t ." because of permissions issues on the filesystem
subprocess.run(, shell=True)

# Additional code as part of AWS thread https://forums.aws.amazon.com/thread.jspa?messageID=954344
sys.path.insert(0, '/glue/lib/installation')
# Drop the already-imported old boto modules so the next import picks up the new version
keys = list(sys.modules.keys())
for k in keys:
    if 'boto' in k:
        del sys.modules[k]

import boto3
print('boto3 version')
print(boto3.__version__)
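The arguments of the two subprocess.run calls are not shown in the post above, so the following is only a hedged sketch of what an offline install could look like, not the author's exact commands. The helper names (`pip_install_command`, `purge_modules`, `install_from_zip`) are hypothetical. It assumes the zip holds wheel files at its root and that pip is available to the job's interpreter. The pip options `--no-index`, `--find-links` and `--target` are standard pip flags.

```python
import subprocess
import sys
import zipfile
from pathlib import Path


def pip_install_command(wheel_dir, target_dir):
    """Build an offline pip command that installs every wheel found in wheel_dir."""
    wheels = sorted(str(p) for p in Path(wheel_dir).glob("*.whl"))
    return [
        sys.executable, "-m", "pip", "install",
        "--no-index",                     # never contact PyPI (the job has no internet access)
        "--find-links", str(wheel_dir),   # resolve dependencies from the local wheels only
        "--target", str(target_dir),      # install into a writable directory, not site-packages
        *wheels,
    ]


def purge_modules(fragment):
    """Remove imported modules whose name contains fragment; return the removed names."""
    removed = [name for name in list(sys.modules) if fragment in name]
    for name in removed:
        del sys.modules[name]
    return removed


def install_from_zip(zip_file, work_dir="."):
    """Unpack the wheels, install them offline, and make the new packages importable."""
    with zipfile.ZipFile(zip_file) as archive:
        archive.extractall(work_dir)
    subprocess.run(pip_install_command(work_dir, work_dir), check=True)
    sys.path.insert(0, str(work_dir))
    purge_modules("boto")  # forces a fresh import of boto3 and botocore
```

After `install_from_zip(zip_file)` returns, a plain `import boto3` should resolve to the newly installed copy. Printing `boto3.__version__` is a quick way to confirm that.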