I'm trying to run the latest version of boto3 in an AWS Glue Spark job to access methods that aren't available in the default version that ships with Glue.
To get the default version of boto3 and verify that the method I want isn't available, I run this block of code, which is all boilerplate except for my print statements:
import sys
import boto3
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
## @params: [JOB_NAME]
args = getResolvedOptions(sys.argv, ['JOB_NAME'])
sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args['JOB_NAME'], args)
athena = boto3.client('athena')
print(boto3.__version__) # verify the default version boto3 imports
print(athena.list_table_metadata) # method I want to verify I can access in Glue
job.commit()
which returns
1.12.4
Traceback (most recent call last):
  File "/tmp/another_sample", line 20, in <module>
    print(athena.list_table_metadata)
  File "/home/spark/.local/lib/python3.7/site-packages/botocore/client.py", line 566, in __getattr__
    self.__class__.__name__, item)
AttributeError: 'Athena' object has no attribute 'list_table_metadata'
OK, that's expected with an older version of boto3. Let's try importing the latest version...
I perform the following steps:
which returns
1.17.9
Traceback (most recent call last):
  File "/tmp/another_sample", line 20, in <module>
    print(athena.list_table_metadata)
  File "/home/spark/.local/lib/python3.7/site-packages/botocore/client.py", line 566, in __getattr__
    self.__class__.__name__, item)
AttributeError: 'Athena' object has no attribute 'list_table_metadata'
If I run the same script locally, where boto3 is also 1.17.9, I can find the method:
1.17.9
<bound method ClientCreator._create_api_method.<locals>._api_call of <botocore.client.Athena object at 0x7efd8a4f4710>>
Any ideas on what's going on here, and how to access the methods that I would expect to be available in the upgraded version?
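One check that may help narrow this down: boto3 generates client methods from the service definitions shipped in botocore, so the botocore version matters as much as the boto3 version. The sketch below is a minimal diagnostic, not something from the original script; the helper names `installed_version` and `has_method` are made up for illustration.

```python
import importlib


def installed_version(module_name):
    """Return module.__version__, or None if the module cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def has_method(client, method_name):
    """Return True if the client exposes a callable named method_name.

    A botocore client raises AttributeError for unknown operations,
    so getattr with a default is enough to probe for one.
    """
    return callable(getattr(client, method_name, None))


if __name__ == "__main__":
    # Print both versions: a new boto3 paired with an old botocore
    # could explain a missing client method.
    print("boto3:", installed_version("boto3"))
    print("botocore:", installed_version("botocore"))
    # Inside the Glue job, with athena = boto3.client('athena'):
    # print(has_method(athena, "list_table_metadata"))
```

If the two versions printed inside the Glue job differ from the local ones, that mismatch is a likely place to look first.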
I ended up finding a workaround in the AWS documentation.
I added the following key/value pair to the Glue job parameters, under the "Security configuration, script libraries, and job parameters (optional)" section of the job:
Key:
--additional-python-modules
Value:
botocore>=1.20.12,boto3>=1.17.12
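The same key/value pair can probably also be passed as a job argument when starting a run from code rather than through the console. This is a rough sketch under that assumption: `glue_module_arguments` and `start_run_with_modules` are hypothetical helper names, and the job name you pass in is whatever your existing job is called. `start_job_run` with `JobName` and `Arguments` is the boto3 Glue client call.

```python
def glue_module_arguments(requirements):
    """Build the Glue job-argument dict for --additional-python-modules.

    Glue expects a single comma-separated string of pip-style requirements.
    """
    return {"--additional-python-modules": ",".join(requirements)}


def start_run_with_modules(job_name, requirements):
    """Start a run of an existing Glue job, passing the extra modules for that run."""
    import boto3  # imported here so the helper above works without boto3 installed

    glue = boto3.client("glue")
    return glue.start_job_run(
        JobName=job_name,
        Arguments=glue_module_arguments(requirements),
    )


# Same value as in the job parameters above.
args = glue_module_arguments(["botocore>=1.20.12", "boto3>=1.17.12"])
print(args)
# {'--additional-python-modules': 'botocore>=1.20.12,boto3>=1.17.12'}
```

Pinning botocore together with boto3 keeps the two in step, which is what makes the newer Athena methods show up on the client.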