Get Scanned data with boto3 on Athena


I use Boto3 to perform Athena queries. My code looks like this:

import time

import boto3

athena_client = boto3.client('athena')

# start the query 
query_execution = athena_client.start_query_execution(
    QueryString=sql_query,
    ResultConfiguration={ 'OutputLocation': 's3://my_path'}
)

# Get the id of the query
query_execution_id = query_execution['QueryExecutionId']

query_status = None
while query_status != 'SUCCEEDED':
    time.sleep(1)
    query_status = athena_client.get_query_execution(QueryExecutionId=query_execution_id)['QueryExecution']['Status']['State']
    if query_status not in ['QUEUED', 'RUNNING', 'SUCCEEDED']:
        raise Exception(f"""
        Athena query with the query execution ID {query_execution_id} failed or was cancelled.
        status: {query_status}
        """)
                
query_result = athena_client.get_query_results(QueryExecutionId=query_execution_id)

When I use the query editor in the AWS Athena console, I get metadata about the query I ran. I would like to get the "Data scanned" field (screenshot: scanned data info in metadata).

When I look at the response I get (the variable query_result in my code), I see a field called ResponseMetadata, but it does not contain the scanned data value. Is there a way to get it with boto3?


Solution

  • The Amazon Athena get_query_runtime_statistics() command:

    Returns query execution runtime statistics related to a single execution of a query if you have access to the workgroup in which the query ran.

    There is a field called InputBytes, which is defined as:

    The number of bytes read to execute the query.
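
As a minimal sketch of how this could look in the question's code, the first helper below calls get_query_runtime_statistics() and reads InputBytes from the Rows section of the response. The helper names are hypothetical, and the client is passed in as an argument (for example the athena_client created in the question). The row-level statistics may not be populated immediately after a query completes, so the sketch returns None instead of raising when the field is missing. The second helper is an alternative not mentioned in the answer above: it assumes the Statistics block of the get_query_execution() response, which the question's polling loop already calls, carries a DataScannedInBytes value.

```python
def get_input_bytes(athena_client, query_execution_id):
    """Return InputBytes from the runtime statistics, or None if not reported yet."""
    response = athena_client.get_query_runtime_statistics(
        QueryExecutionId=query_execution_id
    )
    # Expected shape: QueryRuntimeStatistics -> Rows -> InputBytes
    rows = response.get("QueryRuntimeStatistics", {}).get("Rows", {})
    return rows.get("InputBytes")


def get_data_scanned_bytes(athena_client, query_execution_id):
    """Alternative (assumption): read DataScannedInBytes from get_query_execution."""
    response = athena_client.get_query_execution(
        QueryExecutionId=query_execution_id
    )
    # Expected shape: QueryExecution -> Statistics -> DataScannedInBytes
    stats = response.get("QueryExecution", {}).get("Statistics", {})
    return stats.get("DataScannedInBytes")
```

Either helper would be called once the polling loop has seen the SUCCEEDED state, for example get_input_bytes(athena_client, query_execution_id).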