Search code examples
pythongoogle-bigquerygoogle-cloud-python

Python BigQuery API - get table schema/header


Given a query example like

import uuid

from google.cloud import bigquery


def query_shakespeare():
    client = bigquery.Client()
    query_job = client.run_async_query(str(uuid.uuid4()), """
        #standardSQL
        SELECT corpus AS title, COUNT(*) AS unique_words
        FROM `publicdata.samples.shakespeare`
        GROUP BY title
        ORDER BY unique_words DESC
        LIMIT 10""")

    query_job.begin()
    query_job.result()  # Wait for job to complete.

    destination_table = query_job.destination
    destination_table.reload()
    for row in destination_table.fetch_data():
        print(row)


if __name__ == '__main__':
    query_shakespeare()

how can I get the schema of the table? row, in the previous example has the form

Row(('august', -1, 'aaa', 333), {'col1': 0, 'col2': 1, 'col3': 2})

but I can't find, for the package google-cloud-bigquery==0.28.0 the way to extract the header JSON. Of course, also the extraction of the table schema would be fine for me, but the current Google documentation looks not to work for the last version...


Solution

  • If you need the schema of the table you just queried, you can get it from the result method from the QueryJob:

    client = bq.Client()
    query = """
             #standardSQL
             SELECT corpus AS title, COUNT(*) AS unique_words
             FROM `publicdata.samples.shakespeare`
             GROUP BY title
             ORDER BY unique_words DESC
             LIMIT 10"""
    query_job = client.query(query)
    result = query_job.result()
    
    schema = result.schema
    

    Result:

    [SchemaField(u'title', u'string', u'NULLABLE', None, ()),
     SchemaField(u'unique_words', u'integer', u'NULLABLE', None, ())]
    

    (The code you presented in your question is related to version 0.27).

    As for the question of getting the header JSON, not sure if I understood it correctly but as it seems you need the schema to find where the json is located (I'm guessing here).