Search code examples
google-bigquerygoogle-api-python-client

Is it possible to invoke BigQuery procedures in python client?


Scripting/procedures for BigQuery just came out in beta - is it possible to invoke procedures using the BigQuery python client?

I tried:

query = """CALL `myproject.dataset.procedure`()...."""
job = client.query(query, location="US",)
print(job.results())
print(job.ddl_operation_performed)

print(job._properties) but that didn't give me the result set from the procedure. Is it possible to get the results?

Thank you!

Edited - stored procedure I am calling

CREATE OR REPLACE PROCEDURE `Project.Dataset.Table`(IN country STRING, IN accessDate DATE, IN accessId, OUT saleExists INT64)
BEGIN
  IF EXISTS (SELECT 1 FROM dataset.table where purchaseCountry = country and purchaseDate=accessDate and customerId = accessId)
  THEN
  SET saleExists = (SELECT 1);
ELSE
  INSERT Dataset.MissingSalesTable (purchaseCountry, purchaseDate, customerId) VALUES (country, accessDate, accessId);
  SET saleExists = (SELECT 0);
END IF;
END;

Solution

  • It works if you have SELECT inside your procedure, given the procedure being:

    create or replace procedure dataset.proc_output() BEGIN
      SELECT t FROM UNNEST(['1','2','3']) t;
    END;
    

    Code:

    from google.cloud import bigquery
    client = bigquery.Client()
    query = """CALL dataset.proc_output()"""
    job = client.query(query, location="US")
    for result in job.result():
            print result
    

    will output:

    Row((u'1',), {u't': 0})
    Row((u'2',), {u't': 0})
    Row((u'3',), {u't': 0})
    

    However, if there are multiple SELECT inside a procedure, only the last result set can be fetched this way.

    Update

    See below example:

    CREATE OR REPLACE PROCEDURE zyun.exists(IN country STRING, IN accessDate DATE, OUT saleExists INT64)
    BEGIN
      SET saleExists = (WITH data AS (SELECT "US" purchaseCountry, DATE "2019-1-1" purchaseDate)
        SELECT Count(*) FROM data where purchaseCountry = country and purchaseDate=accessDate);
      IF saleExists = 0  THEN
        INSERT Dataset.MissingSalesTable (purchaseCountry, purchaseDate, customerId) VALUES (country, accessDate, accessId);
      END IF;
    END;
    BEGIN
      DECLARE saleExists INT64;
      CALL zyun.exists("US", DATE "2019-2-1", saleExists);
      SELECT saleExists;
    END
    

    BTW, your example is much better served with a single MERGE statement instead of a script.