google-bigquery, google-cloud-vertex-ai

Vertex AI Pipeline is failing while trying to get data from BigQuery


I am trying to run a Google Vertex AI pipeline that queries a BigQuery table. In the pipeline, I am using the right project and a service account that has the bigquery.jobs.create permission. But when the pipeline runs, it accesses another project, e1cd7306fb577e88gq-uq. I cannot figure out where this project is coming from. I am running the pipeline from a Vertex AI user-managed notebook.

pandas_gbq.exceptions.GenericGBQException: Reason: 403 POST https://bigquery.googleapis.com/bigquery/v2/projects/e1cd7306fb577e88gq-uq/jobs?prettyPrint=false: Access Denied: Project e1cd7306fb577e88gq-uq: User does not have bigquery.jobs.create permission in project e1cd7306fb577e88gq-uq.

Solution

  • The service agent or service account running your code does have the required permission, but your code is trying to access a resource in the wrong project. Due to the way Vertex AI runs your training code, this problem can occur inadvertently if you don't explicitly specify a project ID or project number in your code.

    You can explicitly select the project you want this way:

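    import os

    from google.cloud import bigquery

    # Vertex AI sets CLOUD_ML_PROJECT_ID in the training container.
    project_number = os.environ["CLOUD_ML_PROJECT_ID"]

    # Pass the project explicitly so jobs are not created in an inferred default project.
    client = bigquery.Client(project=project_number)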
    

    You can read more in the Vertex AI documentation on training code requirements.
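
Since the traceback in the question comes from pandas_gbq rather than from bigquery.Client, the same idea can be applied there by passing project_id explicitly. The following is only a minimal sketch: resolve_project and read_table are hypothetical helper names, not part of any library, and it assumes that CLOUD_ML_PROJECT_ID is set in the environment where the pipeline step runs (otherwise pass your project ID yourself).

```python
import os


def resolve_project(explicit_project=None):
    """Pick the project to run BigQuery jobs in (hypothetical helper).

    Prefers an explicitly passed project, then falls back to the
    CLOUD_ML_PROJECT_ID environment variable (assumed to be set by Vertex AI).
    """
    project = explicit_project or os.environ.get("CLOUD_ML_PROJECT_ID")
    if not project:
        raise RuntimeError(
            "No project given and CLOUD_ML_PROJECT_ID is not set; "
            "pass the project explicitly."
        )
    return project


def read_table(query, explicit_project=None):
    """Run a query through pandas_gbq in an explicitly chosen project."""
    # Imported lazily so this module can be loaded without pandas_gbq installed.
    import pandas_gbq

    # project_id tells pandas_gbq which project to create the query job in,
    # instead of letting it infer one from the runtime environment.
    return pandas_gbq.read_gbq(query, project_id=resolve_project(explicit_project))
```

A call might then look like read_table("SELECT 1 AS x", explicit_project="my-project"), where "my-project" is a placeholder for the project in which your service account holds bigquery.jobs.create.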