
How to run a Vertex AI pipeline across multiple projects?


In order to run a pipeline from my local machine, I do:

from google.cloud import aiplatform

aiplatform.init(project="my-project")

job = aiplatform.PipelineJob(
    display_name="display-name",
    template_path="pipeline.yaml",
    pipeline_root="gs://my-bucket"
)

This works, except that access is limited to the specified project, even though my local credentials have access to multiple projects, and the pipeline requires access to several of them. How can I grant the pipeline access to these multiple projects?

I tried omitting the project argument in aiplatform.init(), but of course it then takes the default project from my environment (still a single project).


Solution

  • Assuming you want your Vertex AI pipeline in Project_A to access resources (GCS buckets, model repositories) in another GCP project, Project_B, you should configure a service account with granular permissions. In short, here are the steps.

    1. In Project_A, where the pipeline runs, create a service account and grant it the roles/aiplatform.user role.

    2. Grant the service account the appropriate roles in the other projects (Project_B). Note that the binding is added on Project_B, while the service account itself lives in Project_A. For example:

       gcloud projects add-iam-policy-binding PROJECT_B_ID \
         --member="serviceAccount:SERVICE_ACCOUNT_ID@PROJECT_A_ID.iam.gserviceaccount.com" \
         --role="ROLE_NAME"
      
    3. Pass the service account to the pipeline job. There are multiple ways to use the service account, depending on how you organize your Vertex AI pipelines and jobs. Following your example, you can run the code as:

       job = aiplatform.PipelineJob(
         display_name="display-name",
         template_path="pipeline.yaml",
         pipeline_root="gs://my-bucket"
       ) 
       job.submit(service_account="custom_sa@[your_project_id].iam.gserviceaccount.com")
      

    This notebook example shows another approach to invoking the pipeline with a custom service account.

    4. In a production environment, you might want to use a separate service account as the pipeline runner, or have your user account impersonate a service account that has permission to run Vertex AI pipelines. I won't expand on that here since it is a completely different topic.
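
To tie steps 1 to 3 together, here is a minimal sketch, not a definitive implementation. The helper names (service_account_email, binding_commands, submit_pipeline), the account ID "pipeline-runner", the project IDs "project-a", "project-b" and "project-c", and the role roles/storage.objectViewer are all assumptions chosen for illustration; substitute your own values. The first two helpers only build strings, so you can review the gcloud commands before running them yourself.

```python
from typing import List


def service_account_email(account_id: str, project_id: str) -> str:
    """Build the email address of a user-managed service account."""
    return f"{account_id}@{project_id}.iam.gserviceaccount.com"


def binding_commands(sa_email: str, project_ids: List[str], role: str) -> List[str]:
    """Return one gcloud command per project granting `role` to the service account."""
    return [
        f"gcloud projects add-iam-policy-binding {project_id} "
        f'--member="serviceAccount:{sa_email}" --role="{role}"'
        for project_id in project_ids
    ]


def submit_pipeline(sa_email: str, project_id: str) -> None:
    """Submit the pipeline from the question so that it runs as `sa_email`."""
    # Imported lazily so the string helpers above work without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project=project_id)
    job = aiplatform.PipelineJob(
        display_name="display-name",
        template_path="pipeline.yaml",
        pipeline_root="gs://my-bucket",
    )
    job.submit(service_account=sa_email)


if __name__ == "__main__":
    # Hypothetical IDs: the service account lives in project-a (Project_A).
    sa = service_account_email("pipeline-runner", "project-a")

    # Print the bindings to add on every other project the pipeline must reach.
    for cmd in binding_commands(sa, ["project-b", "project-c"], "roles/storage.objectViewer"):
        print(cmd)

    # Uncomment once the bindings exist and pipeline.yaml is available locally.
    # submit_pipeline(sa, "project-a")
```

The pipeline still runs in a single project (Project_A); what changes is that the service account it runs as has been granted roles in the other projects, so the pipeline steps can reach resources there.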