Tags: google-cloud-functions, vertex, google-cloud-scheduler

Permission denied when running scheduled Vertex Pipelines


I want to schedule a Vertex Pipelines run and, for now, deploy it from my local machine.
I have defined my pipeline, and it runs fine when I launch it once with create_run_from_job_spec on AIPlatformClient.
When I try to schedule it with create_schedule_from_job_spec, a Cloud Scheduler job is created correctly, with an HTTP endpoint pointing to a Cloud Function. But when the scheduler fires, the job fails with a Permission denied error. I have tried several service accounts with owner permissions on the project. Do you know what could have gone wrong?

Since AIPlatformClient from Kubeflow Pipelines raises a deprecation warning, I would also like to use PipelineJob from google.cloud.aiplatform, but I can't see any direct way to schedule the pipeline execution.


Solution

  • I spent about three hours banging my head on this too. In my case, what seemed to fix it was one of the following:

    • Disabling and re-enabling the Cloud Scheduler API. Why? There is supposed to be a service account called service-[project-number]@gcp-sa-cloudscheduler.iam.gserviceaccount.com. If it is missing, re-enabling the API might fix it.
    • For older projects there is an additional step: https://cloud.google.com/scheduler/docs/http-target-auth#add

    A simpler explanation is that one of the following steps was skipped:

    • Create a service account for the scheduler job and grant it the Cloud Functions Invoker role during creation.
    • Pass this service account to create_schedule_from_job_spec (see the call below).
    • Find the (sneaky) Cloud Function that was created for you; it will be called something like 'templated_http_request-v1'. Add your service account to it as a Cloud Functions Invoker.
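    from kfp.v2.google.client import AIPlatformClient

    # Replace the angle-bracket placeholders with your own values.
    client = AIPlatformClient(project_id="<project_id>", region="<region>")

    response = client.create_schedule_from_job_spec(
        job_spec_path=pipeline_spec,  # path to the compiled pipeline spec
        schedule="*/15 * * * *",  # cron expression: every 15 minutes
        time_zone="Europe/London",
        parameter_values={},
        cloud_scheduler_service_account="<your-service-account>@<project_id>.iam.gserviceaccount.com",
    )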
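
To confirm that the invoker grant from the checklist above actually landed, one option is to dump the function's IAM policy (for example with gcloud functions get-iam-policy <function-name> --format=json) and look through its bindings. The following is a minimal sketch, assuming the standard IAM policy JSON shape; the helper name has_invoker_binding, the project name and the sample policy are made up for illustration.

```python
INVOKER_ROLE = "roles/cloudfunctions.invoker"


def has_invoker_binding(policy: dict, sa_email: str) -> bool:
    """Return True if the service account is bound to the invoker role."""
    member = f"serviceAccount:{sa_email}"
    return any(
        binding.get("role") == INVOKER_ROLE
        and member in binding.get("members", [])
        for binding in policy.get("bindings", [])
    )


# Hypothetical policy, shaped like the JSON output of get-iam-policy.
sample_policy = {
    "bindings": [
        {
            "role": "roles/cloudfunctions.invoker",
            "members": [
                "serviceAccount:scheduler-sa@my-project.iam.gserviceaccount.com"
            ],
        }
    ],
    "version": 1,
}

granted = has_invoker_binding(
    sample_policy, "scheduler-sa@my-project.iam.gserviceaccount.com"
)
missing = has_invoker_binding(
    sample_policy, "other-sa@my-project.iam.gserviceaccount.com"
)
print(granted, missing)
```

With a real policy saved to a file, json.load can produce the dict passed to the helper. If the helper returns False for the account given as cloud_scheduler_service_account, the missing binding is a likely cause of the Permission denied error.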
    

    If you are still stuck, it is also useful to run gcloud scheduler jobs describe <pipeline-name>, as it really helps to understand what the scheduler is doing. You will see the Cloud Function URL and the POST payload, which is base64-encoded and contains the pipeline YAML, and you will see that it uses OIDC with a service account for authentication. It is also useful to view the code of the 'templated_http_request-v1' Cloud Function (sneakily created!). I was able to invoke the Cloud Function from Postman using the payload obtained from the scheduler job.
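
The inspection described above can also be scripted. This is a rough sketch, assuming the job was exported with gcloud scheduler jobs describe <pipeline-name> --format=json and that the JSON follows the Cloud Scheduler job resource layout (httpTarget.uri, httpTarget.body as base64, httpTarget.oidcToken.serviceAccountEmail). The helper name summarize_http_target and all values in the sample job are hypothetical.

```python
import base64


def summarize_http_target(job: dict) -> dict:
    """Extract the target URL, OIDC service account and decoded POST body."""
    target = job.get("httpTarget", {})
    raw_body = target.get("body", "")
    # The body is stored base64-encoded; decode it to readable text.
    body = (
        base64.b64decode(raw_body).decode("utf-8", errors="replace")
        if raw_body
        else ""
    )
    return {
        "uri": target.get("uri"),
        "service_account": target.get("oidcToken", {}).get("serviceAccountEmail"),
        "body": body,
    }


# Hypothetical job, shaped like the JSON output of `jobs describe`.
sample_job = {
    "name": "projects/my-project/locations/europe-west1/jobs/my-pipeline",
    "httpTarget": {
        "uri": "https://europe-west1-my-project.cloudfunctions.net/templated_http_request-v1",
        "httpMethod": "POST",
        "body": base64.b64encode(b'{"example": "payload"}').decode("ascii"),
        "oidcToken": {
            "serviceAccountEmail": "scheduler-sa@my-project.iam.gserviceaccount.com"
        },
    },
}

summary = summarize_http_target(sample_job)
print(summary["uri"])
print(summary["service_account"])
print(summary["body"])
```

The decoded body is what can be pasted into Postman, and the printed service account is the one that needs the invoker role on the Cloud Function.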