Configure subnetwork for Vertex AI pipeline component


I have a Vertex AI pipeline component that needs to connect to a database that lives in a VPC network. The component currently fails because it cannot reach the database, but I believe it would work if I could configure the component to use the subnetwork. How do I configure the workerPoolSpecs of the component to use the subnetwork?

I was hoping I could do something like this:

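from kfp import dsl

# component_store is assumed to be a kfp.components.ComponentStore configured elsewhere
preprocess_data_op = component_store.load_component('org/ml_engine/preprocess')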


@dsl.pipeline(name="test-pipeline-vertex-ai")
def pipeline(project_id: str, some_param: str):
    
    preprocess_data_op(
        project_id=project_id,
        my_param=some_param,
        subnetwork_uri="projects/xxxxxxxxx/global/networks/data",
    ).set_display_name("Preprocess data")

However, the component has no such parameter, and I get:

TypeError: Preprocess() got an unexpected keyword argument 'subnetwork_uri'

How do I define the subnetwork for the component?


Solution

  • The Google docs do not mention any way to run a specific component on a subnetwork.

    However, it is possible to run the entire pipeline inside the VPC by passing the network (the `network` argument, not a per-component subnetwork) when submitting the job through the API:

    job.submit(service_account=SERVICE_ACCOUNT, network=NETWORK)
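
For context, a fuller version of that call might look like the sketch below. It assumes the google-cloud-aiplatform SDK is installed and that the pipeline has already been compiled to a template file. The helper names vpc_network_name and submit_pipeline are hypothetical, as are all argument values. The SDK describes network as the full name of the Compute Engine network the job is peered with, so the VPC most likely needs private services access configured beforehand.

```python
from typing import Any, Dict, Optional


def vpc_network_name(project_number: str, network: str) -> str:
    """Build the full network resource name (hypothetical helper).

    The format is projects/<project number>/global/networks/<network name>.
    """
    return f"projects/{project_number}/global/networks/{network}"


def submit_pipeline(
    template_path: str,
    project: str,
    location: str,
    service_account: str,
    network: str,
    parameter_values: Optional[Dict[str, Any]] = None,
):
    """Submit a compiled pipeline so that the whole run uses the given VPC network."""
    # Imported lazily so the helper above can be used without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location)
    job = aiplatform.PipelineJob(
        display_name="test-pipeline-vertex-ai",
        template_path=template_path,
        parameter_values=parameter_values or {},
    )
    # The network applies to the entire pipeline run, not to a single component.
    job.submit(service_account=service_account, network=network)
    return job
```

A call would then pass something like network=vpc_network_name("<project number>", "data"), along with the service account the pipeline should run as.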