Tags: google-cloud-vertex-ai, kubeflow, kubeflow-pipelines, vertex-ai-pipeline

Migrate kubeflow docker image components to VertexAI pipeline


I am trying to migrate a custom component created in Kubeflow to Vertex AI. In Kubeflow I built components as Docker container images and loaded them into my pipeline as follows:

import kfp


def my_custom_component_op(gcs_dataset_path: str, some_param: str):
    # Wrap the prebuilt image as a pipeline step (KFP v1 ContainerOp).
    return kfp.dsl.ContainerOp(
        name='My Custom Component Step',
        image='gcr.io/my-project-23r2/my-custom-component:latest',
        arguments=['--gcs_dataset_path', gcs_dataset_path,
                   '--component_param', some_param],
        # File written by the container, exposed as the step's 'output'.
        file_outputs={
            'output': '/app/output.csv',
        }
    )

I would then use them in the pipeline as follows:

@kfp.dsl.pipeline(
    name='My custom pipeline',
    description='The custom pipeline'
)
def generic_pipeline(project_id, gcs_dataset_path, some_param):
    output_component = my_custom_component_op(
        gcs_dataset_path=gcs_dataset_path,
        some_param=some_param
    )

    # Pass the file produced by the first step to the next step.
    output_next_op = next_op(
        gcs_dataset_path=kfp.dsl.InputArgumentPath(
            output_component.outputs['output']),
        next_op_param="some other param"
    )

Can I reuse the same component Docker image from Kubeflow v1 in a Vertex AI pipeline? If so, how can I do that, ideally without changing anything in the component itself?

I have found examples online of Vertex AI pipelines that use the @component decorator as follows:

@component(base_image=PYTHON37, packages_to_install=[PANDAS])
def my_component_op(
    gcs_dataset_path: str,
    some_param: str,
    dataset: Output[Dataset],
):
    ...  # perform some op

But this would require me to copy and paste the code from the Docker image into my pipeline, which is not something I want to do. Is there a way to reuse the Docker image and pass the parameters to it? I couldn't find an example of that anywhere.


Solution

  • You need to write a component YAML file that describes the existing image and load it with load_component_from_file.

    This is well documented on the Kubeflow documentation page for kfp v2; it's also written here.
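
To make this concrete, here is a minimal sketch of what that could look like for the image in the question. It is an illustration under assumptions, not a verified migration: the file name my_custom_component.yaml and the helper names write_component_spec and load_custom_component are made up for this example, and the `--output_path` flag is hypothetical. As far as I know, the v2 / Vertex AI backend chooses the output location itself and hands it to the container through the `{outputPath: output}` placeholder, so a container that always writes to /app/output.csv would need to accept such a path (or be wrapped by a command that copies the file there).

```python
from pathlib import Path

# Component spec for the existing image. Input names mirror the ContainerOp
# in the question. ASSUMPTION: `--output_path` is a hypothetical flag; the
# original container writes to a fixed /app/output.csv instead.
COMPONENT_YAML = """\
name: My Custom Component Step
inputs:
- {name: gcs_dataset_path, type: String}
- {name: some_param, type: String}
outputs:
- {name: output, type: Dataset}
implementation:
  container:
    image: gcr.io/my-project-23r2/my-custom-component:latest
    args:
    - --gcs_dataset_path
    - {inputValue: gcs_dataset_path}
    - --component_param
    - {inputValue: some_param}
    - --output_path
    - {outputPath: output}
"""


def write_component_spec(path="my_custom_component.yaml"):
    """Write the component spec to disk and return its path."""
    spec_path = Path(path)
    spec_path.write_text(COMPONENT_YAML)
    return spec_path


def load_custom_component(path="my_custom_component.yaml"):
    """Return a component factory built from the YAML spec.

    kfp is imported lazily so the spec can be written and inspected
    on a machine where the SDK is not installed.
    """
    from kfp import components

    spec_path = write_component_spec(path)
    return components.load_component_from_file(str(spec_path))
```

The object returned by load_custom_component is a factory: calling it inside a pipeline function with gcs_dataset_path=... and some_param=... creates a step, and its .outputs['output'] can then be passed to the next step. The image itself is only referenced by name in the YAML, so none of the component's code has to be copied into the pipeline file.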