python google-cloud-platform google-cloud-storage python-click google-cloud-vertex-ai

Python click incorrectly parses arguments when called in Vertex AI Pipeline


I'm trying to run a simple Ada-boosted Decision Tree regressor on GCP Vertex AI. To parse hyperparams and other arguments I use Click for Python, a very simple CLI library. Here's the setup for my task function:

import logging

import click


@click.command()
@click.argument("input_path", type=str)
@click.option("--output-path", type=str, envvar='AIP_MODEL_DIR')
@click.option('--gcloud', is_flag=True, help='Run as if in Google Cloud Vertex AI Pipeline')
@click.option('--grid', is_flag=True, help='Perform a grid search instead of a single run. Ignored with --gcloud')
@click.option("--max_depth", type=int, default=4, help='Max depth of decision tree', show_default=True)
@click.option("--n_estimators", type=int, default=50, help='Number of AdaBoost boosts', show_default=True)
def click_main(input_path, output_path, gcloud, grid, max_depth, n_estimators):
    train_model(input_path, output_path, gcloud, grid, max_depth, n_estimators)


def train_model(input_path, output_path, gcloud, grid, max_depth, n_estimators):
    print(input_path, output_path, gcloud)
    logger = logging.getLogger(__name__)
    logger.info("training models from processed data")
    ...

When I run it locally as shown below, Click correctly picks up the parameters from both the command line and the environment and proceeds with model training (AIP_MODEL_DIR is set to gs://(BUCKET_NAME)/models):

❯ python3 -m src.models.train_model gs://(BUCKET_NAME)/data/processed --gcloud

gs://(BUCKET_NAME)/data/processed gs://(BUCKET_NAME)/models True

However, when I run this code in a Vertex AI Pipeline, it throws the following error:

FileNotFoundError: b/(BUCKET_NAME)/o/data%2Fprocessed%20%20--gcloud%2Fprocessed_features.csv

As can clearly be seen, Click takes both the argument and the --gcloud option and assigns them together to input_path. The print statement before the error confirms it: the path contains an extra space followed by --gcloud, and the gcloud flag is parsed as False.

gs://(BUCKET_NAME)/data/processed  --gcloud gs://(BUCKET_NAME)/models/1/model/ False

Has anyone here encountered this issue or have any idea how to solve it?


Solution

  • I think this is due to the way arguments and options interact. You are mixing the two, and although the documentation does not state it explicitly, the argument can end up swallowing the options that follow it. If nargs is not set, it defaults to 1, so the argument takes a single value; here that value seems to include everything that follows it as one string, which looks like what is happening.

    nargs – the number of arguments to match. If not 1 the return value is a tuple instead of single value. The default for nargs is 1 (except if the type is a tuple, then it’s the arity of the tuple).

    I think you should put the options first, followed by the argument, as shown on the documentation page. Another approach is to group them under a command, as shown in this link.
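One way to check this locally is Click's test runner, which lets you control exactly which tokens reach the parser. The sketch below is only an illustration: `demo` and the `gs://bucket/...` path are placeholders, not the asker's real command or bucket. It first passes the option before the argument, as suggested above, and then passes the path and `--gcloud` as a single token to see whether that reproduces the output from the question.

```python
import click
from click.testing import CliRunner


@click.command()
@click.argument("input_path", type=str)
@click.option("--gcloud", is_flag=True)
def demo(input_path, gcloud):
    # Echo the parsed values so the two invocations can be compared.
    click.echo(f"input_path={input_path!r} gcloud={gcloud}")


runner = CliRunner()

# Option first, then the argument, each as its own argv token.
split_result = runner.invoke(demo, ["--gcloud", "gs://bucket/data/processed"])

# Path and flag fused into ONE token (hypothetical: e.g. a single string
# in the job's argument list instead of two separate entries).
fused_result = runner.invoke(demo, ["gs://bucket/data/processed  --gcloud"])

print(split_result.output, end="")
# input_path='gs://bucket/data/processed' gcloud=True
print(fused_result.output, end="")
# input_path='gs://bucket/data/processed  --gcloud' gcloud=False
```

If the second call matches what you see in the pipeline logs, it may be worth checking how the arguments are listed in the job definition, since each value would need to be its own entry. That part is an assumption about the Vertex AI setup and is not confirmed by the question.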