In Azure ML studio, we can build components to perform different machine learning tasks. I am creating a component that has one input, an image folder (URI), and two output folders (URIs). The component reads images from the input folder, transforms them using PyTorch, and tries to save them to the output folders. I get the following error when I run the command component defined in a YAML file.
Execution failed. User process 'python' exited with status code 1. Please check log file 'user_logs/std_log.txt' for error details.
Error:
Traceback (most recent call last):
  File "prep_1.py", line 177, in <module>
    main(args)
  File "prep_1.py", line 169, in main
    prepare_data_component(args.input_data, args.training_data, args.val_data)
  File "prep_1.py", line 114, in prepare_data_component
    image.save(save_path)
  File "/azureml-envs/azureml_7e9e1abac3aeb5e2560b92cd769d118a/lib/python3.7/site-packages/PIL/Image.py", line 2428, in save
    fp = builtins.open(filename, "w+b")
OSError: [Errno 30] Read-only file system: '/mnt/azureml/cr/j/50e4xxxxxxxx25a02xxxxxx/cap/data-capability/wd/INPUT_input_data/train/chickens/trial.jpg'
How can I write/save images to an output URI from a .py file that is executed as a command from a YAML file?
I am creating Azure ML components from a YAML file. Here is what a simple YAML file for creating a component looks like:
$schema: https://azuremlschemas.azureedge.net/latest/commandComponent.schema.json
type: command
name: prep_image_classification_pytorch
display_name: Data Preparation for Image Classification Pytorch
inputs:
  input_data:
    type: uri_folder
outputs:
  training_data:
    type: uri_folder
  val_data:
    type: uri_folder
code: ./
command: python prep.py --input_data ${{inputs.input_data}} --training_data ${{outputs.training_data}} --val_data ${{outputs.val_data}}
environment:
  conda_file: ./conda.yaml
  image: mcr.microsoft.com/azureml/openmpi3.1.2-ubuntu18.04
The prep.py file receives the values from the YAML file's command line as command-line arguments, which we parse with the argparse library:
import argparse

def parse_args():
    # setup argparse
    parser = argparse.ArgumentParser()
    # add arguments
    parser.add_argument("--input_data", type=str, help="path of input data")
    parser.add_argument("--training_data", type=str, default="./", help="output path of train data")
    parser.add_argument("--val_data", type=str, default="./", help="output path of validation data")
    # parse args
    args = parser.parse_args()
    # return args
    return args
def prepare_data_component(
    input_data: Input(type="uri_folder"),
    training_data: Output(type="uri_folder"),
    val_data: Output(type="uri_folder"),
):
    # Transform the images from input_data and save them under training_data / val_data.
    ...
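As shown in the code above, use "./" as the default for the output URI folders. Azure ML creates these folders in its blob storage. Specifying a path of my own choice inside Azure Blob Storage did not work for me. Note that the traceback shows the script tried to save into the mounted input folder (INPUT_input_data), which is read-only; the images have to be written under the paths passed as --training_data and --val_data instead.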
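To illustrate writing to the output folders rather than back into the input mount, here is a minimal sketch. The helper names mirrored_output_path and collect_image_paths are hypothetical (they are not part of Azure ML or of my prep.py), and the extension list is an assumption. The idea is to keep the relative layout (e.g. train/chickens/trial.jpg) but rebuild it under the output path that the component receives.

```python
from pathlib import Path


def collect_image_paths(input_root, extensions=(".jpg", ".jpeg", ".png")):
    """Return all image files below input_root, sorted for reproducibility."""
    return sorted(
        p
        for p in Path(input_root).rglob("*")
        if p.is_file() and p.suffix.lower() in extensions
    )


def mirrored_output_path(src_path, input_root, output_root):
    """Map a file under input_root to the same relative location under output_root."""
    relative = Path(src_path).relative_to(input_root)
    dest = Path(output_root) / relative
    # Create the sub-folders on demand in case they do not exist yet.
    dest.parent.mkdir(parents=True, exist_ok=True)
    return dest


# Hypothetical usage inside prepare_data_component (assumes PIL's Image and
# some transform() of your own):
#
# for src in collect_image_paths(input_data):
#     image = transform(Image.open(src))
#     image.save(mirrored_output_path(src, input_data, training_data))
```

The important part is only that save_path is built from training_data or val_data and never from input_data. How the images are split between the two outputs is up to your own logic.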