Azure ML Studio not showing datasets under models


I registered a model in an Azure ML notebook along with its datasets. In ML Studio I can see the model listed under the dataset, but no dataset gets listed under the model. What should I do to have datasets listed under models?

  • Model listed under dataset:

    [screenshot: model dataset]

  • Dataset not listed under the model:

    [screenshot: dataset model]

  • Notebook code:
import pickle
import sys
from azureml.core import Workspace, Dataset, Model
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.utils import assert_all_finite

workspace = Workspace('<snip>', '<snip>', '<snip>')
dataset = Dataset.get_by_name(workspace, name='creditcard')
data = dataset.to_pandas_dataframe()
data.dropna(inplace=True)
X = data.drop(labels=["Class"], axis=1, inplace=False)
y = data["Class"]

model = make_pipeline(StandardScaler(), GradientBoostingClassifier())
model.fit(X, y)

with open('creditfraud_sklearn_model.pkl', 'wb') as outfile:
    pickle.dump(model, outfile)

# Keep the returned Model object so it can be referenced after registration
model_registration = Model.register(
    workspace = workspace,
    model_name = 'creditfraud_sklearn_model',
    model_path = 'creditfraud_sklearn_model.pkl',
    description = 'Gradient Boosting classifier for Kaggle credit-card fraud',
    model_framework = Model.Framework.SCIKITLEARN,
    model_framework_version = sys.modules['sklearn'].__version__,
    sample_input_dataset = dataset,
    sample_output_dataset = dataset)

Solution

  • It looks like add_dataset_references() needs to be called on the Model object that Model.register() returns (model_registration below) to have datasets displayed under the model:

    model_registration.add_dataset_references([("input dataset", dataset)])
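
Putting the registration and the extra call together, a minimal sketch could look like the following. The helper names dataset_references and register_with_datasets are hypothetical, made up for illustration and not part of the SDK. Only Model.register() and add_dataset_references() come from azureml.core, and the import is deferred so the tuple-building helper can be tried without the SDK installed.

```python
from typing import Any, List, Tuple


def dataset_references(**datasets: Any) -> List[Tuple[str, Any]]:
    """Build the (relationship label, dataset) tuples that
    add_dataset_references() expects. Hypothetical helper: keyword names
    become labels, with underscores turned into spaces; None values are skipped.
    """
    return [
        (label.replace("_", " "), ds)
        for label, ds in datasets.items()
        if ds is not None
    ]


def register_with_datasets(workspace, dataset, model_path, model_name):
    """Register a model file, then attach the dataset it was trained on.
    Hypothetical wrapper around the two SDK calls from the answer.
    """
    # Imported here so dataset_references() works without azureml-core installed.
    from azureml.core import Model

    model_registration = Model.register(
        workspace=workspace,
        model_path=model_path,
        model_name=model_name,
    )
    # This is the call that makes the dataset appear under the model.
    model_registration.add_dataset_references(
        dataset_references(input_dataset=dataset)
    )
    return model_registration
```

With the notebook above, this would be called as register_with_datasets(workspace, dataset, 'creditfraud_sklearn_model.pkl', 'creditfraud_sklearn_model'). Model.register() also documents a datasets argument that takes the same list of tuples, so passing it at registration time may make the second call unnecessary; that is worth checking against the SDK version in use.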