Tags: azure, azure-machine-learning-service, azure-sdk-python

Azure Machine Learning Compute Cluster User Assigned Identities


I'm having a hard time understanding how a user-assigned identity works on Compute Clusters.

Today, I have a Compute Instance with a user-assigned identity that connects to other Azure services such as CosmosDB, Databricks and more. The user identity has RBAC roles assigned to it, and also a service principal (SP) created inside Databricks, since it cannot be synced with AAD.

This works correctly. But when I want to launch a job on a compute cluster from my compute instance, I use the Azure SDK v2 to launch it. I tried adding a UserIdentityConfiguration() to the command, but when I check the raw configuration file of the run, I see that the identity is set to null:

"runDefinition": {
        "script": null,
        "command": "python main.py",
        "useAbsolutePath": false,
        "arguments": [],
        "sourceDirectoryDataStore": null,
        "framework": "Python",
        "communicator": "None",
        "target": "cpu-2-16",
        "dataReferences": {},
        "data": {},
        "outputData": {},
        "datacaches": [],
        "jobName": null,
        "maxRunDurationSeconds": null,
        "nodeCount": 1,
        "instanceTypes": [],
        "priority": null,
        "credentialPassthrough": true,
        "identity": null,

I based my code on this example: https://github.com/MicrosoftDocs/azure-docs/blob/211d3450211c26e95c6f40b4e01bd3adf5077774/articles/machine-learning/how-to-use-serverless-compute.md?plain=1#L93

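from azure.ai.ml import MLClient, command
from azure.ai.ml.entities import UserIdentityConfiguration
from azure.identity import DefaultAzureCredential

credential = DefaultAzureCredential()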
# Get a handle to the workspace. You can find the info on the workspace tab on ml.azure.com
ml_client = MLClient(
    credential=credential,
    subscription_id="<Azure subscription id>", 
    resource_group_name="<Azure resource group>",
    workspace_name="<Azure Machine Learning Workspace>",
)
job = command(
    command="echo 'hello world'",
    environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest",
    identity=UserIdentityConfiguration(),
)
# submit the command job
ml_client.create_or_update(job)

The definition of the compute cluster does include the User Assigned Identity.

Also in my compute instance, I have the same User Assigned Identity (AZGRE........)

So how do I make my compute cluster take the identity of my compute instance? Or even take the identity of the user running the code, if they have run az login?


Solution

  • According to the documentation, UserIdentityConfiguration passes through your own Microsoft Entra identity. That is why you get null for the identity in the raw JSON definition, while the YAML definition does show the user identity.

    So you need to use ManagedIdentityConfiguration, which refers to the managed identity assigned when the ML workspace was created.

    Try the following:

    job = command(
        command="echo 'hello world'",
        environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu@latest",
        identity=ManagedIdentityConfiguration(),
    )
    

    Output: [screenshots of the raw JSON and the YAML definition]

    Make sure the identity has the required roles and permissions.

    EDIT

    command_job = command(
        code="./src",
        command="python main.py --iris-csv ${{inputs.iris_csv}} --learning-rate ${{inputs.learning_rate}} --boosting ${{inputs.boosting}}",
        environment="AzureML-lightgbm-3.2-ubuntu18.04-py37-cpu@latest",
        inputs={
            "iris_csv": Input(
                type="uri_file",
                path="https://azuremlexamples.blob.core.windows.net/datasets/iris.csv",
            ),
            "learning_rate": 0.9,
            "boosting": "gbdt",
        },
        compute="cpu-cluster",
        identity=ManagedIdentityConfiguration(client_id="<client_id>"),
    )
    

    As shown in the code above, set the compute option to the name of your cluster that was created with the user-assigned identity, and pass that identity's client ID to ManagedIdentityConfiguration.

    Learn more about command jobs in the Azure Machine Learning documentation.
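
To check which identity setting a submitted job actually carries, you can inspect the raw JSON definition from the question programmatically. The following is a minimal sketch using only the Python standard library. The field names `runDefinition`, `identity` and `credentialPassthrough` are taken from the snippet in the question; the helper name `summarize_identity` is hypothetical, and reading `credentialPassthrough: true` with `identity: null` as "user identity passthrough" is an assumption based on the explanation in the answer, not a documented contract.

```python
import json


def summarize_identity(raw_json: str) -> str:
    """Describe the identity-related fields of a raw run definition (hypothetical helper)."""
    doc = json.loads(raw_json)
    # Accept either the full document or the bare "runDefinition" object.
    run_def = doc.get("runDefinition", doc)
    identity = run_def.get("identity")
    passthrough = run_def.get("credentialPassthrough", False)

    if identity is not None:
        # An explicit identity block is present in the definition.
        return f"explicit identity: {identity}"
    if passthrough:
        # Assumption: this is how a user identity passthrough job appears.
        return "no explicit identity; credential passthrough enabled"
    return "no identity configured"


# Trimmed-down version of the definition shown in the question.
sample = """
{"runDefinition": {"target": "cpu-2-16",
                   "credentialPassthrough": true,
                   "identity": null}}
"""
print(summarize_identity(sample))
```

Running the same check on the raw JSON of a job submitted with ManagedIdentityConfiguration should make the difference between the two configurations easy to spot.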