
Create a new Databricks cluster using Python SDK


I am trying to create a new Azure Databricks cluster using the Databricks Python SDK and I run into the following problem:

I need an init script for my cluster, and when I try to set this property in code, I get the following error:

TypeError: 'InitScriptInfo' object is not iterable

For the cluster_log_conf property, the same kind of code works perfectly.

The creation code is below. Do you have any idea how I can set up the init script with the Python SDK?

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.compute import AutoScale, ClusterSource, DataSecurityMode, RuntimeEngine, ClusterLogConf, DbfsStorageInfo, InitScriptInfo, WorkspaceStorageInfo
w = WorkspaceClient(
  host  = workspace_host,
  token = workspace_token
)
c = w.clusters.create_and_wait(
  cluster_name              = 'TEST',
  spark_version             = '12.2.x-scala2.12',
  node_type_id              = 'Standard_D4s_v3',
  autotermination_minutes   = 15,
  num_workers               = 2,
  autoscale                 = AutoScale(min_workers=2, max_workers=4),
  cluster_log_conf          = ClusterLogConf(DbfsStorageInfo('dbfs:/cluster-logs')),
  init_scripts              = InitScriptInfo(WorkspaceStorageInfo('/Shared/filename.sh'))
) 

Thank you


Solution

  • When I replicated the issue with your code in my environment, I got the same error.

    The error 'InitScriptInfo' object is not iterable means the SDK tried to loop over the value passed as init_scripts but received a single InitScriptInfo object. The init_scripts parameter expects a list of InitScriptInfo objects, even when there is only one script. cluster_log_conf, by contrast, takes a single ClusterLogConf object, which is why that line works. I changed the code as follows:

     init_scripts=[InitScriptInfo(WorkspaceStorageInfo('Filepath'))]  
    

    With that change, the code ran without any error and the cluster was created successfully.
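To see why the list matters, here is a minimal sketch that reproduces the same TypeError without touching a workspace. It uses stand-in dataclasses rather than the real SDK types: the two class definitions and the serialize_init_scripts helper are hypothetical, and I am assuming the client loops over init_scripts when it builds the request, which is what the error message suggests.

```python
from dataclasses import asdict, dataclass
from typing import List


# Hypothetical stand-ins for the SDK dataclasses. These are NOT the real
# databricks.sdk types, only minimal look-alikes for illustration.
@dataclass
class WorkspaceStorageInfo:
    destination: str


@dataclass
class InitScriptInfo:
    workspace: WorkspaceStorageInfo


def serialize_init_scripts(init_scripts) -> List[dict]:
    """Mimic a client that loops over init_scripts to build a request body."""
    return [asdict(script) for script in init_scripts]


script = InitScriptInfo(workspace=WorkspaceStorageInfo('/Shared/filename.sh'))

# Passing the bare object fails: a dataclass instance is not iterable.
try:
    serialize_init_scripts(script)
except TypeError as exc:
    error_message = str(exc)

# Wrapping it in a list works, even when there is only one script.
body = serialize_init_scripts([script])

print(error_message)
print(body)
```

The stand-in is built with a keyword argument (workspace=...) instead of a positional one. Doing the same with the real InitScriptInfo seems a safer habit, since it does not depend on the order in which the dataclass fields are declared.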