Tags: azure, powershell, azure-databricks

Init script on Databricks


Init scripts stored on DBFS have reached end of life. I used to deploy the Databricks cluster with this PowerShell command, using the InitScripts parameter:

New-DatabricksCluster -BearerToken $ADB_Token -Region $region -ClusterName $cname -SparkVersion $csparkV `
-NodeType $cnodeT -MinNumberOfWorkers $cminWorker -MaxNumberOfWorkers $cmaxWorker -AutoTerminationMinutes $cterm `
-InitScripts "/Init/pyodbc.sh" -UniqueNames -Update

But the official documentation of this PowerShell function states that it must be a DBFS path:

.PARAMETER InitScripts Init scripts to run post creation. As array of strings - paths must be full dbfs paths. Example: "dbfs:/script/script1", "dbfs:/script/script2"

Now I want to migrate from DBFS to a workspace file location. While I can achieve this via the Databricks UI, I was wondering how I can do the same thing with PowerShell code, or even with the Databricks API.


Solution

  • Unfortunately, the module you are using does not provide an option to set a workspace-type init script.

    So, try the DatabricksPS module below.

    1. Install the module:
    Install-Module -Name DatabricksPS
    
    2. Next, set the environment:
    $accessToken = "dapi12345sxsdksancldkcna7c51"
    $apiUrl = "https://westeurope.azuredatabricks.net"
    
    Set-DatabricksEnvironment -AccessToken $accessToken -ApiRootUrl $apiUrl
    
    3. Create the cluster with the commands below:
    $init_scripts = @( @{ "workspace" = @{ "destination" = "/Users/<user_id>/init.sh"; }; } )
    Add-DatabricksCluster -NumWorkers 2 -ClusterName "MyCluster" -SparkVersion "4.0.x-scala2.11" -NodeTypeId 'Standard_DS3_v2'  -InitScripts $init_scripts
    

    Output:

    (screenshot omitted)

    and in the portal:

    (screenshot omitted)
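    If you would rather verify the configuration from PowerShell than from the portal, the same module can read the cluster definition back. Below is a minimal sketch, assuming Get-DatabricksCluster returns the cluster definition as the REST API does and that $clusterId holds the cluster ID returned by Add-DatabricksCluster above:

    # Hypothetical verification step: read the cluster definition back
    # and print the configured init scripts.
    $details = Get-DatabricksCluster -ClusterID $clusterId
    $details.init_scripts | ConvertTo-Json -Depth 5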

    You can read more about this module in the DatabricksPS documentation.
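
    The question also asks about the Databricks API: the Clusters API accepts the same workspace init-script structure directly, so as a rough sketch (the workspace URL, token, and script path are placeholders mirroring the values used above) you could call the REST endpoint yourself instead of going through a module:

    # Sketch: create the cluster through the Clusters REST API (POST /api/2.0/clusters/create).
    # $apiUrl and $accessToken are the same placeholders set earlier.
    $body = @{
        cluster_name  = "MyCluster"
        spark_version = "4.0.x-scala2.11"
        node_type_id  = "Standard_DS3_v2"
        num_workers   = 2
        init_scripts  = @(
            @{ workspace = @{ destination = "/Users/<user_id>/init.sh" } }
        )
    } | ConvertTo-Json -Depth 5

    Invoke-RestMethod -Method Post -Uri "$apiUrl/api/2.0/clusters/create" `
        -Headers @{ Authorization = "Bearer $accessToken" } `
        -Body $body -ContentType "application/json"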