Tags: apache-spark, odbc, databricks, azure-databricks, databricks-unity-catalog

Give Databricks Unity Catalog enabled cluster user root privileges


I'm migrating the current Hive metastore tables in my Azure Databricks workspace to Unity Catalog (UC), and I encountered an issue related to the privileges of the new cluster user.

My cluster settings are something like:

  • DBR 13.3 LTS
  • Mode: Shared (UC enabled)
  • Cluster config has an init_script which runs apt-get and other commands in order to install the ODBC driver

I noticed that if I run %sh whoami on the non-UC cluster, I get root as the response, which was great because the init_script could run apt-get and the other commands that install the ODBC driver. When I run the same %sh whoami on the UC-enabled cluster, I get something like spark-XXXXXXXX-XXXX-XXXX-XXXX-af.

Then I run %sh id spark-XXXXXXXX-XXXX-XXXX-XXXX-af and I get something like:

uid=1119(spark-XXXXXXXX-XXXX-XXXX-XXXX-af) gid=1119(spark-XXXXXXXX-XXXX-XXXX-XXXX-af) groups=1119(spark-XXXXXXXX-XXXX-XXXX-XXXX-af)

With the non-UC enabled cluster I would get something like:

uid=0(root) gid=0(root) groups=0(root)

I tried running commands like:

%sh usermod -u 0 spark-XXXXXXXX-XXXX-XXXX-XXXX-af

or

%sh visudo spark-XXXXXXXX-XXXX-XXXX-XXXX-af ALL=(ALL) NOPASSWD: ALL

I wanted to grant the user spark-XXXXXXXX-XXXX-XXXX-XXXX-af the same privileges that root had on the non-UC cluster, but I was unsuccessful.

Has anyone faced this issue and found a solution, either by granting root privileges to the UC-enabled user or by some other way of installing ODBC on a UC-enabled cluster?


Solution

  • When you run the %sh whoami command on a Shared UC cluster, it's executed in an isolated environment that is necessary to protect users from each other, so commands like visudo won't help.

    But the init script is still executed as root. You can check this by creating a simple init script with content like this:

    #!/bin/bash
    
    whoami
    

    If you enable cluster logs, the file that captures the script's standard output should show root.
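
The check above can be extended into a slightly fuller init script. This is only a minimal sketch: the LOG_FILE path, the log_identity helper, and the unixodbc package name are illustrative assumptions, not something taken from the question or from Databricks documentation. It records which user runs the script and only reaches the installation step when that user is root.

```shell
#!/bin/bash
# Sketch of an init script: record which user runs it, then install
# packages only when running as root.
set -euo pipefail

# Assumed log location (hypothetical); any path the script can write to works.
LOG_FILE="${LOG_FILE:-/tmp/init-script-identity.log}"

# Hypothetical helper: write the effective user and its uid/gid
# to stdout and append the same line to the log file.
log_identity() {
  echo "user=$(whoami) $(id)" | tee -a "$LOG_FILE"
}

log_identity

if [ "$(id -u)" -eq 0 ]; then
  echo "running as root: package installation can go here" | tee -a "$LOG_FILE"
  # Assumption: unixodbc is only a placeholder package name. Replace it with
  # whatever your ODBC driver needs, then uncomment the line below.
  # apt-get update && apt-get install -y unixodbc
else
  echo "not root: apt-get would fail here" | tee -a "$LOG_FILE"
fi
```

Attached as a cluster init script, this should log uid=0(root), in contrast to the spark-XXXXXXXX-XXXX-XXXX-XXXX-af user that %sh reports in a notebook. The installation line is left commented out so the script is safe to try before adding the real driver commands.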