Search code examples
pysparkdatabricksamazon-rds

Databricks use sql jdbc parameters from secrets results in ParseError


I have some databricks secrets for connecting to external databases. However, I am looking to use the secrets in sql cells, or in spark.sql() functions, and it results in ParseError

I am trying to use my databricks secrets to store a AWS RDS credentials (host/port/database/user/password)

If I pull them as secrets and use them on pyspark like this, it works:

HOST=dbutils.secrets.get(scope='sample',key='host')
PORT=dbutils.secrets.get(scope='sample',key='port')
DATABASE=dbutils.secrets.get(scope='sample',key='database')
USER=dbutils.secrets.get(scope='sample',key='username')
PASSWORD=dbutils.secrets.get(scope='sample',key='password')

jdbc_url="jdbc:postgresql://{}/{}".format(HOST,DATABASE)
conn_properties={'user':USER,'password':PASSWORD}

df=spark.read.format('jdbc')\
    .option("url", jdbc_url) \
    .option("query", 'SELECT * FROM some_table') \
    .option("user", USER) \
    .option("password", PASSWORD)\
    .load()

Now, on a sql cell I want to reference those parameters as secrets as well. For example, I have this sql statement on a sql cell to connect to a table on RDS

%sql 
CREATE TEMPORARY VIEW test
USING JDBC
OPTIONS (
  url '<hardcoded_jdb_url_with_visible_values>',
  dbtable 'some_schema.some_table',
  user '<hardcoded_user_visible_values>',
  password '<hardcoded_visible_password_plain_text>'
)

How can I avoid hardcoding them on a sql statement and still reference them as the secrets I pulled above like HOST PORT DATABASE USER PASSWORD

I tried this on python cell, however, it fails:

HOST=dbutils.secrets.get(scope='sample',key='host')
PORT=dbutils.secrets.get(scope='sample',key='port')
DATABASE=dbutils.secrets.get(scope='sample',key='database')
USER=dbutils.secrets.get(scope='sample',key='username')
PASSWORD=dbutils.secrets.get(scope='sample',key='password')

jdbc_url="jdbc:postgresql://{}/{}".format(HOST,DATABASE)

spark.sql(f'''
CREATE TEMPORARY VIEW test
USING JDBC
OPTIONS (
  url {jdbc_url},
  dbtable 'some_schema.some_table
  user {USER},
  password {PASSWORD}
)
'''
)

The error that I get is just ParseError, with no more message on it

if I print(jdbc_url), naturally, because it's a secret, this is literally what shows in console jdbc:postgresql://[REDACTED]/[REDACTED]


Solution

  • You are missing quotes around your variables, should be:

    spark.sql(f'''
    CREATE TEMPORARY VIEW test
    USING JDBC
    OPTIONS (
      url '{jdbc_url}',
      dbtable 'some_schema.some_table',
      user '{USER}',
      password '{PASSWORD}'
    )
    ''')