I have some Databricks secrets for connecting to external databases. However, when I use the secrets in SQL cells or in spark.sql() calls, it results in a ParseError.
I am using Databricks secrets to store AWS RDS credentials (host/port/database/user/password).
If I pull them as secrets and use them in PySpark like this, it works:
# Pull the connection parameters from the secret scope
HOST = dbutils.secrets.get(scope='sample', key='host')
PORT = dbutils.secrets.get(scope='sample', key='port')
DATABASE = dbutils.secrets.get(scope='sample', key='database')
USER = dbutils.secrets.get(scope='sample', key='username')
PASSWORD = dbutils.secrets.get(scope='sample', key='password')

jdbc_url = "jdbc:postgresql://{}/{}".format(HOST, DATABASE)
conn_properties = {'user': USER, 'password': PASSWORD}

df = spark.read.format('jdbc') \
    .option("url", jdbc_url) \
    .option("query", 'SELECT * FROM some_table') \
    .option("user", USER) \
    .option("password", PASSWORD) \
    .load()
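(Side note: PORT and conn_properties are defined above but not actually used. If the database is not on the default port, the port would presumably just be folded into the URL, and the properties dict can be passed to spark.read.jdbc instead of individual options. A minimal sketch of that variant, reusing the same secrets and table name:)

# Sketch: non-default port folded into the JDBC URL, credentials passed via properties
jdbc_url = "jdbc:postgresql://{}:{}/{}".format(HOST, PORT, DATABASE)
conn_properties = {'user': USER, 'password': PASSWORD}
df = spark.read.jdbc(url=jdbc_url, table="some_schema.some_table", properties=conn_properties)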
Now, in a SQL cell, I want to reference those parameters as secrets as well. For example, I have this SQL statement in a SQL cell to connect to a table on RDS:
%sql
CREATE TEMPORARY VIEW test
USING JDBC
OPTIONS (
  url '<hardcoded_jdbc_url_with_visible_values>',
  dbtable 'some_schema.some_table',
  user '<hardcoded_user_with_visible_values>',
  password '<hardcoded_password_in_plain_text>'
)
How can I avoid hardcoding them in the SQL statement and still reference them as the secrets I pulled above (HOST, PORT, DATABASE, USER, PASSWORD)?
I tried this in a Python cell; however, it fails:
HOST=dbutils.secrets.get(scope='sample',key='host')
PORT=dbutils.secrets.get(scope='sample',key='port')
DATABASE=dbutils.secrets.get(scope='sample',key='database')
USER=dbutils.secrets.get(scope='sample',key='username')
PASSWORD=dbutils.secrets.get(scope='sample',key='password')
jdbc_url="jdbc:postgresql://{}/{}".format(HOST,DATABASE)
spark.sql(f'''
CREATE TEMPORARY VIEW test
USING JDBC
OPTIONS (
url {jdbc_url},
dbtable 'some_schema.some_table
user {USER},
password {PASSWORD}
)
'''
)
The error that I get is just a ParseError, with no further message. If I print(jdbc_url), then naturally, because it is built from secrets, this is literally what shows in the console: jdbc:postgresql://[REDACTED]/[REDACTED]
You are missing quotes around your string values (and the closing quote after some_schema.some_table); it should be:
spark.sql(f'''
CREATE TEMPORARY VIEW test
USING JDBC
OPTIONS (
  url '{jdbc_url}',
  dbtable 'some_schema.some_table',
  user '{USER}',
  password '{PASSWORD}'
)
''')
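Once that runs, the temporary view can be queried like any other table, for example from a Python cell:

# Sanity check: query the view created through the JDBC options above
spark.sql("SELECT * FROM test LIMIT 10").show()

Also note that the [REDACTED] you see when printing jdbc_url is only display masking in the notebook output; the real values are still interpolated into the string passed to spark.sql(), so the redaction itself is not what causes the ParseError.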