So I am trying to build a weekly data import to Unity Catalog in Databricks, using Python.
There is no problem overwriting the table in case it already exists:
%sql
USE CATALOG some_catalog

dfTarget  # some pandas DataFrame
df_sparkTarget = spark.createDataFrame(dfTarget)
df_sparkTarget.write.format("delta").mode("overwrite").saveAsTable(database + "." + table)
But before I overwrite anything, I would like to check whether this table exists:
if spark.catalog.tableExists(database + "." + table):
    print("Table exists")
else:
    print("Table does not exist")
This returns the following error:
py4j.security.Py4JSecurityException: Method public boolean org.apache.spark.sql.internal.CatalogImpl.tableExists(java.lang.String) is not whitelisted on class class org.apache.spark.sql.internal.CatalogImpl
Where can I whitelist this method? Ideally just for this one cluster, not for all of them.
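One alternative sketch (assuming this runs in a Databricks notebook where spark is predefined): skip the catalog API entirely and wrap a plain SQL DESCRIBE TABLE in a try/except, since spark.sql itself is usually not blocked by the Py4J allowlist. The three-level table name below is just a placeholder.

from pyspark.sql.utils import AnalysisException

def table_exists_describe(full_table_name: str) -> bool:
    # DESCRIBE TABLE raises AnalysisException when the table is missing
    try:
        spark.sql(f"DESCRIBE TABLE {full_table_name}")
        return True
    except AnalysisException:
        return False

table_exists_describe("some_catalog.some_schema.some_table")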
Yes, I have found much the same. For now, until there is a documented way to do this, I am using the snippet below. The spark.catalog functionality only seems to work for the hive_metastore.
def schema_exists(catalog: str, schema_name: str):
    # information_schema is queryable even when the spark.catalog API is blocked
    query = spark.sql(f"""
        SELECT 1
        FROM {catalog}.information_schema.schemata
        WHERE schema_name = '{schema_name}'
        LIMIT 1""")
    return query.count() > 0

def table_exists(catalog: str, schema: str, table_name: str):
    query = spark.sql(f"""
        SELECT 1
        FROM {catalog}.information_schema.tables
        WHERE table_name = '{table_name}'
        AND table_schema = '{schema}'
        LIMIT 1""")
    return query.count() > 0
schema_exists("catalog_name","schema_name")
table_exists("catalog_name","schema_name", "table_name")
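Wired back into the weekly import, the check could look roughly like the sketch below; it assumes the same dfTarget, database and table variables as above, and uses "some_catalog" as a placeholder matching the USE CATALOG statement from the question.

# Check first, then write; mode("overwrite") creates the table if it is missing
if table_exists("some_catalog", database, table):
    print("Table exists, it will be overwritten")
else:
    print("Table does not exist, it will be created")

df_sparkTarget = spark.createDataFrame(dfTarget)
df_sparkTarget.write.format("delta").mode("overwrite").saveAsTable(database + "." + table)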