It would appear Databricks now requires you to include the 'Catalog Name' when creating a Delta table (well, at least in the way that I want to create a Delta table).
Without going through the whole code, the following will create a database and Delta table called 'baseapm.securelogin' in traditional Databricks with the hive_metastore:

stageName = 'base'
regName = 'apm.securelogin'
deltadf = DeltaTable.forName(spark, f"{stageName}{regName}")

Combining those two variables will output: baseapm.securelogin
Databricks appears to have made some changes, and it seems you now have to create Delta tables inside a catalog, in which case I need to modify the code above to include the 'Catalog Name'.
My Catalog Name is called liveend2end.
Therefore, I need to modify my code above so that it outputs the following: liveend2end.baseapm.securelogin
I tried the following.
catName = 'liveend2end'
deltadf = DeltaTable.forName(spark,f"{catName}{stageName}{regName}")
But all I got was liveend2endbaseapm.securelogin
However, the output I need is: liveend2end.baseapm.securelogin
Could someone please show me how to modify the code so that I get liveend2end.baseapm.securelogin?
I also tried the following to add a period between catName and stageName:
deltadf = DeltaTable.forName(spark,f"{catName}'.'{stageName}{regName}")
But I got the error:
ParseException: [PARSE_SYNTAX_ERROR] Syntax error at or near ''.''.
I have been shown how to add a period in the past, and I remember that it was quite simple ... but I just can't remember how to place the period at the moment.
One tip is to remember that catalogs, schemas, and tables should never have periods in their names; the period is the delimiter that informs the Spark catalog how to parse the fully qualified table name: <catalog>.<schema>.<table>. Naming your variables in a way that aligns 1-to-1 with each of those 3 layers may help reduce confusion (a quick sketch follows the list below).
In the OP's question, these can be correctly identified as:

catalog: liveend2end
schema: baseapm
table: securelogin
So the desired fully qualified table name, or expected output, should be liveend2end.baseapm.securelogin.
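As a sketch of that 1-to-1 naming (the variable names here are my own suggestion, not from the OP's code):

catalog = 'liveend2end'
schema = 'baseapm'
table = 'securelogin'

# The periods are delimiters between the three layers, never part of a name
fully_qualified = f"{catalog}.{schema}.{table}"
print(fully_qualified)  # liveend2end.baseapm.securelogin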
Now... solutions:

Option 1: Using the OP's existing approach, which was only slightly off.
from delta.tables import DeltaTable

catName = 'liveend2end'
stageName = 'base'
regName = 'apm.securelogin'

# Note the period after {catName}: it separates the catalog from the schema
table = f"{catName}.{stageName}{regName}"
deltadf = DeltaTable.forName(spark, table)
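As a side note, DeltaTable.forName will raise an error if the table doesn't exist yet; if that is a possibility, a guard along these lines may help (spark.catalog.tableExists assumes Spark 3.3+):

from delta.tables import DeltaTable

# Only load the Delta table if it is already registered in the metastore
if spark.catalog.tableExists(table):
    deltadf = DeltaTable.forName(spark, table)
else:
    print(f"{table} does not exist yet")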
Option 2: USE CATALOG
Similar, but this sets the Spark Session to use the specified catalog for any subsequent queries. This may be simpler if you want to minimize code changes.
catName = 'liveend2end'
stageName = 'base'
regName = 'apm.securelogin'
# Use Catalog
spark.sql(f"USE CATALOG {catName}")
# Remaining code is not changed:
table = f"{stageName}{regName}"
deltadf = DeltaTable.forName(spark, table)
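To confirm the session picked up the catalog, a quick sanity check (current_catalog() is available in recent Spark/Databricks runtimes):

# Should print: liveend2end
print(spark.sql("SELECT current_catalog()").first()[0])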
Option 3: Similar to option #2, but this requires zero code changes; instead, it sets the default catalog using a Spark configuration that you would add to your cluster.
spark.databricks.sql.initial.catalog.name=liveend2end
# No code changes
stageName = 'base'
regName = 'apm.securelogin'
table = f"{stageName}{regName}"
deltadf = DeltaTable.forName(spark, table)
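Because this configuration is applied when the cluster starts, it takes effect for every notebook attached to that cluster with no code changes; USE CATALOG from option #2 should still be able to override it for an individual session if needed.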
Note: while it's highly recommended to upgrade to catalogs in Unity Catalog, hive_metastore is still a valid type of catalog. All the above options can be used with
catName = 'hive_metastore'
as well.
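For example, a minimal sketch reusing the OP's schema and table names against the legacy metastore:

from delta.tables import DeltaTable

# Same three-level pattern, just targeting the legacy Hive metastore catalog
deltadf = DeltaTable.forName(spark, 'hive_metastore.baseapm.securelogin')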