Given that I have a Delta table in Azure storage:
wasbs://<container>@<storage-account>.blob.core.windows.net/mydata
This path is accessible from my Databricks environment. I now wish to make the data available as a global table, automatically available to all clusters and visible in the "Data" section.
I could easily do this by copying the data:
spark.read \
    .format("delta") \
    .load("wasbs://<container>@<storage-account>.blob.core.windows.net/mydata") \
    .write \
    .saveAsTable("my_new_table")
But this is expensive, and I would need to rerun it periodically to pick up new data (Structured Streaming would help with that). Is it possible to register the source as a global table directly, without having to copy all the files?
You can use a CREATE TABLE ... USING statement in a Databricks notebook cell:
%sql
CREATE TABLE IF NOT EXISTS default.my_new_table
USING DELTA
LOCATION 'wasbs://<container>@<storage-account>.blob.core.windows.net/mydata'
The table my_new_table should then appear under the default database in the Databricks Data tab. Because the table is created with an explicit LOCATION, it is an external (unmanaged) table: the metastore only records a pointer to the existing Delta files, so no data is copied.