Tags: bigdata, databricks, parquet, azure-databricks, delta-lake

Unable to create individual delta table from delta format snappy.parquet files


I have multiple Parquet files in a storage account and have converted all of them into Delta format. Now I need to save the result as an individual Delta table for each file.

df = spark.read.option("mergeSchema", "true") \
   .format("parquet").load("/mnt/testadls/*.parquet")
df.write.format("delta").save("/mnt/testadls/delta")

This DataFrame is written out as multiple snappy.parquet files (the Delta data files).

Now, when I try to create a separate Delta table from each individual snappy.parquet file, it fails with the partition error below:

A partition path fragment should be the form like part1=foo/part2=bar.

%sql

create table deltatable using delta location '/mnt/testadls/delta/part-001-pid-5372710096-b67676465-b62f-45b5-a5c9-51626727-6264-1-c000.snappy.parquet'

delta file name example = part-001-pid-5372710096-b67676465-b62f-45b5-a5c9-51626727-6264-1-c000.snappy.parquet


Solution

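  • A Delta table is not a Parquet file: it is a directory of Parquet data files plus the _delta_log transaction log directory. You therefore cannot point a table LOCATION at a single snappy.parquet file, or read one of those files as a Delta table on its own. Because all of the data was saved with a single write, there is exactly one Delta table, at /mnt/testadls/delta. To get one Delta table per source file, read each source Parquet file separately and write each one to its own Delta directory.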
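
One possible way to get a separate Delta table per source file is sketched below. This is a minimal sketch, not a tested Databricks job: the helper names `delta_target_for` and `write_one_delta_table_per_file` are hypothetical, the example file name `sales.parquet` is made up, and the sketch assumes each source file should land in a subdirectory of `/mnt/testadls/delta` named after the file.

```python
import posixpath


def delta_target_for(parquet_path, delta_root):
    """Map one source Parquet file to its own Delta table directory (hypothetical helper)."""
    name = posixpath.basename(parquet_path)
    # Drop the ".parquet" suffix so the directory is named after the file.
    stem = name[:-len(".parquet")] if name.endswith(".parquet") else name
    return posixpath.join(delta_root, stem)


def write_one_delta_table_per_file(spark, parquet_paths, delta_root):
    """Read each Parquet file on its own and save it as a separate Delta table.

    `spark` is an existing SparkSession (predefined in Databricks notebooks).
    Returns a dict mapping each source path to the Delta directory written.
    """
    targets = {}
    for path in parquet_paths:
        target = delta_target_for(path, delta_root)
        # Each save() creates its own directory containing data files and _delta_log.
        spark.read.format("parquet").load(path).write.format("delta").save(target)
        targets[path] = target
    return targets


# Example usage in a Databricks notebook (assumes `spark` and `dbutils` exist):
# files = [f.path for f in dbutils.fs.ls("/mnt/testadls") if f.name.endswith(".parquet")]
# write_one_delta_table_per_file(spark, files, "/mnt/testadls/delta")
```

Each resulting directory can then be registered with `create table ... using delta location '<directory>'`, where the location is the Delta directory itself and never an individual part file inside it.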