Tags: apache-spark-sql, azure-databricks

REPLACE TABLE AS SELECT is not working with parquet whereas it works fine for delta


I am working on Azure Databricks with Databricks Runtime 14.3 LTS (includes Apache Spark 3.5.0, Scala 2.12), and I am facing the following issue.

Suppose I have a view named v1 and a database f1_processed created with the following command:

CREATE DATABASE IF NOT EXISTS f1_processed
LOCATION "abfss://processed@formula1dl679student.dfs.core.windows.net/"

Then the following command runs fine:

CREATE OR REPLACE TABLE f1_processed.circuits
AS
SELECT * FROM v1;

However, if I explicitly specify the format, as in the following code:

CREATE OR REPLACE TABLE f1_processed.circuits
USING PARQUET
AS
SELECT * FROM v1;

an error is thrown:

[UNSUPPORTED_FEATURE.TABLE_OPERATION] The feature is not supported: 
Table `spark_catalog`.`f1_processed`.`circuits` does not support REPLACE TABLE AS SELECT. 
Please check the current catalog and namespace to make sure the qualified table name is expected, 
and also check the catalog implementation which is configured by "spark.sql.catalog". SQLSTATE: 0A000

As seen from the first command, REPLACE TABLE AS SELECT appears to be supported, so the error seems wrong. Any help is appreciated.


Solution

  • The CREATE OR REPLACE command is only supported for Delta tables. This is specified in the Databricks documentation here, which says the following:

    REPLACE

    If specified replaces the table and its content if it already exists. This clause is only supported for Delta Lake tables.

    The error message might not state this directly, but it is subtly hinting at the Parquet format of the table. The statement Table 'spark_catalog'.'f1_processed'.'circuits' does not support REPLACE TABLE AS SELECT. means that because the table is created in Parquet format, the REPLACE ... command cannot be used on it. The first command succeeded because no USING clause was given, so the table defaulted to Delta Lake, which is the default table format on Databricks.
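
    If the table really must be stored as Parquet, a sketch of a workaround (untested against this exact workspace, and using the same table and view names from the question) is to drop the table first and then use a plain CTAS:

    ```sql
    -- REPLACE is unavailable for non-Delta tables, but DROP + CREATE
    -- reaches the same end state. Note this is not atomic: readers may
    -- briefly see no table, unlike CREATE OR REPLACE on a Delta table.
    DROP TABLE IF EXISTS f1_processed.circuits;

    CREATE TABLE f1_processed.circuits
    USING PARQUET
    AS
    SELECT * FROM v1;
    ```

    Alternatively, simply omit USING PARQUET (or write USING DELTA explicitly) so the table is Delta and CREATE OR REPLACE keeps working atomically.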