Search code examples
apache-sparkpysparkhivedatabricksaws-databricks

Error in performing 'show create table' OpenCSV Serde Hive Table in Databricks


I have created a table with OpenCSV Serde in Databricks below DDL :-

    CREATE TABLE db_name.table_name( col1 String,
col2 String)
    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
    WITH SERDEPROPERTIES (
    "escapeChar"="\"",
    "quoteChar"="\"",
    "separatorChar"=",")
    LOCATION
    's3://s3_loc'
    TBLPROPERTIES('serialization.null.format'=''); 

Table is created without any issue but while trying to run below command :-

Show create table db_name.table_name

I'm getting below error message :-

AnalysisException: Failed to execute SHOW CREATE TABLE against table `db_name`.`table_name`, which is created by Hive and uses the following unsupported serde configuration SERDE: org.apache.hadoop.hive.serde2.OpenCSVSerde INPUTFORMAT: org.apache.hadoop.mapred.TextInputFormat OUTPUTFORMAT: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat

Please suggest solution. Tried on Databricks version 6.4, 7.1 and 8.4.


Solution

  • Use the following instead:

    Show create table db_name.table_name AS SERDE
    

    From https://github.com/apache/spark/blob/master/docs/sql-migration-guide.md:

    In Spark 3.0, SHOW CREATE TABLE table_identifier always returns Spark DDL, even when the given table is a Hive SerDe table. For generating Hive DDL, use SHOW CREATE TABLE table_identifier AS SERDE command instead.