Search code examples
sqlapache-sparkdatabricksazure-databricks

Query table schema from databricks query window?


I can run simple SQL queries to select data from a table, using a configured instance of azure databricks, used by the developers where I work. I would like to describe the table structure. However databricks returns an error if I try. (below). Is this because databricks queries are limited just to working with data?

Statement: DESCRIBE tablename

Error: ERROR [42000] [Microsoft][ODBC Oracle Wire Protocol driver][Oracle]ORA-00900: invalid SQL statement


Solution

  • DESCRIBE TABLE is working for tables registered in Hive metastore or for Delta tables on Databricks (Spark docs, Databricks docs) - it may work for data sources that are supporting new Catalog API. In many databases the DESCRIBE commands are on tool level, not on the SQL layer - that's why you can get the error.

    Depending on how you're accessing your database, you may just to try to call:

    df = spark.read.jdbc....
    df.printSchema()
    

    but this will be the schema as it's understood by Spark, not actual schema in the database. Another approach could be just use JDBC API to fetch the schema, similar to this answer.