apache-spark pyspark apache-spark-sql databricks parquet

Databricks: Incompatible format detected (temp view)

I am trying to create a temp view from a number of Parquet files, but so far it does not work. As a first step, I am trying to create a DataFrame by reading the Parquet files from a path. I want to load all of them into the DataFrame, but so far I can't even load a single one, as you can see in the screenshot below. Can anyone help me out here? Thanks.

Info: batch_source_path is the string in column "path", row 1.

Solution

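Your data is stored in Delta format, not plain Parquet. Databricks typically raises "Incompatible format detected" when the path contains a _delta_log directory but is read with the Parquet reader. Read it with the Delta reader instead: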

data = spark.read.load('your_path_here', format='delta')
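To get from there to the temp view the question asks for, something like the following sketch may help. It assumes the table directory is reachable as a local filesystem path (on Databricks, dbfs:/ locations are usually visible under /dbfs/), and the names detect_table_format, register_temp_view, local_path and the view name are made up for illustration; they are not part of Spark or Databricks.

```python
import os
from typing import Optional


def detect_table_format(local_path: str) -> str:
    """Return 'delta' if the directory holds a Delta transaction log, else 'parquet'."""
    # A Delta table keeps its transaction log in a _delta_log subdirectory.
    has_log = os.path.isdir(os.path.join(local_path, "_delta_log"))
    return "delta" if has_log else "parquet"


def register_temp_view(spark, path: str, view_name: str,
                       local_path: Optional[str] = None):
    """Load `path` with the matching reader and expose it as a temp view.

    `spark` is an existing SparkSession. `local_path` is a hypothetical
    parameter: the local-filesystem form of `path`, used only for the
    format check (falls back to `path` when not given).
    """
    fmt = detect_table_format(local_path or path)
    df = spark.read.format(fmt).load(path)
    df.createOrReplaceTempView(view_name)
    return df
```

With the path from the question, usage could look like register_temp_view(spark, batch_source_path, "batch_source"), after which spark.sql("SELECT * FROM batch_source") queries the view. If you already know the data is Delta, skipping the check and calling createOrReplaceTempView on the DataFrame from the line above is enough.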