pyspark, databricks, azure-data-lake

How to read only the files that end with .csv from a data lake on Databricks


I want to read the files on a data lake whose names end with .csv into Databricks. The file names don't follow a defined format, but the underlying data in all the CSVs has the same schema.

I want to be able to read all the CSVs in one go.

Please see the attached image for more details on the folder structure.


Solution

  • What you are looking for is simply pattern matching (a glob pattern in the path) while reading the files.

    You should read the files like this:

    spark.read.format("csv").load("/mnt/some-mount-point/*.csv") 
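
A slightly fuller sketch of the same idea, in case it helps. The helper names `csv_glob` and `read_all_csvs` are made up for illustration, and the `header` option assumes the files start with a header row, which the question does not state. On Databricks the `spark` session is already defined in a notebook, so it is passed in rather than created here.

```python
def csv_glob(folder: str) -> str:
    """Build a glob pattern matching every .csv file directly inside `folder`."""
    # Strip any trailing slash so the result never contains "//".
    return folder.rstrip("/") + "/*.csv"


def read_all_csvs(spark, folder: str, has_header: bool = True):
    """Load all matching CSV files into a single DataFrame.

    Assumes every file shares the same schema, as described in the question.
    `has_header` is an assumption about the data; set it to False if the
    files have no header row.
    """
    return (
        spark.read.format("csv")
        .option("header", str(has_header).lower())  # "true" or "false"
        .load(csv_glob(folder))
    )
```

Usage in a notebook would look like `df = read_all_csvs(spark, "/mnt/some-mount-point")`. The `*.csv` pattern only matches files directly inside that folder; if the CSVs sit in subfolders, the pattern would need an extra level such as `/*/*.csv`.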
    

    Materials: