azure-data-factory, parquet, databricks-sql

Databricks SQL reading Parquet file created by Copy activity in Azure Data Factory


I'm trying to get a Parquet file written by an Azure Data Factory Copy activity "attached" to Databricks SQL. I used this command:

create table people_db.GLAccount USING PARQUET LOCATION "abfss://dlsxxxx.dfs.core.windows.net/datamesh/PricingAnalysis/rdv_60_134.vGLAccount.parquet"

But I'm getting this error:

Abfss://dlsxxxx.core.windows.net/datamesh/PricingAnalysis/rdv_60_134.vGLAccount.parquet has invalid authority.

I'm not a Databricks expert, and I noticed that the ADF Copy activity writes all the data into one file, whereas when I create Parquet files in Databricks I get a directory with several files in it. So this might all have something to do with settings, but I need some guidance on what to change or test...

Kr, Harry


Solution

  • Most probably, datamesh is your container on ADLS? If so, then the URL should be:

    abfss://datamesh@dlsxxxx.dfs.core.windows.net/PricingAnalysis/rdv_60_134.vGLAccount.parquet
    

    In general, ABFSS URLs look like the following:

    abfss://<container>@<storage>.dfs.core.windows.net/<path>
    
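    Putting the pieces together, the original statement would then become something like this (a sketch that assumes datamesh is the container and dlsxxxx is the storage account name, both taken from your question):

    create table people_db.GLAccount
    USING PARQUET
    -- container (datamesh) goes before the @, storage account (dlsxxxx) after it
    LOCATION "abfss://datamesh@dlsxxxx.dfs.core.windows.net/PricingAnalysis/rdv_60_134.vGLAccount.parquet"
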

    P.S. You can find all the details in the documentation for the Hadoop ABFS integration.
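
    P.P.S. On the single-file vs. directory point: LOCATION accepts either a single Parquet file or a folder, so you could also point the table at the directory that the Copy activity writes into, and Spark will read every Parquet file underneath it. A sketch, assuming the same container and storage account, with a hypothetical folder path:

    create table people_db.GLAccount
    USING PARQUET
    -- a folder instead of a single file; all Parquet files inside it are read
    LOCATION "abfss://datamesh@dlsxxxx.dfs.core.windows.net/PricingAnalysis/"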