Tags: sql-server, azure, parquet, data-warehouse, azure-synapse

Azure Synapse Data Flows - parquet file names not working


I have created a data flow in Azure Synapse to:

  1. take data from a dedicated SQL pool
  2. perform some transformations
  3. send the resulting output to parquet files

I then create a view over the resulting Parquet file using OPENROWSET, so that Power BI can query the data through the built-in serverless SQL pool.

My issue is that, whatever file name I enter on the integration record, the Parquet files always end up with names like part-00000-2a6168ba-6442-46d2-99e4-1f92bdbd7d86-c000.snappy.parquet.

Is there a way to have a fixed file name that is overwritten each time the pipeline runs? Alternatively, is there an automated way to update the Parquet file that the view refers to each time the pipeline runs?

I am fairly new to this kind of integration, so if there is a better way to achieve this whole thing, please let me know.



Solution

    • I reproduced the issue and got the same kind of auto-generated part-file name.

    To get a fixed name for the sink file:

    • Set the Sink settings as follows:
    File name option: Output to single file
    Output to single file: tgtfile (enter the file name you want)
    


    • On the Optimize tab, select Single partition.


    The output file name then matches the name given in the sink settings.
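
Once the file name is fixed, the view from the question can point at one stable path instead of a changing part-file name. Below is a minimal sketch in Python that only builds the T-SQL text for such a view. The helper `build_view_sql`, the view name `dbo.vw_output`, and the storage URL are hypothetical placeholders, not taken from the original setup; only the file name `tgtfile` comes from the sink settings above.

```python
def build_view_sql(view_name: str, file_url: str) -> str:
    """Return T-SQL that creates a view over one Parquet file via OPENROWSET."""
    # Double any single quotes so the URL stays a valid T-SQL string literal.
    safe_url = file_url.replace("'", "''")
    return (
        f"CREATE VIEW {view_name} AS\n"
        "SELECT *\n"
        "FROM OPENROWSET(\n"
        f"    BULK '{safe_url}',\n"
        "    FORMAT = 'PARQUET'\n"
        ") AS [result];"
    )


# Hypothetical storage account, container, and folder; replace with your own.
FILE_URL = "https://mystorageaccount.dfs.core.windows.net/mycontainer/output/tgtfile"

if __name__ == "__main__":
    # Print the statement so it can be run against the serverless SQL pool.
    print(build_view_sql("dbo.vw_output", FILE_URL))
```

Because the data flow writes to the same file name on every run, a view created this way should not need to be redefined after each pipeline run.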