azure, azure-data-factory

How to handle a big CSV file as output in Azure Data Flow


I have a data flow that outputs a 10 GB CSV file to blob storage. I set "Single partition" in the sink, which is why it takes a long time to succeed and sometimes fails. When I use "Default partitioning" it works fine, but my business requirement is that the output has a file name, which can't be achieved through the default option or any of the others. There is also no column to apply key partitioning on. So, can we partition the output with custom file names like part1.csv, part2.csv, ...? Please help.


Solution

  • You can use Pattern in the File name option to achieve your requirement.

    First, go to Optimize -> Set partitioning and select the type of partitioning and the number of partitions you want (this will be the number of output files).
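
    For reference, this choice is reflected in the data flow script (the Script button at the top right of the data flow canvas) as a partitionBy property on the sink. Here is a minimal sketch assuming round-robin partitioning into 5 parts; source1 and sink1 are placeholder stream names, and the partition type and count are example values, not a recommendation:

        source1 sink(allowSchemaDrift: true,
            validateSchema: false,
            partitionBy('roundRobin', 5)) ~> sink1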


    Now, in the sink settings, select Pattern as the File name option and enter the expression below:

        concat('part', '[n].csv')

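    Here, [n] is a placeholder that the service replaces with the partition number when the files are written out. Combined with the partitioning above, the sink portion of the script ends up looking roughly like the sketch below; it assumes the pattern is stored in a sink property named filePattern, so check your own flow's Script view for the exact property name:

        source1 sink(allowSchemaDrift: true,
            validateSchema: false,
            filePattern: (concat('part', '[n].csv')),
            partitionBy('roundRobin', 5)) ~> sink1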

    This will generate the files part1.csv, part2.csv, and so on, up to the number of partitions set in the Optimize tab.

    Result: the target blob folder now contains one file per partition, named part1.csv, part2.csv, and so on.