
Can a Mapping Data Flow use a parameterized Parquet dataset?


Thanks for coming in.

I am trying to develop a Mapping Data Flow in an Azure Synapse workspace (so I believe this also applies to ADFv2) that takes a Delta input and transforms it straight into a Parquet-formatted output. The relevant detail is that the Parquet dataset points to ADLS Gen2 with a parameterized file system and folder rather than hard-coded ones. Hard-coding would mean creating too many datasets, because there are too many folders of interest in the Data Lake.


The Mapping Data Flow: (screenshot not included)

As I try to use it as a Source in my Mapping Data Flows, the debug configuration (as well as the parent pipeline configuration) will duly ask for my input on those parameters, which I am happy to enter.

Then, as soon as I try to debug or run the pipeline, I get this error in less than 1 second:

{
"Message": "ErrorCode=InvalidTemplate, ErrorMessage=The expression 'body('DataFlowDebugExpressionResolver')?.50_DeltaToParquet_xxxxxxxxx?.ParquetCurrent.directory' is not valid: the string character '_' at position '43' is not expected."
}

RunId: xxx-xxxxxx-xxxxxx

This error message is not specific enough to tell me where I should look.

I tried replacing the parameterized Parquet dataset with a hard-coded one, and it works perfectly in both debug and pipeline-run modes. However, this does not get me what I need, which is the ability to reuse my Parquet dataset instead of having to create a specific dataset for each Data Lake folder.

There are also no spaces in the Data Lake file system. Please refer to these parameters that look a lot like my production environment:

  • File System: prodfs001
  • Directory: synapse/workspace01/parquet/dim_mydim

Thanks in advance to all of you, folks!


Solution

  • The directory name synapse/workspace01/parquet/dim_mydim contains an underscore in dim_mydim. Can you try replacing the underscore, for example by using dimmydim, to test whether it works?
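
To narrow down which underscore the error is complaining about, it may help to look up the character at the reported position in the expression itself. The sketch below is plain Python, not an Azure API. The helper name `locate` is made up for illustration, and it assumes the position in the error message is zero-based. The expression string is copied from the error in the question.

```python
import re

# Expression copied from the error message in the question.
EXPRESSION = (
    "body('DataFlowDebugExpressionResolver')"
    "?.50_DeltaToParquet_xxxxxxxxx?.ParquetCurrent.directory"
)


def locate(expression: str, position: int) -> tuple[str, str]:
    """Return the character at `position` and the path segment containing it.

    Assumption: `position` is zero-based. Segments are the runs of
    characters between the '.' and '?.' accessors.
    """
    char = expression[position]
    for match in re.finditer(r"[^.?]+", expression):
        if match.start() <= position < match.end():
            return char, match.group()
    return char, ""  # position falls on a '.' or '?' separator


char, segment = locate(EXPRESSION, 43)
print(repr(char), "inside segment", repr(segment))
```

Under that zero-based assumption, position 43 falls on the underscore right after `50`, in the segment that carries the data flow name, not in the directory value (which does not appear in the expression at all). If that reading is right, it may be worth also testing a data flow name that does not start with digits, for example a hypothetical `DeltaToParquet_50`, alongside the directory rename suggested above. I have not confirmed this against the service.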