Search code examples
azureparquetazure-synapse

Passing "null" value into file name parameter in Azure Synapse generates a file name based on fetched tables schema.table


I have created a pipeline in Azure Synapse consisting of two steps:

  1. A Lookup activity that reads a .csv file.
  2. A Foreach activity that iterates through each value/row in the .csv file and retrieves data from a database.

In the CSV file, the first row serves as a delimiter and has the following structure: dbName, schema, src_table, dest_table

  1. dbName is the name of the database from which we retrieve data.
  2. schema is the schema of the table.
  3. src_table is the name of the table from which we want to retrieve data.
  4. dest_table is the name of the file in Blob Storage (containers) where we will store the data.

When running the Lookup activity (only reading the file content), I get the following output (where we see that we have a null value in dest_table):

Lookup Output

This value is then sent as a parameter to the sink dataset used (ADLS Parquet). See below:

Copy Data activity configuration

This Integration Dataset (Sink Dataset) is configured as follows:

Integration Dataset configuration

So, the conditions are as follows:

  1. We retrieve dbName, schema, src_table (for which we have data). We attempt to retrieve a value for dest_table, but it is empty, resulting in the value "null," for example.
  2. We fetch data from the given database/schema/table and aim to save this data in a file placed in our Azure Blob Storage.
  3. To structure where data is placed, we want the filename to be the same as "schema.table" specified in the CSV file.
  4. We send null to our Integration Dataset (parameter fileName).
  5. The filename created is still "schema.table."

How is it that when we don't provide a value for the filename, it defaults to "schema.table"? See the picture below to see the final result of the data transport and parquet file creation.

Final result of fetched data


Solution

  • As per the details you provided it is creating the file with the SchemaName.TableName when the value for filename in dataset is null.

    AFAIK It is default beg=haviour of ADF when you store the data from any database to Blob storage and don't set any name for destination file it will automatically create the file with name as Source SchemaName.TableName

    The workaround can be used as if else activity. To check the value of dest_table:

    • To check the value of dest_table if its null or not if not null use @equals(item().dest_table,null) this expression. enter image description here

    • If it is null the then you can set the name for destination file using dyanmic expression in copy activity as below @concat(item().src_table,'_desttableNULL') it will set value e.g. demo_desttableNULL.

    • if it's not null then copy activity can take destination table value from CSV file itself.