Search code examples
azuredouble-quotesazure-data-factory

Azure Data Factory CSV with double quotes


I have a pipeline that retrieves an FTP hosted CSV file. It is comma delimited with double quote identifiers. The issue exists where a string is encapsulated in double quotes, but the string itself contains double quotes.

string example: "Spring Sale" this year.

How it looks in the csv (followed and lead by two null columns):

"","""Spring Sale"" this year",""

SSIS handles this fine, but Data Factory wants to transform it into an extra column that isn't separated by a comma. I have removed the extra quotes on this line and it works fine.

Is there a way around this besides altering the source?


Solution

  • I got this to work using Escape character set as quote (") with the Azure Data Factory Copy Task. Screen shot:

    ADF Copy Task

    This was based on a file as per your spec:

    "","""Spring Sale"" this year",""

    and also worked as in insert into an Azure SQL Database table. The sample JSON:

    {
        "name": "DelimitedText1",
        "properties": {
            "linkedServiceName": {
                "referenceName": "linkedService2",
                "type": "LinkedServiceReference"
            },
            "type": "DelimitedText",
            "typeProperties": {
                "location": {
                    "type": "AzureBlobStorageLocation",
                    "fileName": "quotes.txt",
                    "container": "someContainer"
                },
                "columnDelimiter": ",",
                "escapeChar": "\"",
                "quoteChar": "\""
            },
            "schema": [
                {
                    "name": "Prop_0",
                    "type": "String"
                },
                {
                    "name": "Prop_1",
                    "type": "String"
                },
                {
                    "name": "Prop_2",
                    "type": "String"
                }
            ]
        }
    }
    

    Maybe the example file is too simple but it did work for me in this configuration.

    Alternately, just use SSIS and host it in Data Factory.