How to keep the original file names in a binary copy on an SFTP server in Azure Data Factory


I'm pretty new to ADF. I have a Blob storage account configured for SFTP. On this storage, I would like to copy files with the '.json' extension (or another specific extension). I used a Copy activity with two Binary datasets. For the source, I used a wildcard folder path and '*.json' as the file name. For the sink, I just used 'test' as the subfolder. The files are copied, but they do not keep their original names. How can I keep them?

(Screenshot: names of the files)

I tried to parameterize the file name with a variable, but I may have done it wrong: I couldn't get the names of the files being copied.

Pipeline JSON:

{
"name": "pipeline3_A",
"properties": {
    "activities": [
        {
            "name": "Copy data1",
            "type": "Copy",
            "dependsOn": [],
            "policy": {
                "timeout": "0.12:00:00",
                "retry": 0,
                "retryIntervalInSeconds": 30,
                "secureOutput": false,
                "secureInput": false
            },
            "userProperties": [],
            "typeProperties": {
                "source": {
                    "type": "BinarySource",
                    "storeSettings": {
                        "type": "SftpReadSettings",
                        "recursive": false,
                        "wildcardFileName": "*.json",
                        "deleteFilesAfterCompletion": false,
                        "disableChunking": false
                    },
                    "formatSettings": {
                        "type": "BinaryReadSettings"
                    }
                },
                "sink": {
                    "type": "BinarySink",
                    "storeSettings": {
                        "type": "SftpWriteSettings",
                        "copyBehavior": "FlattenHierarchy",
                        "operationTimeout": "01:00:00",
                        "useTempFileRename": true
                    }
                },
                "enableStaging": false
            },
            "inputs": [
                {
                    "referenceName": "Binary1",
                    "type": "DatasetReference"
                }
            ],
            "outputs": [
                {
                    "referenceName": "Binary2",
                    "type": "DatasetReference"
                }
            ]
        }
    ],
    "variables": {
        "FileName": {
            "type": "String"
        }
    },
    "annotations": []
}
}


Solution

  • According to your pipeline JSON, the sink's Copy behavior is set to Flatten hierarchy ("copyBehavior": "FlattenHierarchy"). With that setting, the target files are given autogenerated names, which is most likely why the original names are lost.

    Set Copy behavior to None in the sink. In the pipeline JSON, this corresponds to a sink whose "storeSettings" has no "copyBehavior" property.

    After a successful run of the copy activity, the copied files keep their original names.

    Here is the pipeline JSON for your reference:

    {
        "name": "pipeline6",
        "properties": {
            "activities": [
                {
                    "name": "Copy data1",
                    "type": "Copy",
                    "dependsOn": [],
                    "policy": {
                        "timeout": "0.12:00:00",
                        "retry": 0,
                        "retryIntervalInSeconds": 30,
                        "secureOutput": false,
                        "secureInput": false
                    },
                    "userProperties": [],
                    "typeProperties": {
                        "source": {
                            "type": "BinarySource",
                            "storeSettings": {
                                "type": "SftpReadSettings",
                                "recursive": true,
                                "wildcardFolderPath": "ins",
                                "wildcardFileName": "*.json",
                                "deleteFilesAfterCompletion": false,
                                "disableChunking": false
                            },
                            "formatSettings": {
                                "type": "BinaryReadSettings"
                            }
                        },
                        "sink": {
                            "type": "BinarySink",
                            "storeSettings": {
                                "type": "SftpWriteSettings",
                                "operationTimeout": "01:00:00",
                                "useTempFileRename": true
                            }
                        },
                        "enableStaging": false
                    },
                    "inputs": [
                        {
                            "referenceName": "sftpsrc",
                            "type": "DatasetReference"
                        }
                    ],
                    "outputs": [
                        {
                            "referenceName": "sftpsink",
                            "type": "DatasetReference"
                        }
                    ]
                }
            ],
            "annotations": []
        }
    }
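
    Compared with the pipeline in the question, the only change that matters for the file names is in the sink's "storeSettings". As a minimal sketch, the fragment below isolates that part of the copy activity's "typeProperties". Setting "copyBehavior" to "PreserveHierarchy" explicitly is an assumption on my part: as far as I know it is the documented default, so it should behave the same as leaving the property out, as the pipeline above does.

    ```json
    {
        "sink": {
            "type": "BinarySink",
            "storeSettings": {
                "type": "SftpWriteSettings",
                "copyBehavior": "PreserveHierarchy",
                "operationTimeout": "01:00:00",
                "useTempFileRename": true
            }
        }
    }
    ```

    If the files still arrive with generated names, check that "FlattenHierarchy" is not also set anywhere else, for example through a parameterized sink setting.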