Tags: azure, cmd, unzip, azure-data-factory, azure-batch

How do I unzip and execute a Batch Service job as part of Azure Data Factory?


Azure Data Factory can execute custom activities as Batch Service jobs. These jobs can run an .exe (and its associated dependencies) stored in a storage account; the files are copied to the Batch node prior to execution.

There is a limitation on the files in the storage account that can be used:

Total size of resourceFiles cannot be more than 32768 characters

The solution appears to be to zip the files in the storage account and unzip as part of the command. This post suggests running the Batch Service Command in Azure Data Factory as:

Unzip.exe [myZipFilename] && MyExeName.exe [cmdLineArgs]

Running this locally on a Windows 10 machine works fine. Setting this as the Command parameter on the batch service custom activity (using a Cloud Services Windows Server 2019 OS Image App Pool) results in:

caution: filename not matched: &&

It feels like something basic that I'm missing but I've tried various permutations and cannot get it to work.


Solution

  • I don't know the full context in which ADF runs Custom Activity commands on a Windows Batch Service node, so I changed my setup to stop assuming that Unzip.exe already exists on the node. It appears not to be found when the command is run as cmd /c "Unzip.exe", as opposed to using just Unzip.exe as the command.

    The storage account folder backing the custom activity now contains:

    • executable-with-deps.zip (my .NET Core console application published for Windows with all dependencies)
    • unzip.exe (taken from Git Bash on my local machine)
      • including the msys-2.0.dll and msys-bz2-1.dll dependencies
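The archive itself can be built in many ways. As a rough sketch only, the Python below zips a publish folder so that the executable sits at the root of the archive and therefore lands in the working directory when unzipped. The `publish` folder name and the `zip_folder` helper are hypothetical and not part of the original setup; the archive name is the one used in the ADF command.

```python
import zipfile
from pathlib import Path
from typing import List


def zip_folder(source_dir: Path, archive_path: Path) -> List[str]:
    """Zip every file under source_dir, storing paths relative to source_dir.

    archive_path is assumed to live outside source_dir, so the archive
    does not try to include itself.
    """
    names = []
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source_dir.rglob("*")):
            if path.is_file():
                # Relative names keep executable.exe at the archive root.
                arcname = path.relative_to(source_dir).as_posix()
                archive.write(path, arcname)
                names.append(arcname)
    return names


if __name__ == "__main__":
    publish_dir = Path("publish")  # hypothetical output folder of the publish step
    if publish_dir.is_dir():
        zip_folder(publish_dir, Path("executable-with-deps.zip"))
```

Uploading that single archive alongside unzip.exe and its DLLs keeps the number of resource files small, which is the point of zipping in the first place.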

    The command in ADF is then:

    cmd /c "Unzip.exe executable-with-deps.zip && executable.exe"
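A likely explanation for the original "filename not matched: &&" error is that Azure Batch does not run a task's command line under a shell, so `&&` is never interpreted as an operator and is handed to Unzip.exe as just another argument. Wrapping the command in cmd /c supplies the shell. The Python sketch below only illustrates that difference and is not how Batch is implemented: a small child process stands in for Unzip.exe and prints the arguments it receives, and the whitespace split is a simplification of real Windows command-line parsing.

```python
import subprocess
import sys

# Stand-in for Unzip.exe: a child process that reports the arguments it was given.
CHILD = "import sys; print(sys.argv[1:])"


def run_without_shell(command_line: str) -> str:
    """Start the program directly, with no shell to interpret operators like &&."""
    argv = command_line.split()  # naive split, for illustration only
    result = subprocess.run(
        [sys.executable, "-c", CHILD, *argv[1:]],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


received = run_without_shell("Unzip.exe executable-with-deps.zip && executable.exe")
print(received)  # ['executable-with-deps.zip', '&&', 'executable.exe']
```

The stand-in receives `&&` and `executable.exe` as if they were file names to extract from the archive, which matches the message unzip printed. With cmd /c "..." the shell splits the line at `&&` and runs the second program only if the first one succeeds.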