azure-data-factory azure-data-explorer azure-java-sdk

Azure Data Factory Java SDK build dataset with Azure Data Explorer (Kusto) query


I want to migrate data between two Azure Data Explorer clusters, applying a filter on the source cluster. I'm creating a Copy job in Azure Data Factory to automate and orchestrate the migration, and I'm using the Java SDK to create the job's resources. I'm having trouble creating the source dataset: I can't find where to put the query, and I didn't find any mention of it in the documentation or the code samples.

Here is my dataset creation code:

    Dataset sourceDataset = dataFactoryManager.datasets()
        .define("source_table")
        .withExistingFactory(rgName, dfName)
        .withProperties(new AzureDataExplorerTableDataset()
            .withLinkedServiceName(new LinkedServiceReference()
                .withReferenceName(sourceLinkedService.name()))
        );

And I want to filter the source data with this query:

"Table | where $ingestionTime > ago(1d)"

Can you help me understand which methods I should use to add the query?

I expect the data to be copied only after it has been filtered.


Solution

  • I found that the filter has to be set on the source of the Copy activity, not on the dataset:

        // The query is set on the copy source; the dataset only points at the linked service.
        PipelineResource pipeline = dataFactoryManager.pipelines()
            .define("CopyTablePipeline")
            .withExistingFactory(rgName, dfName)
            .withActivities(Collections.singletonList(new CopyActivity()
                .withName("CopyTable")
                // KQL filter that runs against the source cluster
                .withSource(new AzureDataExplorerSource()
                    .withQuery("Table | where $ingestionTime > ago(1d)"))
                .withSink(new AzureDataExplorerSink())
                .withInputs(Collections.singletonList(
                    new DatasetReference().withReferenceName(inputDatasetName)))
                .withOutputs(Collections.singletonList(
                    new DatasetReference().withReferenceName(outputDatasetName)))))
            .create();
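
If the table name or the look-back window needs to vary between pipelines, the query string passed to `withQuery` could be assembled by a small helper instead of being hard-coded. The following is only a sketch: `CopyQueryBuilder`, `recentRowsQuery`, and `toKqlTimespan` are hypothetical names and are not part of the Azure SDK. It assumes the filter always has the shape `<table> | where <timeExpression> > ago(<timespan>)` shown above.

```java
import java.time.Duration;

// Hypothetical helper (not part of the Azure SDK): builds the KQL text
// that is later handed to AzureDataExplorerSource.withQuery(...).
final class CopyQueryBuilder {
    private CopyQueryBuilder() {}

    // Returns "<table> | where <timeExpression> > ago(<timespan>)".
    static String recentRowsQuery(String table, String timeExpression, Duration lookback) {
        if (table == null || table.trim().isEmpty()) {
            throw new IllegalArgumentException("table must not be blank");
        }
        if (timeExpression == null || timeExpression.trim().isEmpty()) {
            throw new IllegalArgumentException("timeExpression must not be blank");
        }
        if (lookback == null || lookback.getSeconds() < 1) {
            throw new IllegalArgumentException("lookback must be at least one second");
        }
        return table + " | where " + timeExpression + " > ago(" + toKqlTimespan(lookback) + ")";
    }

    // Formats a duration as a KQL timespan literal using the largest unit
    // that divides it evenly (d, h, m, s). Fractional seconds are dropped.
    static String toKqlTimespan(Duration d) {
        long seconds = d.getSeconds();
        if (seconds % 86_400 == 0) return (seconds / 86_400) + "d";
        if (seconds % 3_600 == 0) return (seconds / 3_600) + "h";
        if (seconds % 60 == 0) return (seconds / 60) + "m";
        return seconds + "s";
    }
}
```

With this helper, the source in the pipeline above would become `new AzureDataExplorerSource().withQuery(CopyQueryBuilder.recentRowsQuery("Table", "$ingestionTime", Duration.ofDays(1)))`. If the cluster rejects `$ingestionTime`, the KQL function `ingestion_time()` may work as the time expression instead; that depends on the table's ingestion time policy and has not been verified here.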