Search code examples
azure-data-factoryu-sql

U-SQL - Process Last "n" days files


Environment

  • Azure Data Factory/U-SQL

Scenario

  • I have ADF pipeline which runs on daily basis.And create the Folder and file in following format.Basically the Date wise folder and files are created.

  • "/Samples/Data/TestDB/{filedate:yyyy}.{filedate:MM}.{filedate:dd}/TestDetail.{filedate:yyyy}.{filedate:MM}.{filedate:dd}.csv";

  • In U-SQL I want to process the last 7 days files. I created the U-SQL script by having virtual column to select last 7 days of files.

  • I am running these changes on locally.

Code

 @userLoginData =
        EXTRACT filedate DateTime,
                UserLoginAuditDetailId Guid,
                UserId string              
        FROM    @userLoginDetailPath   
        USING   Extractors.Csv();

  @extractedLoginData = 
        SELECT 
                UserLoginAuditDetailId,
                UserId             
        FROM    @userLoginData
        WHERE   filedate.Date.ToString("d") == @sliceEndTime.AddDays(-7).Date.ToString("d"); 

Problem

I am able to run the script successfully, but the when the Job compile then job graph is showing, it is looking for the files which are more than 7 days. Please find below image.Basically it should not be looking for the file 2017.04.20. Right?

Job Graph


Solution

  • You can achieve the solution by using the BETWEEN clause and comparing the your fileDate virtual column with the specified date something like,

    DECLARE @ProcessStart DateTime = new DateTime(@sliceEndTime.Year, @sliceEndTime.Month,@sliceEndTime.Day).AddDays(-n); // n is the no of days 
    DECLARE @ProcessEnd DateTime = new DateTime(@sliceEndTime.Year, @sliceEndTime.Month,@sliceEndTime.Day); // current day when slice runs
    

    and then you can filter the filedate using

    SELECT * FROM input WHERE    filedate  BETWEEN  @ProcessStart  AND  @ProcessEnd;
    

    This will only process the files specified in the range of @ProcessStart and @ProcessEnd.

    Let me know if it solves your issue.