Environment
Scenario
I have ADF pipeline which runs on daily basis.And create the Folder and file in following format.Basically the Date wise folder and files are created.
"/Samples/Data/TestDB/{filedate:yyyy}.{filedate:MM}.{filedate:dd}/TestDetail.{filedate:yyyy}.{filedate:MM}.{filedate:dd}.csv";
In U-SQL I want to process the last 7 days files. I created the U-SQL script by having virtual column to select last 7 days of files.
I am running these changes on locally.
Code
@userLoginData =
EXTRACT filedate DateTime,
UserLoginAuditDetailId Guid,
UserId string
FROM @userLoginDetailPath
USING Extractors.Csv();
@extractedLoginData =
SELECT
UserLoginAuditDetailId,
UserId
FROM @userLoginData
WHERE filedate.Date.ToString("d") == @sliceEndTime.AddDays(-7).Date.ToString("d");
Problem
I am able to run the script successfully, but the when the Job compile then job graph is showing, it is looking for the files which are more than 7 days. Please find below image.Basically it should not be looking for the file 2017.04.20. Right?
You can achieve the solution by using the BETWEEN
clause and comparing the your fileDate virtual column with the specified date something like,
DECLARE @ProcessStart DateTime = new DateTime(@sliceEndTime.Year, @sliceEndTime.Month,@sliceEndTime.Day).AddDays(-n); // n is the no of days
DECLARE @ProcessEnd DateTime = new DateTime(@sliceEndTime.Year, @sliceEndTime.Month,@sliceEndTime.Day); // current day when slice runs
and then you can filter the filedate using
SELECT * FROM input WHERE filedate BETWEEN @ProcessStart AND @ProcessEnd;
This will only process the files specified in the range of @ProcessStart
and @ProcessEnd
.
Let me know if it solves your issue.