This is my code to remove rows with null cells and duplicate Function values while keeping the Product column properly aligned with the Function column. I just want to keep the first occurrence of each Function and remove any duplicates. It compiles just fine, but I can't find my output. Someone suggested I simply click on the jobURL in the output, but that is not working properly for me. Here is a sample file that is a small slice of the full spreadsheet and only includes data in the 2 relevant columns (the full spreadsheet has data in all columns): https://www.dropbox.com/s/auu2aco4b037xn7/Function.csv?dl=0
@input =
    EXTRACT
        CompanyID string,
        division string,
        store_location string,
        International_Id string,
        Function string,
        office_location string,
        address string,
        Product string,
        Revenue string,
        sales_goal string,
        Manager string,
        Country string
    FROM "/input/input142.csv"
    USING Extractors.Csv(skipFirstNRows : 1);
// Remove rows where Function is null or empty
@working =
    SELECT *
    FROM @input
    WHERE !string.IsNullOrEmpty(Function);
// Number the rows within each Function (ordered by Product) and keep only the first one
@working =
    SELECT CompanyID,
           division,
           store_location,
           International_Id,
           Function,
           office_location,
           address,
           Product,
           Revenue,
           sales_goal,
           Manager,
           Country
    FROM
    (
        SELECT *,
               ROW_NUMBER() OVER(PARTITION BY Function ORDER BY Product) AS rn
        FROM @working
    ) AS x
    WHERE rn == 1;
@output = SELECT * FROM @working;
OUTPUT @output
TO "/output/output.csv"
USING Outputters.Csv(quoting : false);
Here are my desired results: https://www.dropbox.com/s/o82eskycbq1i1ss/Function_desired_result.xlsx?dl=0
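One thing worth checking about the ranking step: ROW_NUMBER() with ORDER BY Product keeps the row whose Product sorts first within each Function, which is not necessarily the first occurrence in the file. The script below is a minimal sketch for checking this on a few inline rows. The rowset names, the sample values, and the output path "/output/dedup_check.csv" are made up for illustration and are not taken from the real spreadsheet.

```usql
// Hypothetical sample rows: two "Sales" rows, one row with an empty Function,
// and one "Support" row. The values are illustrative only.
@rows =
    SELECT *
    FROM (VALUES
            ("Sales",   "Widget"),
            ("Sales",   "Gadget"),
            ("",        "Orphan"),
            ("Support", "Manual")
         ) AS T(Function, Product);

// Drop rows with a null or empty Function, then number the remaining rows
// within each Function, ordered by Product.
@ranked =
    SELECT Function,
           Product,
           ROW_NUMBER() OVER(PARTITION BY Function ORDER BY Product) AS rn
    FROM @rows
    WHERE !string.IsNullOrEmpty(Function);

// Keep only the first-ranked row per Function.
@firstPerFunction =
    SELECT Function,
           Product
    FROM @ranked
    WHERE rn == 1;

// Assumed output path, chosen only for this check.
OUTPUT @firstPerFunction
TO "/output/dedup_check.csv"
USING Outputters.Csv(outputHeader : true);
```

With these rows, the Sales group keeps Gadget rather than Widget, even though Widget is listed first, because the ranking is ordered by Product. If the first occurrence in the file is what matters, the ORDER BY would need a column that reflects the original row order, assuming the data has one.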
Check this document if you want to run or debug your scripts locally.