I'm trying to use U-SQL and R for forecasting, so I need to pass a list of values from U-SQL to R and return the forecast from R to U-SQL.
All the examples I found use a reducer, so they seem to process only one row at a time.
https://learn.microsoft.com/en-us/azure/data-lake-analytics/data-lake-analytics-u-sql-r-extensions
Is it possible to send R a list of rows to process, instead of a list of columns?
Thanks!
By definition, user-defined reducers take n rows and produce one or more rows, so you can use them to produce new column data as well as new rows. The R extensions for U-SQL include a built-in reducer (Extension.R.Reducer) that runs R code on each vertex assigned to the reducer. Inside the R script, the input rowset is available through the special R variable inputFromUSQL, and you can work on it with R.
As in the documentation you referenced, something like this should work on all rows at once:
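// The R extensions must be referenced before Extension.R.Reducer can be used
REFERENCE ASSEMBLY [ExtR];
DECLARE @myRScript = @"
# inputFromUSQL holds all rows passed to this reducer call
inputFromUSQL$mydata = as.factor(inputFromUSQL$mydata)
<..>
";
@myData = <my u-sql query>;
@RScriptOutput = REDUCE @myData <..>
USING new Extension.R.Reducer(command:@myRScript, rReturnType:"dataframe");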
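To make the pattern above more concrete, here is a minimal sketch of a full script, not a tested implementation. It assumes the R extensions are installed in your Data Lake Analytics account. The column names (Par, Period, Value, Forecast), the inline sample rows, and the output path are all hypothetical, and the linear trend is only a stand-in for whatever forecasting model you actually want. The idea is that rows sharing the same ON key end up in the same reducer call, so a constant Par column hands every row to a single R invocation:

```
REFERENCE ASSEMBLY [ExtR];

// Hypothetical R script: fit a linear trend on all incoming rows
// and forecast the next three periods.
DECLARE @myRScript = @"
inputFromUSQL <- inputFromUSQL[order(inputFromUSQL$Period), ]
fit <- lm(Value ~ Period, data = inputFromUSQL)
future <- data.frame(Period = max(inputFromUSQL$Period) + 1:3)
# Column names here must match the PRODUCE clause; Par is read-only and not returned
outputToUSQL <- data.frame(Period = as.numeric(future$Period), Forecast = as.numeric(predict(fit, newdata = future)))
";

// Hypothetical sample data; the constant Par puts every row in one group
@myData =
    SELECT 0 AS Par, Period, Value
    FROM (VALUES (1.0, 10.0), (2.0, 12.0), (3.0, 13.5), (4.0, 15.0)) AS T(Period, Value);

@RScriptOutput =
    REDUCE @myData ON Par
    PRODUCE Par, Period double, Forecast double
    READONLY Par
    USING new Extension.R.Reducer(command:@myRScript, rReturnType:"dataframe");

// Hypothetical output location
OUTPUT @RScriptOutput
TO "/output/forecast.csv"
USING Outputters.Csv();
```

If you want one forecast per series instead, you could presumably replace the constant Par with a real series key so that each group of rows gets its own R call.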