loops azure cursor azure-data-lake u-sql

U-SQL cursor in Azure Data Lake

I need to process data in Azure Data Lake. My flow is as follows:

I would like to select from the database list of IDs for next processing. This I have done.
I need to iterate through IDs (from the first step) and I need to successively export data into separated files (partitioned by ID)

The problem is following statemanet:

U-SQL’s procedures do not provide any imperative code-flow constructs such as a for or while loops.

Any idea how to process data in similar way as with cursor?

I didn't find any documentation regarding to the cursors in U-SQL.

Thank you!

Solution

There are no cursors in U-SQL, because of the statement you reference above.

U-SQL does not provide any imperative code-flow constructs because it impedes the optimizer's ability to globally optimize your script.

You should think of approaching your problem declaratively. For example, if you have a list of IDs (either in a table or SqlArray or even a file), use a declarative join. For example, you want to add 42 to every value where the key is in a list of existing keys:

// Two options for providing the "looping data"
// Option 1: Array Variable
DECLARE @keys_var = new SqlArray<string>{"k1", "k2", "k3"};

// Option 2: Rowset (eg from an EXTRACT from file, a table or other place)
@keys = SELECT * FROM (VALUES("k1"), ("k2"), ("k3")) AS T(key);

// This is the data you want to iterate over to add 42 to the value column for every matching key
@inputdata = SELECT * FROM (VALUES (1, "k1"), (2, "k1"), (3, "k2"), (6, "k5")) AS T(value, key);

//Option 1:
@res = SELECT value+42 AS newval, key FROM @inputdata WHERE @keys_var.Contains(key);

OUTPUT @res TO "/output/opt1.csv" USING Outputters.Csv();

//Option 2:
@res = SELECT value+42 AS newval, i.key 
    FROM @inputdata AS i INNER JOIN @keys AS k 
         ON i.key == k.key;

OUTPUT @res TO "/output/opt2.csv" USING Outputters.Csv();

Now in your case, you want to have data-driven output file sets. This feature is currently being worked on (it is one of our top asks). Until then you would have to write a script to generate the script (I will provide an example on your other question).