Tags: azure-data-lake, u-sql

U-SQL job performance


Could you help me with the job performance? I ran the job with 10 AUs, and during the first part of the execution almost all of them are used, but during the second half only 1 AU is used. In the plan I see a super vertex that consists of only one vertex, so it looks like an underestimated execution plan (that is just an assumption).

I'm trying to analyze the execution time, but it is difficult without a technical description of operations like HashCombine, HashCross, etc.

So my question is: can I do something about it (modify the code, add hints, etc.)?

my job ID link


The problem was fixed with Michael Rys's solution.

I applied Michael Rys's solution and it works perfectly. Thank you as always! See the picture below. Almost all 10 of the 10 AUs are used now. I also played with the modeling tool, and it looks like the script scales nearly linearly. Awesome :).

(screenshot: AU usage after applying the fix)

One more solution

Also, I can replace the inner joins with left outer joins (the replacement is equivalent in my case because for any record in the fact table there is ALWAYS exactly one matching row in each dimension table; dim 1 : M fact). With left joins the CBO estimates the join's result cardinality as at least the size of the fact table, and in this case it generates a good plan without hints.
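
A minimal U-SQL sketch of that rewrite, with toy data and hypothetical rowset/column names (the real schema is not shown in the post):

```
// Toy rowsets standing in for the real fact and dimension data (names are hypothetical).
@fact =
    SELECT * FROM (VALUES (1, 10L), (2, 20L)) AS T(ApplicationId, MetricValue);
@dim_application =
    SELECT * FROM (VALUES (1, "App A"), (2, "App B")) AS T(ApplicationId, ApplicationName);

// LEFT OUTER JOIN instead of INNER JOIN: equivalent here only because every fact row
// has exactly one matching dimension row, but it lets the optimizer estimate the
// output as at least the size of the fact table.
@result =
    SELECT f.MetricValue, a.ApplicationName
    FROM @fact AS f
         LEFT OUTER JOIN @dim_application AS a ON f.ApplicationId == a.ApplicationId;

OUTPUT @result TO "/output/result.csv" USING Outputters.Csv();
```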


Solution

  • I will forward your job link to one of our devs who can take a look and update this answer once I get more information.

    For Stack Overflow help though, it would be useful to see the script and/or the job graph as well. For example, how much data are you operating on? Are you using an operation that implies an ordering or grouping, etc.?

    Based on the vertex execution view, it seems that you are extracting from many small files that each contains only a small amount of data. It could well be that the optimizer assumes that there is only a small amount of data coming from these files.

    Assuming my initial assumption is correct, you could add an OPTION(ROWCOUNT=xxx) hint to the EXTRACT statement to indicate a larger number of rows (xxx being a row count large enough to force the system to parallelize).
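
    As a minimal sketch, assuming the extraction looks roughly like this (the path, schema, and row count are placeholders, not taken from the actual job):

    ```
    // Extract from a file set of many small files; the ROWCOUNT hint tells the optimizer
    // to plan for a large number of rows instead of assuming a tiny input.
    @input =
        EXTRACT Id int,
                Value string
        FROM "/input/{*}.csv"
        USING Extractors.Csv()
        OPTION(ROWCOUNT=100000000);   // hypothetical: roughly the real total row count

    OUTPUT @input TO "/output/check.csv" USING Outputters.Csv();
    ```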

    Some Additional Information after looking at the job

    The plan is a 13-way join with 12 dimension tables and 1 fact table. The error (the underestimation that causes the serial plan) begins after 9 of the 12 joins are completed. The last 3 joins (with dim_application, dim_operation, and dim_data_type) are done serially, and the spine of the plan still carries 29 GB at that point. It is very hard for us to estimate through 9 joins, since we don't have foreign key information.

    The most likely thing you can do to make this work is the following (sketched below):

    1. Split the join statement into two, with the joins to dim_application, dim_operation, and dim_data_type in the second one.
    2. Add a ROWCOUNT hint with a large number on the output of the first join statement.
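
    A minimal sketch of that split, with toy data and hypothetical rowset/column names (only dim_application, dim_operation, and dim_data_type are taken from the job):

    ```
    // Toy rowsets standing in for the real fact and dimension data (names are hypothetical).
    @fact =
        SELECT * FROM (VALUES (1, 1, 1, 1, 100L)) AS T(Dim1Id, ApplicationId, OperationId, DataTypeId, MetricValue);
    @dim_1 =
        SELECT * FROM (VALUES (1, "dim one")) AS T(Dim1Id, Dim1Name);
    @dim_application =
        SELECT * FROM (VALUES (1, "App A")) AS T(ApplicationId, ApplicationName);
    @dim_operation =
        SELECT * FROM (VALUES (1, "Read")) AS T(OperationId, OperationName);
    @dim_data_type =
        SELECT * FROM (VALUES (1, "Int")) AS T(DataTypeId, DataTypeName);

    // Statement 1: the fact table joined to the first group of dimensions (in the real
    // script, the 9 joins that still estimate well), with a ROWCOUNT hint so the
    // optimizer does not assume a tiny intermediate result.
    @stage1 =
        SELECT f.MetricValue, f.ApplicationId, f.OperationId, f.DataTypeId, d1.Dim1Name
        FROM @fact AS f
             INNER JOIN @dim_1 AS d1 ON f.Dim1Id == d1.Dim1Id
        OPTION(ROWCOUNT=500000000);   // hypothetical: roughly the expected row count

    // Statement 2: the last three joins now start from a realistically sized input
    // and can run in parallel instead of on a single vertex.
    @result =
        SELECT s.MetricValue, s.Dim1Name, a.ApplicationName, o.OperationName, t.DataTypeName
        FROM @stage1 AS s
             INNER JOIN @dim_application AS a ON s.ApplicationId == a.ApplicationId
             INNER JOIN @dim_operation AS o ON s.OperationId == o.OperationId
             INNER JOIN @dim_data_type AS t ON s.DataTypeId == t.DataTypeId;

    OUTPUT @result TO "/output/result.csv" USING Outputters.Csv();
    ```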

    Let me know if that helps.