I have an input dataset that is logically split into sessions.
During processing I need to produce one more column containing a hash value calculated from all the rows of a session. Every row in a session will be stamped with that hash value (the same value within the session). The input and output cardinality will be the same.
The picture shows what I want to have.
I am thinking of using a custom .NET reducer or processor. Am I on the right track? Which one should I choose, and how do I implement it properly in U-SQL?
It sounds like the hash for a session requires knowledge of all the rows in that session. A processor handles one row at a time, so it is not helpful here, but a reducer, which receives all the rows of a group together, could do this.
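To illustrate the reducer idea, here is a minimal, language-neutral sketch in Python rather than U-SQL/C#. It only shows the shape of the logic: the reducer sees every row of one session, computes the hash, and emits each row stamped with it. The column names `session_id` and `value`, the function names, and the choice of SHA-256 are all assumptions for illustration, and the hash depends on the order in which rows are read, so a real job would need a deterministic ordering within each session.

```python
import hashlib
from itertools import groupby
from operator import itemgetter


def reduce_session(rows):
    """Reducer-style step: receives all rows of ONE session, emits one row per input row."""
    rows = list(rows)  # materialize so the rows can be read twice
    digest = hashlib.sha256()  # SHA-256 is an assumed choice of hash
    for row in rows:
        digest.update(row["value"].encode("utf-8"))
        digest.update(b"\x00")  # separator so ("ab", "c") differs from ("a", "bc")
    session_hash = digest.hexdigest()
    for row in rows:
        # Same hash on every row of the session; cardinality is unchanged.
        yield {**row, "session_hash": session_hash}


def stamp_sessions(rows):
    """Group rows by the hypothetical session_id column and run the reducer per group."""
    ordered = sorted(rows, key=itemgetter("session_id"))  # stable sort keeps in-session order
    out = []
    for _, group in groupby(ordered, key=itemgetter("session_id")):
        out.extend(reduce_session(group))
    return out
```

In U-SQL terms, the grouping done by `stamp_sessions` would correspond to reducing on the session key, and `reduce_session` to the body of the custom reducer.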
Also consider whether this can be done with a custom aggregator. For example, you could use a user-defined aggregator to produce one hash per session and then join the result of the aggregation back to the original list of rows on the session key.
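The aggregate-then-join approach could be sketched like this, again in Python as a stand-in for the U-SQL/C# code. The class follows an accumulate/result shape similar to what a user-defined aggregator would have; `SessionHashAggregator`, `hash_per_session`, `join_hashes`, and the columns `session_id` and `value` are hypothetical names, and SHA-256 is an assumed hash. As before, the result depends on the order in which rows are accumulated.

```python
import hashlib


class SessionHashAggregator:
    """Aggregator-style object: accumulates the rows of one session, returns one hash."""

    def __init__(self):
        self._digest = hashlib.sha256()  # assumed hash function

    def accumulate(self, value):
        self._digest.update(value.encode("utf-8"))
        self._digest.update(b"\x00")  # separator between row values

    def result(self):
        return self._digest.hexdigest()


def hash_per_session(rows):
    """Aggregation step: one hash per session_id (like GROUP BY with a custom aggregate)."""
    aggregators = {}
    for row in rows:
        key = row["session_id"]
        if key not in aggregators:
            aggregators[key] = SessionHashAggregator()
        aggregators[key].accumulate(row["value"])
    return {key: agg.result() for key, agg in aggregators.items()}


def join_hashes(rows, hashes):
    """Join step: attach each session's hash back to every original row."""
    return [{**row, "session_hash": hashes[row["session_id"]]} for row in rows]
```

Compared with a reducer, this keeps the custom code small (only the aggregate), at the cost of an extra join against the original rowset.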