Search code examples
partitioningalteryx

Alteryx to select top N records where N=a value on that group


I'm in a fix with Alteryx. I'm trying to select the top N rows where N=a cell value for that partition. The business question is:

"We need to know, out of our orders (TicketIDs), those that have least 1 combination of Type of discount item AND drink AND side."

The SQL query would join this table onto itself and partition to get the TopNtoIncludeInItems for that row, however, I just can't seem to find a way to do this in Alteryx. I've tried the community, but the question has gone ananswered.

In other words, select thusly:

<pseudocode>

for each (TicketID)
    for each(Type)
        select top(TopNtoIncludeInItems for this.TicketID) 
    next
next

</pseudocode>

or indeed select just the green records

alteryx screenie


Solution

  • Here's my solution:

    • MultiRow Formula: create new field ComboCount (or whatever) as Int32, 0 or empty for rows that don't exists, Group By TicketID and Type, with the Expression [Row-1:ComboCount]+1 ... this counts up each group; we'll want the first topN of each group, ensuring the group actuall has that many, and not going beyond TopN.
    • Filter on [ComboCount] <= [TopN] ... which excludes unnecessary rows beyond TopN
    • Summarize: group by TicketID and Type, doing Max(ComboCount) ... if this value is less than TopN for any group, the group should be excluded:
    • Join the summary back to the earlier pre-summary data on TicketID and Type
    • Filter on [Max_ComboCount] = [TopN] ... this excludes the groups where any ItemType falls short of TopN

    And that's it. Pictorally, this is what my workflow looks like, along with data results based on data similar to that in your screenshot: enter image description here