
MonetDB: Enumerate groups of rows based on a given "boundary" condition


Consider the following table:

id  gap  groupID
 0  0    1
 2  3    1
 3  7    2
 4  1    2
 5  5    2
 6  7    3
 7  3    3
 8  8    4
 9  2    4

Here groupID is the desired, computed column: its value is incremented whenever the gap column exceeds a threshold (in this case 6). The id column defines the sequential order of the rows (and it's already given).

Can you please help me figure out how to dynamically fill out the appropriate values for groupID?

I have looked at several other entries here on Stack Overflow and have seen sum used as an aggregate in a window function. I can't use sum because it's not supported in MonetDB window functions (only rank, dense_rank, and row_number are). I can't use triggers (to modify each record before it is inserted) either, because I need to keep the data above in a local temporary table inside a stored function, and trigger declarations are not supported in MonetDB function definitions.

I have also tried filling out the groupID column value by reading the previous table (id and gap) into another temporary table (id, gap, groupID), with the hope that this would force a row-by-row operation. But this has failed as well because it gives the groupID 0 to all records:

declare threshold int;
set threshold = 6;
insert into newTable( id, gap, groupID )
        select A.id, A.gap, 
            case when A.gap > threshold then 
                (select case when max(groupID) is null then 0 else max(groupID)+1 end from newTable) 
            else 
                (select case when max(groupID) is null then 0 else max(groupID) end from newTable) 
            end
        from A
        order by A.id asc;

Any help, tip, or reference is greatly appreciated. I have been trying to figure this out for a long time.

BTW: cursors are not supported in MonetDB either.


Solution

  • You can assign the group using a correlated subquery. For each row, count the rows up to and including it (ordered by id) whose gap exceeds 6, then add 1:

    select id, gap,
           (select 1 + count(*)
            from t as t2
            where t2.id <= t.id and t2.gap > 6
           ) as Groupid
    from t;
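
To check the query against the sample data from the question, a minimal sketch along these lines could be used. It assumes the source table is named t with integer columns id and gap (the question calls the table A), and it hard-codes the threshold 6 where the question uses a declared variable. It relies only on a multi-row insert and a correlated scalar subquery, with no window functions, triggers, or cursors.

```sql
-- Sample data from the question. The table name t follows the answer;
-- the column types are an assumption.
create table t (id int, gap int);

insert into t (id, gap) values
    (0, 0), (2, 3), (3, 7), (4, 1), (5, 5),
    (6, 7), (7, 3), (8, 8), (9, 2);

-- groupID = 1 + the number of rows with an id less than or equal to
-- the current row's id whose gap exceeds the threshold (6 here).
select t.id, t.gap,
       (select 1 + count(*)
        from t as t2
        where t2.id <= t.id and t2.gap > 6
       ) as groupID
from t
order by t.id;
```

For this data the groupID values come out as 1, 1, 2, 2, 2, 3, 3, 4, 4, matching the desired column in the question. Because the subquery reads only from t, the same select can feed an insert into newTable (id, gap, groupID) without depending on rows already written to newTable.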