Search code examples
group-bysnowflake-cloud-data-platformrow-numberpartition-by

Snowflake Grouping by rows of 25 based on a column


i have a query that will create a calculated priority (c_priority) by store based on a primary priority column, the query works well to calculate c_priority using row_number() over (partition by store), however i need to go one step further and also sort this results by groups of 25 based on the campaign_column and priority and this is where i need help. I need the query to pull the first 25 records of the campaign with the highest priority, then pull the next 25 records of the following campaign and once it has finished with all campaigns gets back to the first campaign and pull another 25 records and cycle again until all records have a single sequence order (Calculated Priority) by store.

i will appreciate if someone can give an idea on how to achieve this, thanks!

current Query:

SELECT CONTACT_ID 
        ,LEAD_NUMBER
        ,LEAD_CREATION_DATE
        ,CUSTOMER_FIRST_NAME
        ,CUSTOMER_LAST_NAME 
        ,CAMPAIGN_NAME
        ,STORE
        ,MOBILE_NUMBER
        ,HOME_NUMBER
        ,WORK_NUMBER
        ,CELL_DO_NOT_CONTACT_FLAG
        ,HOME_DO_NOT_CONTACT_FLAG
        ,WORK_DO_NOT_CONTACT_FLAG
        ,PRIORITY
        ,ROW_NUMBER() OVER (PARTITION by store ORDER BY priority asc, campaign_name, store) AS     C_PRIORITY 
        ,HISTORICAL_DATE
        
FROM stores_campaigns_all

WHERE historical_date = '2023-03-10'
AND store IS NOT NULL

This is the current data output with my query: Current data output by my query

and this is how i want it to be, by rows of 2 to make it simple for this demostration how i need the data to be


Solution

  • This query creates groups of 2, and orders exactly as in your example data:

    with data as (
        select value campaign, uniform(1, 1000, random()) random_data, dense_rank() over(order by campaign) priority
        from table(split_to_table('A A A B B B C C C C D D D D E E E E F F',' '))
    ), row_number_by_campaign as (
        select *, row_number() over(partition by campaign order by campaign) rn
        from data
    ), count_in_groups_of_2 as (
        select *, floor((rn-1)/2) group_in_campaign
        from row_number_by_campaign
    )
    
    
    select campaign, random_data, priority, row_number() over(order by group_in_campaign, campaign, rn) c_priority
    from count_in_groups_of_2
    order by group_in_campaign, campaign, rn
    

    enter image description here