Search code examples
nullsnowflake-cloud-data-platformpartitioninglag

How to partition to get the rows where a value goes from non NULL to NULL


I have a table as follows where the ATTRIBUTES column contains a json of values.

ID ORDER ATTRIBUTES
12 34 {shape: 'square', size: 'small', length: 14}
12 41 {size: 'small', length: 14}
12 22 {shape: 'square', size: 'large', length: 14}
99 42 {size: 'small', length: 14}
14 33 {shape: 'circle', size: 'small', length: 5}
14 58 {size: 'small', length: 15}
29 16 NULL

I'm trying to have a query that gets the rows where ATTRIBUTES:shape goes FROM having a value TO being NULL based on ORDER ascending. So for example my resulting table would be

ID ORDER ATTRIBUTES
12 41 {size: 'small', length: 14}
14 58 {size: 'small', length: 15}

where ID 99 is not included since there was no other record with that ID that had a non NULL shape value (so it didn't go from non NULL to NULL, it was originally just NULL) and ID 29 isn't included since the entire field for that column is NULL.

I tried doing:

SELECT *, 
       lag(ATTRIBUTES:shape, 1) OVER (PARTITION BY ID
       ORDER BY ORDER) AS NULL_SHAPE

FROM MY_TABLE;

However I am not able to identify where the ATTRIBUTES:shape becomes NULL from the previous row, is there a way I can specify this? Thanks


Solution

  • enter image description here
    Functions used: QUALIFY() LAG()

    with cte as (
    select 12 id, 34 order_id, parse_json('{shape: "square", size: "small", length: 14}') attributes union all
    select 12 id, 41 order_id, parse_json('{size: "small", length: 14}') attributes union all
    select 12 id, 22 order_id, parse_json('{shape: "square", size: "large", length: 14}') attributes union all
    select 99 id, 42 order_id, parse_json('{size: "small", length: 14}') attributes union all
    select 14 id, 33 order_id, parse_json('{shape: "circle", size: "small", length: 5}') attributes union all
    select 14 id, 58 order_id, parse_json('{size: "small", length: 15}') attributes union all
    select 29 id, 16 order_id, NULL attributes)
    
    select 
        *  
    from 
        cte  
    qualify 
        lag(attributes:shape::string)over(partition by ID order by order_id asc) is not null 
        and attributes:shape::string is null;