Search code examples
sqlsnowflake-cloud-data-platformlag

Snowflake get previous value that was different


I have a table that tracks location by id along with dates a devices was issued and the product/accessory that has been associated to the device. The location can only have 1 device at a time, and the device may get a new accessory every so often. Each time a new accessory is associated, a new record is created.
I am trying to add to each record the previous device information if there was a one.

I have a table like below:

ID_NO DEVICE_NO DEVICE_DATE PRODUCT_NO PRODUCT_DATE
FD2A 600076 2011-09-20 210785 2012-01-03
FD2A 208049 2017-09-11 066762 2017-09-11
FD2A 208049 2017-09-11 009802 2023-09-12
C600 202650 2009-03-25 127677 2009-03-25
C600 215580 2012-04-04 127677 2010-10-06
C600 215580 2012-04-04 245791 2012-04-10
C600 215580 2012-04-04 366424 2013-09-06
C600 215580 2012-04-04 105547 2014-01-31
C600 215580 2012-04-04 503592 2015-10-01
C600 209855 2015-11-16 484106 2015-10-09
C600 600382 2020-08-24 347302 2016-08-25

Using the following query:

select
id_no
,device_no
,device_date
,product_no
,product_date
,lag(device_no) over (partition by id_no order by device_date, product_date) prev_device_no
,lag(device_date) over (partition by id_no order by device_date, product_date) prev_device_date
from device_data
order by id_no,device_date,product_date

I get the following result:

ID_NO DEVICE_NO DEVICE_DATE PRODUCT_NO PRODUCT_DATE PREV_DEVICE_NO PREV_DEVICE_DATE
FD2A 600076 2011-09-20 210785 2012-01-03
FD2A 208049 2017-09-11 066762 2017-09-11 600076 2011-09-20
FD2A 208049 2017-09-11 009802 2023-09-12 208049 2017-09-11
C600 202650 2009-03-25 127677 2009-03-25
C600 215580 2012-04-04 127677 2010-10-06 202650 2009-03-25
C600 215580 2012-04-04 245791 2012-04-10 215580 2012-04-04
C600 215580 2012-04-04 366424 2013-09-06 215580 2012-04-04
C600 215580 2012-04-04 105547 2014-01-31 215580 2012-04-04
C600 215580 2012-04-04 503592 2015-10-01 215580 2012-04-04
C600 209855 2015-11-16 484106 2015-10-09 215580 2012-04-04
C600 600382 2020-08-24 347302 2016-08-25 209855 2015-11-16

What I really want to do is get the previous device_no and date that was different: like this:

ID_NO DEVICE_NO DEVICE_DATE PRODUCT_NO PRODUCT_DATE PREV_DEVICE_NO PREV_DEVICE_DATE
FD2A 600076 2011-09-20 210785 2012-01-03
FD2A 208049 2017-09-11 066762 2017-09-11 600076 2011-09-20
FD2A 208049 2017-09-11 009802 2023-09-12 600076 2011-09-20
C600 202650 2009-03-25 127677 2009-03-25
C600 215580 2012-04-04 127677 2010-10-06 202650 2009-03-25
C600 215580 2012-04-04 245791 2012-04-10 202650 2009-03-25
C600 215580 2012-04-04 366424 2013-09-06 202650 2009-03-25
C600 215580 2012-04-04 105547 2014-01-31 202650 2009-03-25
C600 215580 2012-04-04 503592 2015-10-01 202650 2009-03-25
C600 209855 2015-11-16 484106 2015-10-09 215580 2012-04-04
C600 600382 2020-08-24 347302 2016-08-25 209855 2015-11-16

is there another function to get the last different value when partitioning?


Solution

  • I think I have something that will work for you here. I only tested it with the first data group, so it might need tweaking.

    What I do is, take your table and find all of the records that are different based on device_no. I am making assumption that the device_no does not change and then change back to a previous value....I think that would break this.

    Once I have that in the data_delta CTE, then I can do an asof join to get the previous "change record" for every record that I had in the original table.

    Full code looks like this, but the first data CTE is just some of your sample data:

    with data as (
        select *
        from (values('FD2A',600076,'2011-09-20'::date,'2012-01-03'::date),      
        ('FD2A',208049,'2017-09-11'::date,'2017-09-11'::date),
        ('FD2A',208049,'2017-09-11'::date,'2023-09-12'::date)
        ) x (id_no,device_no,device_date,product_date)
    ), data_delta as (
        select * from data
        qualify row_number() over (partition by id_no, device_no order by device_date, product_date) = 1
    )
    select d1.*
         , d2.device_no as prev_device_no
         , d2.device_date as prev_device_date
    from data d1
    asof join data_delta d2
      match_condition(d1.device_date > d2.device_date)
      on (d1.id_no = d2.id_no)
    order by d1.product_date
      ;