Search code examples
sqldatabaseclickhouse

SQL clickhouse how to use neighbor func based on value in different column


I have a problem in my sql query. I try to find difference betweent current and previous value (current value - previous value in other words delta) in column "amount" But i need make it per "id" So here example of my sql query and current and wanted result. Here is a problem that when next row have differnt "id" query is still continue to finding diff between different id's is. Not clear how to fix it. Any ideas? I use clickhouse DB

SELECT 
toInt64(visitParamExtractString( (JSONExtractString(message,'id')) ,'id' )) as id,
visitParamExtractString(message ,'name' ) as name,
toFloat64(visitParamExtractString(message, 'amount')) as amount,
round((amount - neighbor(amount, -1)),2) as diff
FROM stage1.log dl 
WHERE   time     >= '2023-04-22 13:01:00'
    and time     <= '2023-04-22 13:01:30'
    and name in ('apple')
    and amount > 0 
order by time







currently:

id          name    amount      diff
14045254695 apple   3392.92     3392.92
14045254695 apple   3408.39     15.47
14045254695 apple   3451.0      42.61
14045254729 apple   10.03       -3440.97  <- here is problem that script make
14045254729 apple   1700.0      1689.97     10.03 - 3451.0 calculation


wants to be like:

id          name    amount      diff
14045254695 apple   3392.92     3392.92
14045254695 apple   3408.39     15.47
14045254695 apple   3451.0      42.61
14045254729 apple   10.0        10.0
14045254729 apple   1700.0      1690

Solution

  • seems like you could use lag/leadinframe and 'partition by' the id

    https://clickhouse.com/docs/en/sql-reference/window-functions

    SELECT 
    toInt64(visitParamExtractString( (JSONExtractString(message,'id')) ,'id' )) as id,
    visitParamExtractString(message ,'name' ) as name,
    toFloat64(visitParamExtractString(message, 'amount')) as amount,
    round((amount - laginframe(amount, 1) over(partition by id),2) as diff
    FROM stage1.log dl 
    WHERE   time     >= '2023-04-22 13:01:00'
        and time     <= '2023-04-22 13:01:30'
        and name in ('apple')
        and amount > 0 
    order by time