Vertica has a very nice type of operations: Event-Based Window operations, which basically let you identify when an event occurs. For example the conditional_true_event will increment a counter each time the given boolean expression resolves to true. We use this kind of approach heavily.
We are thinking about moving to RedShift, but we would need a similar function. RedShift has some nice window functions, but I can't find this one.
Is there any way I can emulate this function using RedShift?
The CONDITIONAL_TRUE_EVENT()
is rather easy to write with window functions. It's just a COUNT
with a conditional (CASE
):
SELECT ts, symbol, bid,
CONDITIONAL_TRUE_EVENT(bid > 10.6)
OVER (ORDER BY ts) AS oce
FROM Tickstore3
ORDER BY ts ;
becomes:
SELECT ts, symbol, bid,
COUNT(CASE WHEN bid > 10.6 THEN 1 END)
OVER (ORDER BY ts
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
AS oce
FROM Tickstore3
ORDER BY ts ;
The CONDITIONAL_CHANGE_EVENT()
is more complicated because it needs to use the previous value. It can be emulated using LAG()
and SUM()
or COUNT()
(or ROW_NUMBER()
). But it will require I think a CTE or a derived table (or a self-join):
SELECT ts, symbol, bid,
CONDITIONAL_CHANGE_EVENT(bid)
OVER (ORDER BY ts) AS cce
FROM Tickstore3
ORDER BY ts ;
will become:
WITH emu AS
( SELECT ts, symbol, bid,
CASE WHEN bid <> LAG(bid) OVER (ORDER BY ts)
THEN 1
END AS change_bid
FROM Tickstore3
)
SELECT ts, symbol, bid,
COUNT(change_bid)
OVER (ORDER BY ts
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
AS cce
FROM emu
ORDER BY ts ;
I don't know how this CONDITIONAL_CHANGE_EVENT()
function behaves with nulls. If there are NULL
values in the checked for changes column - and you want to see if there is a change from the last non-null value and not just the previous one - the rewrite will be even more complicated.
Edit: As far as I understand Redshift's documentation an explicit window frame (ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) is required for window aggregates when there is an ORDER BY
. So, you can/have to use that (or whatever the default frame is in Vertica for these cases. It's either the above or with RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
).