Search code examples
sqlgoogle-bigquerywindow-functions

Cumulative Distribution Window Function in Big Query


I have a table with the number of events that have happened each day:

--ref_date--|--num_events

I want to create a window function that evaluates for each day what is the percentage of days in the past that have had less events than in the current day (a cumulative distribution function).

This is what I have tried:

SELECT
    ref_date,
    num_events,
    AVG(CASE WHEN num_events < LAG(events, 0) OVER (ORDER BY ref_date ASC) THEN 1 ELSE 0 END)
        OVER (ORDER BY ref_date ASC RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cumulative_distribution
FROM initial_table

But I am getting the error "Analytic function cannot be an argument of another analytic function".

Do you know how to create this window function?


Solution

  • A scalar subquery should work:

    select ref_date,
        (
        select count(case when t2.num_events < t1.num_events then 1 end) * 1.0 / count(*)
        from T t2
        where t2.ref_date < t1.ref_date
        ) as cd
    from T t1;