Search code examples
sqlgoogle-bigqueryanalytic-functions

Using a window function in BigQuery to create running sum of active quarters


I am working to enhance a dataset by creating a column that would allow me to track how many active quarters a given company has had for a given row. A company is "active" if they recognize revenue within that quarter. Each row of my dataset represents one month's performance for a single company.

I have been able to use a WINDOW function to create a running sum for active months successfully:

COUNTIF(Revenue IS NOT NULL) OVER 
(partition by Company_Name ORDER BY month_end ASC ROWS BETWEEN unbounded preceding and current row) AS cumulative_active_months

I am now struggling to convert my logic to count the quarters rather than the months.

This is a rough idea of what my table currently looks like.

  Row   Month   Month_end    Fiscal_Quarter   Company_Name   Revenue   Active month count  
 ----- ------- ------------ ---------------- -------------- --------- -------------------- 
  1     Jul     2016-07-31   FY17-Q2          Foo            x,xxx     1                   
  2     Jul     2016-07-31   FY17-Q2          Bar            xxx,xxx   1                   
  3     Aug     2016-08-31   FY17-Q2          Foo            xx,xxx    2                   
  4     Aug     2016-08-31   FY17-Q2          Bar            xxx       2                   
  5     Sep     2016-09-30   FY17-Q2          Foo            xx        3                   
  6     Sep     2016-09-30   FY17-Q2          Bar            x,xxx     3                   
  7     Oct     2016-10-31   FY17-Q3          Foo            xx        4                   
  8     Oct     2016-10-31   FY17-Q3          Bar            Null      3                 

This what ideally I'd like for my table to look like.

  Row   Month   Month_end    Fiscal_Quarter   Company_Name   Revenue   Active month count   Active quarter count  
 ----- ------- ------------ ---------------- -------------- --------- -------------------- ---------------------- 
  1     Jul     2016-07-31   FY17-Q2          Foo            x,xxx     1                    1                     
  2     Jul     2016-07-31   FY17-Q2          Bar            xxx,xxx   1                    1                     
  3     Aug     2016-08-31   FY17-Q2          Foo            xx,xxx    2                    1                     
  4     Aug     2016-08-31   FY17-Q2          Bar            xxx       2                    1                     
  5     Sep     2016-09-30   FY17-Q2          Foo            xx        3                    1                     
  6     Sep     2016-09-30   FY17-Q2          Bar            x,xxx     3                    1                     
  7     Oct     2016-10-31   FY17-Q3          Foo            xx        4                    2                     
  8     Oct     2016-10-31   FY17-Q3          Bar            Null      3                    1                     

Solution

  • If this is counting active months:

    COUNTIF(Revenue IS NOT NULL) OVER (PARTITION BY Company_Name  ORDER BY month_end ASC) AS cumulative_active_months
    

    Then this is the corresponding count for quarters would use COUNT(DISTINCT):

    COUNT(DISTINCT CASE WHEN Revenue IS NOT NULL THEN Fiscal_Quarter END)  OVER (PARTITION BY Company_Name ORDER BY month_end ASC) AS cumulative_active_quarters
    

    Unfortunately, BigQuery does not support this, so you can use a subquery and cumulative sum:

    select t.* except (seqnum),
           countif(seqnum = 1) over (partition by company_name order by month_end) as cnt
    from (select t.*,
                 (case when revenue is not null
                       then row_number() over (partition by Company_Name, Fiscal_Quarter order by month_end)
                       else 0
                  end) as seqnum
          from t
         ) t;
    

    Note: This does not count the current quarter until there is revenue, which I think makes sense.