Search code examples
sqlgoogle-bigqueryrow-number

ROW_NUMBER() fails when table is too big


I am working with Bigquery and I need to use ROW_NUMBER() in order to get only the first row which matches some condition.

Example:

select *except(rn)
from (
SELECT
  *,
  ROW_NUMBER() OVER (PARTITION BY id order by timedate desc) AS rn
FROM
 table
)
where rn = 1

However, the query will fail because table is too big. How can I apply such logic without running out of resources?


Solution

  • Below is for BigQuery Standard SQL

    #standardSQL
    SELECT AS VALUE ARRAY_AGG(t ORDER BY timedate DESC LIMIT 1)[OFFSET(0)]
    FROM `project.dataset.table` t
    GROUP BY id
    

    You can test, play above with dummy data as below

    #standardSQL
    WITH `project.dataset.table` AS (
      SELECT 1 id, 2 timedate, 3 z UNION ALL
      SELECT 1,4,5 UNION ALL
      SELECT 1,6,7 UNION ALL
      SELECT 2,8,9 UNION ALL
      SELECT 2, 10, 11
    )
    SELECT AS VALUE ARRAY_AGG(t ORDER BY timedate DESC LIMIT 1)[OFFSET(0)]
    FROM `project.dataset.table` t
    GROUP BY id
    

    result is

    Row id  timedate    z    
    1   1   6           7    
    2   2   10          11