Tags: google-bigquery, conditional-statements, limit

Conditionally LIMIT in BigQuery


I have read that in Postgres, setting LIMIT NULL effectively does not limit the results of the SELECT. However, in BigQuery, when I set LIMIT NULL based on a condition, I see "Syntax error: Unexpected keyword NULL".

I'd like to figure out a way to limit or not based on a condition (it could be an argument passed into a procedure, a parameter passed in by a query job, or anything else I can write a CASE or IF statement for). The mechanism for setting the condition shouldn't matter. What I'm looking for is whether there is a value for LIMIT that BigQuery accepts as valid syntax and that does not limit the results.


Solution

  • The LIMIT clause works differently in BigQuery. It specifies the maximum number of rows returned in the result, and the n in LIMIT n must be a constant INT64, so NULL is not accepted.

    There are two ways to keep a result within the limit on cached result size:

    • Using filters to limit the result set.
    • Using a LIMIT clause to reduce the result set, especially if you are using an ORDER BY clause.

    You can see this example:

    SELECT
      title
    FROM
      `my-project.mydataset.mytable`
    ORDER BY
      title DESC
    LIMIT
      100
    

    This returns at most 100 rows.

    It is best practice to add LIMIT when you are sorting a very large number of values. The BigQuery documentation has examples.

    If you want to return all rows from a table, you need to omit the LIMIT clause.

    SELECT
      title
    FROM
      `my-project.mydataset.mytable`
    ORDER BY
      title DESC
    

    This example returns every row in the table. Omitting LIMIT is not recommended for very large tables, because the query will consume a lot of resources.
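
    To choose between the two forms at run time, one option is to build the query text in a script and run it with EXECUTE IMMEDIATE. The following is only a sketch under assumptions: row_limit is a hypothetical script variable (it could equally be a procedure argument), and NULL is taken to mean "do not limit".

    ```sql
    -- row_limit is a hypothetical variable; NULL means "return all rows".
    DECLARE row_limit INT64 DEFAULT NULL;

    EXECUTE IMMEDIATE CONCAT(
      'SELECT title FROM `my-project.mydataset.mytable` ORDER BY title DESC',
      -- Append a LIMIT clause only when a limit was actually supplied.
      IF(row_limit IS NULL, '', FORMAT(' LIMIT %d', row_limit))
    );
    ```

    With row_limit set to 100, this should run the same statement as the first example. With NULL, it runs the second one, with no LIMIT clause at all.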

    One way to optimize resource usage is to use clustered tables, which can reduce both cost and query time. The BigQuery documentation explains in detail how they work.
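
    As a rough illustration, a clustered copy of the example table could be created as below. The name mytable_clustered is hypothetical, and clustering on title is an assumption that only makes sense if queries usually filter or sort on that column.

    ```sql
    -- Hypothetical clustered copy of the example table, clustered on title.
    CREATE TABLE `my-project.mydataset.mytable_clustered`
    CLUSTER BY title
    AS
    SELECT title
    FROM `my-project.mydataset.mytable`;
    ```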