Search code examples
sqlpostgresqlcountpaginationlimit

Run a query with a LIMIT/OFFSET and also get the total number of rows


For pagination purposes, I need a run a query with the LIMIT and OFFSET clauses. But I also need a count of the number of rows that would be returned by that query without the LIMIT and OFFSET clauses.

I want to run:

SELECT * FROM table WHERE /* whatever */ ORDER BY col1 LIMIT ? OFFSET ?

And:

SELECT COUNT(*) FROM table WHERE /* whatever */

At the same time. Is there a way to do that, particularly a way that lets Postgres optimize it, so that it's faster than running both individually?


Solution

  • Yes. With a simple window function.

    Add a column with the total count

    SELECT *, count(*) OVER() AS full_count
    FROM   tbl
    WHERE  /* whatever */
    ORDER  BY col1
    OFFSET ?
    LIMIT  ?

    Be aware that the cost will be substantially higher than without the total number. Postgres has to actually count all qualifying rows either way, which imposes a cost depending on the total number. See:

    Two separate queries (one for the result set, one for the total count) may or may not be faster. But the overhead of executing two separate queries and processing results often tips the scales. Depends on the nature of the query, indexes, resources, cardinalities ...

    However, as Dani pointed out, when OFFSET is at least as great as the number of rows returned from the base query, no rows are returned. So we get no full_count, either. If that's a rare case, just run a second query for the count in this case.
    If that's not acceptable, here is a single query always returning the full count, with a CTE and an OUTER JOIN. This adds more overhead and only makes sense for certain cases (expensive filters, few qualifying rows).

    WITH cte AS (
       SELECT *
       FROM   tbl
       WHERE  /* whatever */
       -- ORDER  BY col1 -- ①
       )
    SELECT *
    FROM  (
       TABLE  cte
       ORDER  BY col1
       LIMIT  ?
       OFFSET ?
       ) sub
    RIGHT  JOIN (SELECT count(*) FROM cte) c(full_count) ON true;
    

    ① Typically it does not pay to add (the same) ORDER BY in the CTE. That forces all rows to be sorted. With LIMIT, typically only a small fraction has to be sorted (with "top-N heapsort").

    You get one row of null values, with the full_count appended if OFFSET is too big. Else, it's appended to every row like in the first query.

    If a row with all null values is a possible valid result you have to check offset >= full_count to disambiguate the origin of the empty row.

    This still executes the base query only once. But it adds more overhead to the query and only pays if that's less than repeating the base query for the count.

    Either way, the total count is returned with every row (redundantly). Doesn't add much cost. But if that's an issue, you could instead ...

    Add a row with the total count

    The added row must match the row type of the query result, and the count must fit into the data type of one of the columns. A bit of a hack. Like:

    WITH cte AS (
       SELECT col1, col2, int_col3
       FROM   tbl
       WHERE  /* whatever */
       )
    SELECT null AS col1, null AS col2, count(*)::int AS int_col3  -- maybe cast the count
    FROM   cte
    UNION ALL
    (  -- parentheses required
    TABLE  cte
    ORDER  BY col1
    LIMIT  ?
    OFFSET ?
    );
    

    Again, sometimes it may be cheaper to just run a separate count (still in a single query!):

    SELECT null AS col1, null AS col2, count(*)::int AS int_col3
    FROM   tbl
    WHERE  /* whatever */
    UNION ALL
    (  -- parentheses required
    SELECT col1, col2, int_col3
    FROM   tbl
    WHERE  /* whatever */
    ORDER  BY col1
    LIMIT  ?
    OFFSET ?
    );
    

    About the syntax shortcut TABLE tbl: