Search code examples
sqlpostgresqluniongreatest-n-per-group

UNION works with CTE, but not without?


I'm using PostgreSQL 11.12.

I have this query:

select first_name, length(first_name)
from db_employee
where length(first_name) = 
(select max(length(first_name)) from db_employee)
order by 1 
limit 1
union
select first_name, length(first_name)
from db_employee
where length(first_name) = 
(select min(length(first_name)) from db_employee)
order by 1 
limit 1

and when I run it I get this error:

syntax error at or near "union"

If I use CTE it works:

with cte1 as
(select first_name, length(first_name)
from db_employee
where length(first_name) = 
(select max(length(first_name)) from db_employee)
order by 1 
limit 1),

cte2 as
(select first_name, length(first_name)
from db_employee
where length(first_name) = 
(select min(length(first_name)) from db_employee)
order by 1 
limit 1)

select * from cte1
union
select * from cte2;

Why does the first query result in syntax error? Is it possible to get it to work without using CTE?


Solution

  • To include a LIMIT clause per SELECT in a UNION query, you must add parentheses. Like:

    (  -- !
    SELECT first_name, length(first_name)
    FROM   db_employee
    WHERE  length(first_name) = (SELECT max(length(first_name)) FROM db_employee)
    ORDER  BY 1
    LIMIT  1
    ) -- !
    UNION
    ( -- !
    SELECT first_name, length(first_name)
    FROM   db_employee
    WHERE  length(first_name) = (SELECT min(length(first_name)) FROM db_employee)
    ORDER  BY 1
    LIMIT  1
    ) -- !
    

    Related:

    That said, your query can be optimized. Use instead:

    (
    SELECT first_name, length(first_name)
    FROM   db_employee
    ORDER  BY length(first_name) DESC NULLS LAST, first_name
    LIMIT  1
    )
    UNION ALL  -- my guess
    (
    SELECT first_name, length(first_name)
    FROM   db_employee
    ORDER  BY length(first_name), first_name
    LIMIT  1
    );
    

    Subtle difference: this does not completely exclude rows with first_name IS NULL, but that only shows if all rows have first_name IS NULL.

    About NULLS LAST: