Search code examples
sqlmysqlsql-order-by

Unknown column [col_name] in 'order clause'


Select CONCAT ( Name, '(' , LEFT(occupation,1) , ')'   )
FROM occupations
UNION
Select 
CONCAT ('There are  a total of  ', Count(Occupation) ,' ', occupation, 's' ) as line
FROM occupations
Group by occupation
ORDER by line

Why can I not use 'line' in Order by even though it is in Select statement and gets executed before. I can use it correctly without the UNION but once I use UNION statement it is not working and I also tried Order by Count(occupation) it did not work


Solution

  • Why can I not use 'line' in Order by ...

    The question here is about the "why", rather than asking how to solve the issue, so I'll focus on that first.

    Remember that database tables (relations) are, by definition, unordered sets. Therefore, when you use UNION, the ORDER BY only has any meaning relative to the entire query. You cannot order by one of the component queries that make up the UNION, because the other component is not ordered and until an ORDER BY is applied the result of the UNION is not ordered.

    Furthermore, the column names of a component query within a UNION have no meaning for the result as a whole. For example, you can UNION two SELECT queries using columns with different names in the same position. When there is a conflict, the names from first component of the UNION are used. Likewise, other source columns from the tables/queries used in a UNION no longer have any meaning after the UNION (unless included as part of the UNION).

    Therefore, the ORDER BY expression from the question MUST only be in reference to a column defined for the UNION as a whole, not just the 2nd component, and at this point the line column no longer exists (it was absorbed into the larger UNION).

    I can hear it now: "But I want the order on the second query and don't care about the first."

    That's not allowed. If you don't care about the order in the first part of the query, then the entire order is non-deterministic and the database is free to order the entire result in whatever way it finds most efficient. Furthermore, you aren't even allowed to assume it will show results from the first component of the UNION before later components of the UNION (this may often happen, but it's not guaranteed!). If you care about the order at all, you MUST have an ORDER BY for the entire result set.

    The common solution to avoid this confusion nests the UNION into a larger SELECT query, and do all your ORDER BY with the outer query. Additionally, you can use an extra column projection if you need to sort one component before another:

    SELECT Line
    FROM (
        select concat( Name, '(' , LEFT(occupation,1) , ')'   ) as Line
           , 0 as Ordinal
        from occupations
    
        union
    
        select 
        concat('There are  a total of  ', count(Occupation) ,' ', occupation, 's' ) 
            , 1 as Ordinal
        from occupations
        group by occupation
    ) t
    ORDER BY Ordinal, Line
    

    See it here (with appreciation to the other answer saving me time on the fiddle):

    https://dbfiddle.uk/RnABtt2U