i have those tables
table1
| id | name |
| 1 | axe |
| 2 | bow |
| 3 | car |
| 4 | dart |
table2 table3
| t1_id | number | | t1_id | letter |
| 1 | 5 | | 1 | a |
| 1 | 6 | | 1 | b |
| 1 | 2 | | 1 | c |
| 2 | 2 | | 2 | a |
| 2 | 2 | | 2 | c |
| 2 | 3 | | 2 | r |
| 3 | 8 | | 3 | y |
| 3 | 3 | | 3 | i |
| 3 | 1 | | 3 | a |
| 4 | 8 | | 4 | a |
| 4 | 9 | | 4 | b |
| 4 | 10 | | 4 | c |
and table1(id) is linked with table2(t1_id), table3(t1_id)
i run it to get them order by the highest letter_count match then by the highest average_number match to get this proper result http://www.sqlfiddle.com/#!9/69086b/8/0
SELECT
t1.id,
t1.name
FROM
table1 t1
INNER JOIN
table2 t2
ON t2.t1_id = t1.id
LEFT JOIN
table3 t3
ON t3.t1_id = t1.id
AND t3.letter IN ('a', 'b', 'c')
GROUP BY
t1.id
ORDER BY
COUNT(t3.letter) DESC,
AVG(t2.number) DESC
| id | name |
| 4 | dart |
| 1 | axe |
| 2 | bow |
| 3 | car |
and everything is working ok
but when i wanted to check if there is any problems with the query i decided to check the letter_count and avg_number so i used this query
SELECT
t1.id,
t1.name,
COUNT(t3.letter) AS letter_count,
AVG(t2.number) AS avg_number
FROM
table1 t1
INNER JOIN
table2 t2
ON t2.t1_id = t1.id
LEFT JOIN
table3 t3
ON t3.t1_id = t1.id
AND t3.letter IN ('a', 'b', 'c')
GROUP BY
t1.id
ORDER BY
letter_count DESC,
avg_number DESC
what i expected the result to be was
| id | name | letter_count | avg_number |
| 4 | dart | 3 | 9 |
| 1 | axe | 3 | 4.3333333333 |
| 2 | bow | 2 | 2.3333333333 |
| 3 | car | 1 | 4 |
but the result i got was http://www.sqlfiddle.com/#!9/69086b/3/0
| id | name | letter_count | avg_number |
| 4 | dart | 9 | 9 |
| 1 | axe | 9 | 4.3333333333 |
| 2 | bow | 6 | 2.3333333333 |
| 3 | car | 3 | 4 |
what surprised me was the multiplied row of letter_count which can be solved by derived queries but i don't want to select the letter_count or number_average i only want to ORDER BY them
does keeping the query like it is with ORDER BY only wont affect the query performance or should i still use derived queries even if i don't need to select the data values since the order is correct anyway or would derived query be faster in huge tables?
You've really asked 2 questions here:
ORDER BY
clause affect query performanceEither way, to evaluate the ORDER BY
clause, the expressions need to be evaluated so that the order can be determined. In your first example you need to specify the expressions because those columns are not contained within the SELECT
statement.
However in your second query you have selected the columns that you want to order by and because ORDER BY
is evaluated AFTER the entire query has been processed, you can simply use the column ALIAS in the ORDER BY
clause, instead of executing the function again.
some RDBMS query optimisers will convert your expressions in the
ORDER BY
statements to use the column alias for you IF you are ordering on expressions that exist in yourSELECT
clause
You were right to do this however, your letter count expression is not current due to the duplicates in the response.
You can simply change the COUNT
expression to use a distinct clause to only count unique values.
COUNT(DICTINCT t3.letter)
This makes your original query now look like this:
SELECT
t1.id,
t1.name
FROM
table1 t1
INNER JOIN
table2 t2
ON t2.t1_id = t1.id
LEFT JOIN
table3 t3
ON t3.t1_id = t1.id
AND t3.letter IN ('a', 'b', 'c')
GROUP BY
t1.id
ORDER BY
COUNT(DICTINCT t3.letter) DESC,
AVG(t2.number) DESC