My question is : is there any order of execution in ON
clauses in a typical INNER JOIN
?
My tables are filled in about a million entries each table, I need to join them easily and I want to optimize my query.
Here's my example:
SELECT *
FROM TableA
INNER JOIN TableB ON
(
[costly validations]
)
AND
TableB.Date1 > '2016-01-01' AND
TableA.Date2 > '2016-01-01' AND
TableB.Date1 = TableA.Date2 AND
TableA.Field1 = TableB.Field1 AND
TableA.Field2 = TableB.Field2 AND
TableA.Field3 = TableB.Field3 AND
TableA.Field4 = TableB.Field18
where the costly validations are:
(
CAST(
SUBSTRING(
Convert(nvarchar(50), TableA.MoneyField1),
2,
(len(Convert(nvarchar(50), TableA.MoneyField1)) - 1)
) as money
) = TableB.MoneyField2
AND
len(
Convert(
nvarchar(50),
CAST(
SUBSTRING(
Convert(nvarchar(50), TableA.MoneyField1),
2,
(len(Convert(nvarchar(50), TableA.MoneyField1)) - 1)
) as money
)
)
) = len(TableB.MoneyField2)
)
Validations ideas : We were removing a dollar sign ($) in our very old applications from some strings received by a third party application before converting the string into money. The third party application removed the $ sign and now we are removing a digit from our numbers (oops). The applications are fixed but our data is corrupted.
Question is: is the order in the ON
clause has any effect on the performance or optimization of the query? For example, if I put the costly validation at the end of the ON
clause, is there any changes (I don't see any on my side, I tried).
My assumption is No
since T-SQL is smarter than me.
so this:
ON
(
[costly validations]
)
AND
TableB.Date1 > '2016-01-01' AND
TableA.Date2 > '2016-01-01' AND
TableB.Date1 = TableA.Date2 AND
TableA.Field1 = TableB.Field1 AND
TableA.Field2 = TableB.Field2 AND
TableA.Field3 = TableB.Field3 AND
TableA.Field4 = TableB.Field18
versus this :
ON
TableB.Date1 > '2016-01-01' AND
TableA.Date2 > '2016-01-01' AND
TableB.Date1 = TableA.Date2 AND
TableA.Field1 = TableB.Field1 AND
TableA.Field2 = TableB.Field2 AND
TableA.Field3 = TableB.Field3 AND
TableA.Field4 = TableB.Field18 AND
(
[costly validations]
)
There is no guarantee on the order of evaluation. Both within a single ON
clause and across multiple ON
clauses and the WHERE
clause.1
All that is required by the standard is that certain clauses are evaluated in a particular logical order, with database systems free to re-order evaluation provided they produce results consistent with the logical order. SQL goes even further beyond that, in some cases though - leading to the situation that it's very difficult to write code with guaranteed guards (e.g. if you want to do a/b
and want to avoid the division by zero error, you have to jump through hoops to ensure you will not ever see that error. You can't just write b!=0 and a/b = <something>
)
1This is unlike a lot of other languages in which their boolean AND
and OR
operators are defined to perform short-circuiting, and only evaluate their right-hand expression if the left-hand expression is insufficient to determine the entire result.