I would like to know if it's possible for expressions that are part of the SELECT
statement list to be evaluated for rows not matching the WHERE
clause?
From the execution order documented here, it seems that the SELECT
gets evaluated long after the WHERE
, however I ran into a very weird problem with a real-life query similar to the query below.
To put you in context, in the example, the SomeOtherTable
has a a_varchar
column which always contains numerical values for the code
105, but may contain non-numerical values for other codes.
The query statement works:
SELECT an_id, an_integer FROM SomeTable
UNION ALL
SELECT an_id, CAST(a_varchar AS int)
FROM SomeOtherTable
WHERE code = 105
The following query complains about being unable to cast a_varchar
to int
:
SELECT 1
FROM (
SELECT an_id, an_integer FROM SomeTable
UNION ALL
SELECT an_id, CAST(a_varchar AS int)
FROM SomeOtherTable
WHERE code = 105
) i
INNER JOIN AnotherOne a
ON a.an_id = i.an_id
And finally, the following query works:
SELECT 1
FROM (
SELECT an_id, an_integer FROM SomeTable
UNION ALL
SELECT
an_id,
CASE code WHEN 105 THEN CAST(a_varchar AS int) ELSE NULL END
FROM SomeOtherTable
WHERE code = 105
) i
INNER JOIN AnotherOne a
ON a.an_id = i.an_id
Therefore, the only explanation I could find was that with the JOIN
, the query gets optimized differently in a way that CAST(a_varchar AS int)
gets executed even if code <> 105
.
The queries are run against SQL SERVER 2008.
Absolutely.
The documentation that you reference has a section called Logical Processing Order of the SELECT statement. This is not the physical processing order. It explains how the query itself is interpreted. For instance, an alias defined in the select
clause cannot be references in the where
clause, because the where
clause is logically processed first.
In fact, SQL Server has the ability to optimize queries by doing various data transformation operations when it reads the data. This is a nice performance benefit, because the data is in memory, locally, and the operations can simply be done in place. However, the following can fail with a run-time error:
select cast(a_varchar as int)
from table t
where a_varchar not like '%[^0-9]%';
The filter is applied after the attempt at conversion, in the real process flow. I happen to consider this a bug; presumably, the folks at Microsoft do not think so, because they have not bothered to fix this.
Two workarounds are available. The first is try_convert()
, which does conversions and returns NULL
for a failure instead of a run-time error. The second is the case
expression:
select (case when a_varchar not like '%[^0-9]%' then cast(a_varchar as int) end)
from table t
where a_varchar not like '%[^0-9]%';