I thought CTE was kinda like a temporary table where I could reference to from the following query.
I'm confused on the error I am getting ONLY when using the WHERE
of the last query.
Certifiedby
field has name:date:id:
.
I am checking to see if certifiedby
is NULL
, if so make CERT_DATE
=1/1/1970, otherwise extract the date from Certifiedby
into CERT_DATE
.
If I run the first part of the query up to SELECT * FROM Certified
I get the expected results and the CERT_DATE
are all correct.
But when I add/execute the WHERE CERT_DATE BETWEEN '2/1/2023' AND '2/22/2023'
I get the error
Invalid length parameter passed to the LEFT or SUBSTRING function.
I thought once Certified
was created I could reference it's data? From the error it seems it's still trying to extract the date?
Clues? thank you
WITH Certified AS (
select
certifiedBy,
CASE WHEN certifiedBy IS NULL THEN '1/1/1970' ELSE SUBSTRING(certifiedBy, CHARINDEX(':', certifiedBy, 1) + 1, CHARINDEX(':', certifiedBy,
CHARINDEX(':', certifiedBy, 1) + 1) - CHARINDEX(':', certifiedBy, 1) - 1)
END AS CERT_DATE
From dbo.PO_Orders
WHERE poreference = 'shev'
)
SELECT * FROM Certified
WHERE CERT_DATE BETWEEN '2/1/2023' AND '2/22/2023'
ORDER BY CERT_DATE
I'm expecting to get back all records with CERT_DATE
within the date range.
The problem is the CERT_DATE
column in the CTE IS NOT A DATE. It's a STRING that happens to look kinda like a date to people with certain cultural backgrounds... but the computer and database only see the string. Therefore the BETWEEN
operator is doing an ordinal comparison between strings, rather than date comparisons.
There's an important concept to understand about date and datetime values when working with computers:
Cultural/internationalization issues mean converting to and from strings and date (or numeric) values is far more expensive and error-prone than we generally expect. It's something to avoid as much as possible.
This has a number of implications:
Date and Datetime values in the server are not stored as strings at all internally; it is not a human-readable format
Using strings to do date operations is generally the slowest and least-reliable option
There are certain preferred formats for date literals in your code that often do not match your own cultural background. On SQL Server you have these three options, and you should not deviate from them:
yyyy-MM-ddTHH:mm:ss[.fff]
yyyyMMdd HH:mm:ss[.fff]
yyyyMMdd
With that in mind, we can re-write the original code like this to follow better practice and get expected results:
WITH Certified AS (
SELECT certifiedBy,
coalesce(datefromparts(
right(SUBSTRING(CertifiedBy, 1, charindex(':', certifiedby, charindex('/', certifiedby))-1),4), -- year
REPLACE(SUBSTRING(certifiedBy, CHARINDEX(':', certifiedBy, 1) + 1, 2),'/',''), -- month
REPLACE(SUBSTRING(certifiedBy, CHARINDEX('/', certifiedBy, 1) + 1, 2),'/','') -- day
),'19700101') as CERT_DATE
FROM dbo.PO_Orders
WHERE poreference = 'shev'
)
SELECT *
FROM Certified
WHERE CERT_DATE >= '20230201' AND CERT_DATE < '20230223'
ORDER BY CERT_DATE
See it here:
Of course, even better if you can update the application to capture this as a real datetime column at INSERT/UPDATE time. That will perform better by multiple orders of magnitude.