So. Weird Situation. I have a Users table in a database I interact with. It looks like this:
EXTENSIONID USERID FIRSTNAME EXTENSION ts
448980 4389116109 Johnny Gorman 51980 2019-07-15 03:00:24
683597 4389116109 James Sutherland 51980 2019-08-08 14:32:00
683598 4399610820 Sara Sarason 51802 2019-08-12 15:38:21
683599 4389116109 James Sutherland 51980 2019-08-12 16:32:32
Basically, USERID is a foreign key to our Employee Timekeeping tables and, problematically, it is not unique to one User: if we don't grant a new User a new USERID when we assign them their EXTENSION, they'll just take over the USERID of whoever the last holder of that EXTENSION was (as you see happening with Johnny Gorman and James Sutherland above).
In addition, as you can see happening with James Sutherland, a specific User can have multiple rows with the same USERID. The EXTENSIONID is the primary key of the table, and it is unique, but it is the foreign key to a different set of tables.
The timestamp field of the table is was last useful. So, in otherwords, the timestamp demarks that everything less than or equal to that timestamp belonged to that row until the prior timestamp.
In other words, to get my Users table attached to my timekeeping data, I used to have to do a correlated subquery based on the timestamp, like so:
select * from timekeeping_tables as tt
inner join USERS as u on tt.USERID = u.USERID
and u.ts = (select max(ts) from USERS where ts <= tt.ENDTIME and USERID = u.USERID)
This worked fine in all respects except speed: doing a correlated subquery on such a large dataset took forever. So instead, I came up with a method to create a list of Effective and Ineffective Dates for each row in the USERS table so that joining would be a trivial equality match on the USERID and a between clause on the Effective/Ineffective Dates. It looks like this:
select
mx.FIRSTNAME as 'Name',
mx.EXTENSION as 'Extension',
mx.USERID,
min(mx.Start_Time) as 'AGP_Start_Date',
max(mx.End_Time) as 'AGP_End_Date'
from (
select
m3.FIRSTNAME,
m3.EXTENSION,
m3.Start_Time,
case
when @lastUSER <> m3.USERID
then subdate(adddate(curdate(),interval 1 day), interval 1 second)
else m3.End_Time
end as 'End_Time',
m3.ts,
m3.USERID,
@lastUSER := m3.USERID
from (
select
m2.FIRSTNAME,
m2.EXTENSION,
m2.USERID,
m2.ts,
case
when @lastUSER <> m2.USERID
then '2010-01-01 00:00:00'
else @lastTS
end as 'Start_Time',
m2.ts as 'End_Time',
@lastUSER := m2.USERID,
@lastNAME := m2.FIRSTNAME,
@lastTS := m2.ts
from (
select
m1.EXTENSIONID,
m1.FIRSTNAME,
m1.EXTENSION,
m1.USERID,
convert_tz(m1.ts,'UTC','America/Denver') as ts
from
mxuser as m1
order by
USERID asc,
ts asc
) as m2,
(select @lastUSER := 0,
@lastNAME := '',
@lastTS := '0000-00-00 00:00:00') as SQLVars
order by
m2.USERID asc,
End_Time desc,
Start_Time desc
) as m3,
(select @lastUSER := 0) as SQLVars
order by
m3.USERID asc,
m3.End_Time desc,
m3.Start_Time desc
The result set of this query, when used with my example from the beginning would produce an output like this:
Name Extension USERID AGP_Start_Date AGP_End_Date
Johnny Gorman 51980 4389116109 2010-01-01 00:00:00 2019-07-15 03:00:24
James Sutherland 51980 4389116109 2019-07-15 03:00:24 2019-08-08 14:32:00
James Sutherland 51980 4389116109 2019-08-08 14:32:01 2019-08-13 23:59:59
Sara Sarason 51802 4399610820 2010-01-01 00:00:00 2019-08-13 23:59:59
A nice, sorted list that's trivial to join to. There's no overlap between the effective and ineffective dates for people who share the same USERID: When I perform the join, I'm confident I'm joining to the one and only one Users row that was in effect for the timekeeping rows I'm joining to.
So long as I run it as a standalone query, that is.
When I run it inside my timekeeping stored procedure, it produces results like:
Name Extension USERID AGP_Start_Date AGP_End_Date
Johnny Gorman 51980 4389116109 2016-01-01 00:00:00 2019-08-08 14:32:00
James Sutherland 51980 4389116109 2019-07-15 03:00:24 2019-08-08 14:32:00
James Sutherland 51980 4389116109 2019-08-08 14:32:01 2019-08-13 23:59:59
Sara Sarason 51802 4399610820 2010-01-01 00:00:00 2019-08-13 23:59:59
God, what? Now the date ranges overlap one another, which'll duplicate rows when I join this table to my timekeeping table. But the weird thing is that it will produce the same wrong result consistently.
Why is this happening? It seems to be something to do with the session variables I'm using to grab the values from the previous row for calculation in the current row; like they're calculating out of sync, or not getting refreshed properly between rows. Even when I kill my session and restart, the same thing happens.
(And before you ask, no CTEs aren't an option for me, unfortunately. :(
So, the answer was to wrap the User date-range query in its own Stored Procedure and call it within the timekeeping Stored Procedure. It works as intended now.
I have no idea why.