Search code examples
mysqlsqlstored-proceduresdatabase-administration

Temporary Table in Stored Procedure is returning different values than in regular query


So. Weird Situation. I have a Users table in a database I interact with. It looks like this:

EXTENSIONID   USERID       FIRSTNAME         EXTENSION   ts
448980        4389116109   Johnny Gorman     51980       2019-07-15 03:00:24
683597        4389116109   James Sutherland  51980       2019-08-08 14:32:00
683598        4399610820   Sara Sarason      51802       2019-08-12 15:38:21
683599        4389116109   James Sutherland  51980       2019-08-12 16:32:32

Basically, USERID is a foreign key to our Employee Timekeeping tables and, problematically, it is not unique to one User: if we don't grant a new User a new USERID when we assign them their EXTENSION, they'll just take over the USERID of whoever the last holder of that EXTENSION was (as you see happening with Johnny Gorman and James Sutherland above).

In addition, as you can see happening with James Sutherland, a specific User can have multiple rows with the same USERID. The EXTENSIONID is the primary key of the table, and it is unique, but it is the foreign key to a different set of tables.

The timestamp field of the table is was last useful. So, in otherwords, the timestamp demarks that everything less than or equal to that timestamp belonged to that row until the prior timestamp.

In other words, to get my Users table attached to my timekeeping data, I used to have to do a correlated subquery based on the timestamp, like so:

select * from timekeeping_tables as tt
inner join USERS as u on tt.USERID = u.USERID 
    and u.ts = (select max(ts) from USERS where ts <= tt.ENDTIME and USERID = u.USERID)

This worked fine in all respects except speed: doing a correlated subquery on such a large dataset took forever. So instead, I came up with a method to create a list of Effective and Ineffective Dates for each row in the USERS table so that joining would be a trivial equality match on the USERID and a between clause on the Effective/Ineffective Dates. It looks like this:

select
    mx.FIRSTNAME as 'Name',
    mx.EXTENSION as 'Extension',
    mx.USERID,
    min(mx.Start_Time) as 'AGP_Start_Date',
    max(mx.End_Time) as 'AGP_End_Date'
from (
    select
        m3.FIRSTNAME,
        m3.EXTENSION,
        m3.Start_Time,
        case
            when @lastUSER <> m3.USERID
                then subdate(adddate(curdate(),interval 1 day), interval 1 second)
            else m3.End_Time
        end as 'End_Time',
        m3.ts,
        m3.USERID,
        @lastUSER := m3.USERID
    from (
        select
            m2.FIRSTNAME,
            m2.EXTENSION,
            m2.USERID,
            m2.ts,
            case
                when @lastUSER <> m2.USERID
                    then '2010-01-01 00:00:00'
                else @lastTS
            end as 'Start_Time',
            m2.ts as 'End_Time',
            @lastUSER := m2.USERID,
            @lastNAME := m2.FIRSTNAME,
            @lastTS := m2.ts
        from (
            select 
                m1.EXTENSIONID,
                m1.FIRSTNAME,
                m1.EXTENSION,
                m1.USERID,
                convert_tz(m1.ts,'UTC','America/Denver') as ts
            from
                mxuser as m1
            order by
                USERID asc,
                ts asc
        ) as m2,
            (select @lastUSER := 0,
            @lastNAME := '',
            @lastTS := '0000-00-00 00:00:00') as SQLVars
        order by
        m2.USERID asc,
        End_Time desc,
        Start_Time desc
    ) as m3,
        (select @lastUSER := 0) as SQLVars
    order by
        m3.USERID asc,
        m3.End_Time desc,
        m3.Start_Time desc

The result set of this query, when used with my example from the beginning would produce an output like this:

Name             Extension USERID      AGP_Start_Date      AGP_End_Date
Johnny Gorman    51980     4389116109  2010-01-01 00:00:00 2019-07-15 03:00:24
James Sutherland 51980     4389116109  2019-07-15 03:00:24 2019-08-08 14:32:00
James Sutherland 51980     4389116109  2019-08-08 14:32:01 2019-08-13 23:59:59
Sara Sarason     51802     4399610820  2010-01-01 00:00:00 2019-08-13 23:59:59

A nice, sorted list that's trivial to join to. There's no overlap between the effective and ineffective dates for people who share the same USERID: When I perform the join, I'm confident I'm joining to the one and only one Users row that was in effect for the timekeeping rows I'm joining to.

So long as I run it as a standalone query, that is.

When I run it inside my timekeeping stored procedure, it produces results like:

Name             Extension USERID      AGP_Start_Date      AGP_End_Date
Johnny Gorman    51980     4389116109  2016-01-01 00:00:00 2019-08-08 14:32:00
James Sutherland 51980     4389116109  2019-07-15 03:00:24 2019-08-08 14:32:00
James Sutherland 51980     4389116109  2019-08-08 14:32:01 2019-08-13 23:59:59
Sara Sarason     51802     4399610820  2010-01-01 00:00:00 2019-08-13 23:59:59

God, what? Now the date ranges overlap one another, which'll duplicate rows when I join this table to my timekeeping table. But the weird thing is that it will produce the same wrong result consistently.

Why is this happening? It seems to be something to do with the session variables I'm using to grab the values from the previous row for calculation in the current row; like they're calculating out of sync, or not getting refreshed properly between rows. Even when I kill my session and restart, the same thing happens.

(And before you ask, no CTEs aren't an option for me, unfortunately. :(


Solution

  • So, the answer was to wrap the User date-range query in its own Stored Procedure and call it within the timekeeping Stored Procedure. It works as intended now.

    I have no idea why.