Search code examples
sqlsql-serverjoindatetime2

SQL select from multiple tables based on datetime


I am working on a script to analyze some data contained in thousands of tables on a SQL Server 2008 database.

For simplicity sakes, the tables can be broken down into groups of 4-8 semi-related tables. By semi-related I mean that they are data collections for the same item but they do not have any actual SQL relationship. Each table consists of a date-time stamp (datetime2 data type), value (can be a bit, int, or float depending on the particular item), and some other columns that are currently not of interest. The date-time stamp is set for every 15 minutes (on the quarter hour) within a few seconds; however, not all of the data is recorded precisely at the same time...

For example:

TABLE1:

TIMESTAMP                 VALUE
2014-11-27 07:15:00.390      1
2014-11-27 07:30:00.390      0
2014-11-27 07:45:00.373      0
2014-11-27 08:00:00.327      0

TABLE2:

TIMESTAMP                 VALUE
2014-11-19 08:00:07.880      0
2014-11-19 08:15:06.867      0.0979999974370003
2014-11-19 08:30:08.593      0.0979999974370003
2014-11-19 08:45:07.397      0.0979999974370003

TABLE3

TIMESTAMP                 VALUE
2014-11-27 07:15:00.390        0
2014-11-27 07:30:00.390        0
2014-11-27 07:45:00.373        1
2014-11-27 08:00:00.327        1

As you can see, not all of the tables will start with the same quarterly TIMESTAMP. Basically, what I am after is a query that will return the VALUE for each of the 3 tables for every 15 minute interval starting with the earliest TIMESTAMP out of the 3 tables. For the example given, I'd want to start at 2014-11-27 07:15 (don't care about seconds... thus, would need to allow for the timestamp to be +- 1 minute or so). Returning NULL for the value when there is no record for the particular TIMESTAMP is ok. So, the query for my listed example would return something like:

TIMESTAMP                 VALUE1   VALUE2             VALUE3
2014-11-27 07:15           1    NULL                  0
2014-11-27 07:30           0    NULL                  0
2014-11-27 07:45           0    NULL                  1
2014-11-27 08:00           0    NULL                  1
...
2014-11-19 08:00           0         0                        1
2014-11-19 08:15           0         0.0979999974370003       0
2014-11-19 08:30           0         0.0979999974370003       0
2014-11-19 08:45           0         0.0979999974370003       0

I hope this makes sense. Any help/pointers/guidance will be appreciated.


Solution

  • The first thing I would do is normalize the timestamps to the minute. You can do this with an update to the existing column

    UPDATE TABLENAME
      SET TIMESTAMP = dateadd(minute,datediff(minute,0,TIMESTAMP),0)
    

    or in a new column

    ALTER TABLE TABLENAME ADD COLUMN NORMTIME DATETIME;
    
    UPDATE TABLENAME
      SET NORMTIME = dateadd(minute,datediff(minute,0,TIMESTAMP),0)
    

    For details on flooring dates this see this post: Floor a date in SQL server


    The next step is to make a table that has all of the timestamps (normalized) that you expect to see -- that is every 15 -- one per row. Lets call this table TIME_PERIOD and the column EVENT_TIME for my examples (call it whatever you want).

    There are many ways to make such a table recursive CTE, ROW_NUMBER(), even brute force. I leave that part up to you.


    Now the problem is simple select with left joins and a filter for valid values like this:

    SELECT TP.EVENT_TIME, a.VALUE as VALUE1, b.VALUE as VALUE2, c.VALUE as VALUE3
    FROM  TIME_PERIOD TP
    LEFT JOIN TABLE1 a ON a.[TIMESTAMP] = TP.EVENT_TIME
    LEFT JOIN TABLE2 b ON b.[TIMESTAMP] = TP.EVENT_TIME
    LEFT JOIN TABLE3 c ON c.[TIMESTAMP] = TP.EVENT_TIME
    WHERE COALESCE(a.[TIMESTAMP], b.[TIMESTAMP], c.[TIMESTAMP]) is not null
    ORDER  BY TP.EVENT_TIME DESC 
    

    The where might get a little more complex if they are different types so you can always use this (which is not as good as coalesce but will always work):

    WHERE a.[TIMESTAMP] IS NOT NULL OR
          b.[TIMESTAMP] IS NOT NULL OR
          c.[TIMESTAMP] IS NOT NULL