sql-server oracle-database join inference

Join tables with Implied / Inferred data

This would be an easy join except: Table A is explicit for all times and values, but Table B only records rows when the there is a change from the previous value. In looking at Table B one can easily infer the missing times and values, but how to put that into a query?

Data in A.time contains every minute and a corresponding A.Value.

A.Time...........A.Value

9:00...............3.4

9:01...............5.0

9:02...............5.3

9:03...............5.3

9:04...............5.3

and so on…..

Table B only contains rows where the B.value has changed from the previous value.

B.Time..............B.Value

9:00...................4

9:01...................4.1

This is blank, but I know it to be 9:02 / 4.1

This is blank, but I know it to be 9:03 / 4.1

9:04....................4.7

and so on…

I need to do a query that links A.Time and B.Value, but I need the query to understand that a missing time in Table B should be substituted by the B.value of the first B.Time preceeding it.

Final table should be

A.Time...............B.Value

9:00...................4

9:01...................4.1

9:02...................4.1

9:03...................4.1

9:04...................4.7

I am currently writing this for SQL Server, but I need an Oracle solution too Thanks in advance;

Solution

In Oracle, you can LEFT JOIN to get all the times and then use LAST_VALUE(b.value) IGNORE NULLS... to fill in the blanks. (NOTE: the ROWS BETWEEN... part is redundant with the ORDER BY in the OVER() clause, but I like it for extra clarity).

Like this:

SELECT a.time,
       LAST_VALUE (b.VALUE)
         IGNORE NULLS
         OVER (PARTITION BY NULL 
               ORDER BY a.time 
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM   table_a a 
LEFT JOIN table_b b ON b.time = a.time
ORDER BY a.time;

Here is a full example with test data:

with table_a ( time, value ) as
  ( SELECT '9:00', 3.4 FROM DUAL UNION ALL
    SELECT '9:01', 5.0 FROM DUAL UNION ALL
    SELECT '9:02', 5.3 FROM DUAL UNION ALL
    SELECT '9:03', 5.3 FROM DUAL UNION ALL
    SELECT '9:04', 5.3 FROM DUAL ),
table_b ( time, value ) as 
  ( SELECT '9:00', 4 FROM DUAL UNION ALL
    SELECT '9:01', 4.1 FROM DUAL UNION ALL
    SELECT '9:04', 4.7 FROM DUAL )
SELECT a.time,
       LAST_VALUE (b.VALUE)
         IGNORE NULLS
         OVER (PARTITION BY NULL 
               ORDER BY a.time 
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM   table_a a 
LEFT JOIN table_b b ON b.time = a.time
ORDER BY a.time;

An alternative (which might work on SQL Server) is to use OUTER APPLY. Like so:

SELECT a.time, b.value
FROM   table_a a 
OUTER APPLY ( SELECT * 
              FROM table_b b 
              WHERE b.time <= a.time 
              ORDER BY b.time desc 
              FETCH FIRST 1 ROW ONLY ) b
ORDER BY a.time;

Basically, this finds the most recent non-null value from table B for each row in table A.

SQL*SERVER Solution

Here is the OUTER APPLY syntax translated to SQL*Server:

with table_a ( time, value ) as
  ( SELECT '9:00', 3.4 UNION ALL
    SELECT '9:01', 5.0 UNION ALL
    SELECT '9:02', 5.3 UNION ALL
    SELECT '9:03', 5.3 UNION ALL
    SELECT '9:04', 5.3 ),
table_b ( time, value ) as 
  ( SELECT '9:00', 4 UNION ALL
    SELECT '9:01', 4.1 UNION ALL
    SELECT '9:04', 4.7 )
SELECT a.time, b.value
FROM   table_a a OUTER APPLY ( 
        SELECT * FROM table_b b 
        WHERE b.time <= a.time 
        ORDER BY b.time desc 
        OFFSET 0 ROWS
        FETCH NEXT 1 ROWS ONLY ) b
ORDER BY a.time;