Search code examples
sql-servert-sqlcorrelated-subquery

What is the most efficient way to correlate a subquery on a big table in T-SQL?


I have a table that contains Make, Model, Serial Number, and Invoice Date of machine sales, and I want to pair that up with a table that contains Make, Serial Number, Recorded Usage, Usage Units, and Record Date - except that the Usage/Record Table is HUUUUUUGE and may not have a record for every machine.

I've tried writing an OUTER JOIN, but there's too much data in the Usage/Records table to make this operate efficiently. And I tried to write a CROSS APPLY, but I must have screwed something up, because that didn't seem to work very effectively, either.

Example of files:

My Base Query:

Inv. Date      Mk      Model      Serial
2019-03-29     AA      420D       0FDP09999
2019-03-21     AA      A19B-SSL   0DX240481

Usage/Records Table:

Mk      Serial      Usage      Units      Record Date
AA      0FDP09999   2345.0     H          2019-03-27
AA      0FDP09999   2349.2     H          2019-03-28
AA      0FDP09999   2351.8     H          2019-03-29
AA      0DX240481   0.0        H          2019-03-21
AA      0DX240481   24.0       H          2019-03-22

The output should be:

Inv. Date      Mk      Model      Serial      Usage      Units      Record Date
2019-03-29     AA      420D       0FDP09999   2351.8     H          2019-03-29
2019-03-21     AA      A19B-SSL   0DX240481   0.0        H          2019-03-21

... returning the Usage, Units, and Record Date of ONLY the most recent entry prior to the Invoice Date.

Any suggestions?


Solution

  • To solve this, I ended up creating additional queries outside what was originally my base query.

    On the first outer query, I did this ("Invoice Number" is an additional field I invoked to ensure a unique row-numbering, in case a machine was sold once, bought back, and then sold again within the time period):

    CASE
        WHEN Q1.[Usage] IS NULL
        THEN 1
        ELSE ROW_NUMBER() OVER (PARTITION BY Q1.[Serial Number], Q1.[Mk], Q1.[Invoice Number] ORDER BY Q1.[Record Date] DESC)
    END AS [RowNum]
    

    This ensures that every entry in the table has a sorting mechanism, even if there's no Usage measurement in the joined table.

    Then, the next outer query only grabbed the rows with RowNum = 1.