Search code examples
sqlsql-servert-sqlunpivotcross-apply

Cross apply not performant, convert to unpivot (or other)?


Cross apply is very slow when inserting large data sets, I think unpivot (or something else) might be more efficient but I can't figure out how to do it other than by using cross apply:

CREATE TABLE LocData
    (Airport varchar(5), City varchar(5), Country varchar(5))
;

INSERT INTO LocData
    (Airport, City, Country)
VALUES
    ('LHR','LON','GB'),
    ('LGW','LON','GB'),
    ('LCY','LON','US'),
    ('JFK','NYC','US'),
    ('LGA','NYC','US'),
    ('EWR','NYC','US')
;

CREATE TABLE Travel
    (ID int, Dep varchar(5), Arr varchar(5))
;

INSERT INTO Travel
    (ID, Dep, Arr)
VALUES
    (1, 'LHR','JFK'),
    (2, 'LHR','EWR'),
    (3, 'LGA','LGW'),
    (4, 'LCY','EWR')
;

--the query

select * from Travel;

select 
    c.*
    from Travel t
    join LocData dep on dep.Airport=t.Dep
    join LocData arr on arr.Airport=t.Arr
    cross apply
    (
    values 
    (t.ID, dep.Airport, 0, arr.Airport, 0),
    (t.ID, dep.City, 1, arr.City, 1),
    (t.ID, dep.Country, 2, arr.Country, 2)
    ) c (ID, Dep, DepType, Arr, ArrType);

You can test it here (SQLFiddle)


Solution

  • Your query is:

    select c.*
    from Travel t join
         LocData dep
         on dep.Airport = t.Dep join
         LocData arr
         on arr.Airport = t.Arr cross apply
         (values (t.ID, dep.Airport, 0, arr.Airport, 0),
                 (t.ID, dep.City, 1, arr.City, 1),
                 (t.ID, dep.Country, 2, arr.Country, 2)
        ) c(ID, Dep, DepType, Arr, ArrType);
    

    This seems like a very efficient way to do what you want to do. It should have good performance.

    One thing that will improve performance is an index on Locadata(airport). You can also include the columns city and country (either include them or make them additional keys). I would define the table using it as a primary key (which automatically creates the best type of index):

    CREATE TABLE LocData (
        Airport varchar(5) PRIMARY KEY,
        City varchar(5),
        Country varchar(5)
    );
    

    (Making the reasonable assumption that it is never NULL and never duplicated.)

    In other words, I sincerely doubt that the apply affects performance. The joins are much more likely to be the culprit.