Search code examples
sql-servert-sqlcross-applyouter-apply

Is there any difference between CROSS APPLY and OUTER APPLY when creating a cartesian product?


When creating a cartesian product between two tables, is there any difference between CROSS APPLY and OUTER APPLY?

This may seem like a silly question given that without a relationship expressed between the tables, the right-hand table can't fail to satisfy the relation, but I'm respectful of what I don't know.

When I look at the execution plans with a simple test setup, they're identical [two index seeks feeding into Nested Loops (Inner Join)], but simple test setups can be deceptive.

Here's an example of what I mean (SQL Fiddle). The setup:

CREATE TABLE dbo.First (
    Id      INT IDENTITY(1, 1) PRIMARY KEY,
    Name    NVARCHAR(100)
);
GO
DECLARE @n INT = 1;
WHILE @n < 10000
BEGIN
    INSERT INTO dbo.First (Name) VALUES ('First' + CONVERT(NVARCHAR(100), @n));
    SET @n = @n + 1;
END
GO
CREATE INDEX IX__First__Name ON dbo.First(Name);
GO
CREATE TABLE dbo.Second (
    Id      INT IDENTITY(1, 1) PRIMARY KEY,
    Name    NVARCHAR(100)
);
GO
DECLARE @n INT = 1;
WHILE @n < 10000
BEGIN
    INSERT INTO dbo.Second (Name) VALUES ('Second' + CONVERT(NVARCHAR(100), @n));
    SET @n = @n + 1;
END
GO
CREATE INDEX IX__Second__Name ON dbo.Second(Name);
GO

Using CROSS APPLY:

SELECT      First.Id AS FirstId, Second.Id AS SecondId
FROM        First
CROSS APPLY Second
WHERE       First.Name IN ('First253', 'First3304')
AND         Second.Name IN ('Second6543', 'Second517');

Using OUTER APPLY:

SELECT      First.Id AS FirstId, Second.Id AS SecondId
FROM        First
OUTER APPLY Second                                 -- <== Only change is here
WHERE       First.Name IN ('First253', 'First3304')
AND         Second.Name IN ('Second6543', 'Second517');

...both of which give me the expected four rows.

Plus various variations where either, or both, IN clauses return no matches:

-- No match in First
SELECT      First.Id AS FirstId, Second.Id AS SecondId
FROM        First
CROSS APPLY Second
WHERE       First.Name IN ('no match')
AND         Second.Name IN ('Second6543', 'Second517');

SELECT      First.Id AS FirstId, Second.Id AS SecondId
FROM        First
OUTER APPLY Second
WHERE       First.Name IN ('no match')
AND         Second.Name IN ('Second6543', 'Second517');

-- No match in Second
SELECT      First.Id AS FirstId, Second.Id AS SecondId
FROM        First
CROSS APPLY Second
WHERE       First.Name IN ('First253', 'First3304')
AND         Second.Name IN ('no match');

SELECT      First.Id AS FirstId, Second.Id AS SecondId
FROM        First
OUTER APPLY Second
WHERE       First.Name IN ('First253', 'First3304')
AND         Second.Name IN ('no match');

-- No match in either
SELECT      First.Id AS FirstId, Second.Id AS SecondId
FROM        First
CROSS APPLY Second
WHERE       First.Name IN ('no match')
AND         Second.Name IN ('no match');

SELECT      First.Id AS FirstId, Second.Id AS SecondId
FROM        First
OUTER APPLY Second
WHERE       First.Name IN ('no match')
AND         Second.Name IN ('no match');

...all of which give me the expected zero rows.


Solution

  • The difference comes into play when applied table or table-valued function has no records:

    SELECT      First.Id AS FirstId, Second.Id AS SecondId
    FROM        First
    OUTER APPLY (SELECT * FROM Second WHERE Second.Id = -1) Second
    WHERE       First.Name IN ('First253', 'First3304');
    
    2 rows returned
    
    
    SELECT      First.Id AS FirstId, Second.Id AS SecondId
    FROM        First
    CROSS APPLY (SELECT * FROM Second WHERE Second.Id = -1) Second
    WHERE       First.Name IN ('First253', 'First3304');
    
    0 rows returned
    

    In OP's own words:

    Not the way you're doing it, because conceptually you're filtering with WHERE after the APPLY (although the plans show the engine optimizing by doing it first); but if you explicitly filter first and then APPLY like this:

    SELECT      First.Id AS FirstId, FilteredSecond.Id AS SecondId
    FROM        First
    CROSS APPLY (SELECT Id FROM Second WHERE Name IN ('xxx')) FilteredSecond 
    WHERE       First.Name IN ('First253', 'First3304');
    

    you'd see the difference because you'd get rows with NULLs with the OUTER but no rows with the CROSS.