Search code examples
sql-serversql-server-2012query-optimizationexistssql-view

Would multiple repeated EXISTS statements that are in multiple joined views get optimised in SQL Server


In the beginning there was one set of tables for all third parties and db was nice.... then...

One of our 3rd parties needed some of their data exposed so to hide all the other customer data solution was to create another set of tables(!!!) where we would duplicate data just for that 3rd party and give the access only to their set of mirrored tables.... until another 3rd party asked for the same... and then another....and then I came in...

ATM I am thinking on introducing a set of Views per 3rd party that sit in their own 3rd party Schema and do sub-selects from original tables... To identify in new Views which records to show to who I am planning on introducing RecordOwner table. Where I would have PK and owner columns for the records that need to be shown in that particular set of views. (Other option would be to add to 'main' table that holds PK add Owner column and identify owner from there.... but I see this as less flexible...)

My Views would look something like this (names are only for illustration):

CREATE VIEW [3rdPartyTable]
AS
   SELECT *
   FROM [OriginalTable] ot
   WHERE EXISTS (SELECT 1 FROM [RecordOwner] 
                 WHERE [Id] = ot.[Id] AND [Owner] = 'Owner1')

CREATE VIEW [3rdPartyTable2]
AS
    SELECT *
    FROM [OriginalTable2] ot
    WHERE EXISTS (SELECT 1 FROM [RecordOwner] 
                  WHERE [Id] = ot.[MasterId] AND [Owner] = 'Owner1')

I was thinking about performance as there are quite a few million records in those original tables. When each 3rd party will run their queries and will join their views will SQL Server be smart enough to optimize the multiple identical repeated

where exists (select 1 from [RecordOwner] 
              where [Id] = ot.[MasterId] and [Owner] = 'Owner1')

or not?

EDIT: Consider the following query 3rd party dev writes:

SELECT * from [3rdPartyTable2] t2 
inner join [3rdPartyTable3] t3 on t2.MasterId = t3.Masterid 

this then would translate to

SELECT * FROM
   (SELECT * FROM [OriginalTable2] ot 
   where exists (select 1 from [RecordOwner] where [Id] = ot.[MasterId] and [Owner] = 'Owner1')) t2
inner join 
   (SELECT * FROM [OriginalTable3] ot 
   where exists (select 1 from [RecordOwner] where [Id] = ot.[MasterId] and [Owner] = 'Owner1')) t3
on t2.MasterId = t3.Masterid 

and both tables have PK on MasterId.

Will SQL Server be smart enough to optimize multiple repeated EXISTS clauses in views?


Solution

  • TL;DR; Views that are constructed using JOINS are equal or faster then ones that are constructed using EXISTS (and they do get optimised on LEFT OUTER JOINS).

    Just as a confirmation of @Gordon's answer I did experiment, did the views, indexes and filled in 93% of PK record data in database, you can see from execution plan that when joining 3 tables it did 3 scans....

    enter image description here

    The overhead (~3% per join) is minimal though...

    EDIT: Ran another set of testing and execution plans are identical when joining views on LEFT OUTER JOIN but when doing INNER JOINS views that are constructed using JOIN rather then EXISTS outperform the latter...

    Example of view with JOIN:

    CREATE VIEW [dbo].[Summary2]
    AS
    SELECT st.*
      FROM [OriginalSummary] st
      inner join [RecordOwner] ro on ro.[Id] = st.[MasterId] and [Owner] = 'corp1'
    

    enter image description here

    Example of view with EXISTS:

    CREATE VIEW [dbo].[Summary]
    AS
    SELECT *
    FROM [OriginalSummary] ot
    WHERE EXISTS (SELECT 1 FROM [RecordOwner] 
                  WHERE [Id] = ot.[MasterId] AND [Owner] = 'corp1')
    GO
    

    enter image description here