In the beginning there was one set of tables for all third parties and db was nice.... then...
One of our 3rd parties needed some of their data exposed so to hide all the other customer data solution was to create another set of tables(!!!) where we would duplicate data just for that 3rd party and give the access only to their set of mirrored tables.... until another 3rd party asked for the same... and then another....and then I came in...
ATM I am thinking on introducing a set of Views
per 3rd party that sit in their own 3rd party Schema
and do sub-selects from original tables... To identify in new Views
which records to show to who I am planning on introducing RecordOwner
table. Where I would have PK and owner columns for the records that need to be shown in that particular set of views. (Other option would be to add to 'main' table that holds PK add Owner column and identify owner from there.... but I see this as less flexible...)
My Views would look something like this (names are only for illustration):
CREATE VIEW [3rdPartyTable]
AS
SELECT *
FROM [OriginalTable] ot
WHERE EXISTS (SELECT 1 FROM [RecordOwner]
WHERE [Id] = ot.[Id] AND [Owner] = 'Owner1')
CREATE VIEW [3rdPartyTable2]
AS
SELECT *
FROM [OriginalTable2] ot
WHERE EXISTS (SELECT 1 FROM [RecordOwner]
WHERE [Id] = ot.[MasterId] AND [Owner] = 'Owner1')
I was thinking about performance as there are quite a few million records in those original tables. When each 3rd party will run their queries and will join their views will SQL Server be smart enough to optimize the multiple identical repeated
where exists (select 1 from [RecordOwner]
where [Id] = ot.[MasterId] and [Owner] = 'Owner1')
or not?
EDIT: Consider the following query 3rd party dev writes:
SELECT * from [3rdPartyTable2] t2
inner join [3rdPartyTable3] t3 on t2.MasterId = t3.Masterid
this then would translate to
SELECT * FROM
(SELECT * FROM [OriginalTable2] ot
where exists (select 1 from [RecordOwner] where [Id] = ot.[MasterId] and [Owner] = 'Owner1')) t2
inner join
(SELECT * FROM [OriginalTable3] ot
where exists (select 1 from [RecordOwner] where [Id] = ot.[MasterId] and [Owner] = 'Owner1')) t3
on t2.MasterId = t3.Masterid
and both tables have PK on MasterId
.
Will SQL Server be smart enough to optimize multiple repeated EXISTS
clauses in views?
TL;DR; Views that are constructed using JOINS
are equal or faster then ones that are constructed using EXISTS
(and they do get optimised on LEFT OUTER JOINS
).
Just as a confirmation of @Gordon's answer I did experiment, did the views, indexes and filled in 93% of PK record data in database, you can see from execution plan that when joining 3 tables it did 3 scans....
The overhead (~3% per join) is minimal though...
EDIT: Ran another set of testing and execution plans are identical when joining views on LEFT OUTER JOIN
but when doing INNER JOINS
views that are constructed using JOIN
rather then EXISTS
outperform the latter...
Example of view with JOIN:
CREATE VIEW [dbo].[Summary2]
AS
SELECT st.*
FROM [OriginalSummary] st
inner join [RecordOwner] ro on ro.[Id] = st.[MasterId] and [Owner] = 'corp1'
Example of view with EXISTS:
CREATE VIEW [dbo].[Summary]
AS
SELECT *
FROM [OriginalSummary] ot
WHERE EXISTS (SELECT 1 FROM [RecordOwner]
WHERE [Id] = ot.[MasterId] AND [Owner] = 'corp1')
GO