Search code examples
sqlperformancesql-server-cesql-server-ce-4

how to fix this query performance on SQL Server Compact Edition 4


i have the following SQL Query which runs on SQL Server CE 4

SELECT [Join_ReleaseMinDatePost].[FK_MovieID]
  FROM ( 
          SELECT [FK_MovieID], MIN([DatePost]) AS [ReleaseMinDatePost] 
          FROM [Release] 
          GROUP BY [FK_MovieID] 
        ) [Join_ReleaseMinDatePost]
  INNER JOIN  
        ( 
          SELECT COUNT([ID]) AS [FolderCount], [FK_MovieID] 
          FROM [MovieFolder] 
          GROUP BY [FK_MovieID] 
        )  [Join_MovieFolder]
    ON [Join_MovieFolder].[FK_MovieID] = [Join_ReleaseMinDatePost].[FK_MovieID]

this query takes a long time to execute but if i change the Part

SELECT COUNT([ID]) AS [FolderCount], [FK_MovieID] FROM [MovieFolder] GROUP BY [FK_MovieID]

To

SELECT 1 AS [FolderCount], [FK_MovieID] FROM [MovieFolder]

So the full query becomes

SELECT [Join_ReleaseMinDatePost].[FK_MovieID]
FROM ( SELECT [FK_MovieID], MIN([DatePost]) AS [ReleaseMinDatePost] FROM [Release] GROUP BY [FK_MovieID] ) [Join_ReleaseMinDatePost]
INNER  JOIN  (SELECT 1 AS [FolderCount], [FK_MovieID] FROM [MovieFolder] ) [Join_MovieFolder]
ON [Join_MovieFolder].[FK_MovieID] = [Join_ReleaseMinDatePost].[FK_MovieID]

then the performance becomes very fast.
the problem is that the part that was changed if taken by itself is pretty fast. but for some reason the execution plan of the first query shows that the "actual number of rows" in the index scan is 160,016 while the total number of rows in the table MovieFolder is 2,192. and the "Estimated number of rows" is 2,192.
so i think the problem is in the number of rows but i cant figure out why its all messed up.
any help will be appreciated :) thanks

the schema of the tables is below

CREATE TABLE [Release] (
  [ID] int NOT NULL
, [FD_ForumID] int NOT NULL
, [FK_MovieID] int NULL
, [DatePost] datetime NULL
);
GO
ALTER TABLE [Release] ADD CONSTRAINT [PK__Release__0000000000000052] PRIMARY KEY ([ID]);
GO
CREATE INDEX [IX_Release_DatePost] ON [Release] ([DatePost] ASC);
GO
CREATE INDEX [IX_Release_FD_ForumID] ON [Release] ([FD_ForumID] ASC);
GO
CREATE INDEX [IX_Release_FK_MovieID] ON [Release] ([FK_MovieID] ASC);
GO
CREATE UNIQUE INDEX [UQ__Release__0000000000000057] ON [Release] ([ID] ASC);
GO

CREATE TABLE [MovieFolder] (
  [ID] int NOT NULL  IDENTITY (1,1)
, [Path] nvarchar(500) NOT NULL
, [FK_MovieID] int NULL
, [Seen] bit NULL
);
GO
ALTER TABLE [MovieFolder] ADD CONSTRAINT [PK_MovieFolder] PRIMARY KEY ([ID]);
GO
CREATE INDEX [IX_MovieFolder_FK_MovieID] ON [MovieFolder] ([FK_MovieID] ASC);
GO
CREATE INDEX [IX_MovieFolder_Seen] ON [MovieFolder] ([Seen] ASC);
GO
CREATE UNIQUE INDEX [UQ__MovieFolder__0000000000000019] ON [MovieFolder] ([ID] ASC);
GO
CREATE UNIQUE INDEX [UQ__MovieFolder__0000000000000020] ON [MovieFolder] ([Path] ASC);
GO

Solution

  • I think you're running into a correlated subquery problem. The query part you're experimenting with is part of a JOIN condition, so it is fully evaluated for every potentially matching row. You're making your SQL engine do the second 'GROUP BY' for every row produced by the FROM clause. So it's reading 2192 rows to do the group by for each and every row produced by the FROM clause.

    This suggest you're getting 73 rows in the FROM clause grouping (2192 * 73 = 160 016)

    When you change it to do SELECT 1, you eliminate the table-scan read for grouping.