I have been struggling with this for a while.
I have a database with three tables (each of which has millions of records) as follows (removed some columns for simplicity):
1.[Entity]
[Id] UNIQUEIDENTIFIER PK,
[EntityLevel_Id] UNIQUEIDENTIFIER NOT NULL FK [EntityLevel] ([Id])
2.[EntityData]
[Id] UNIQUEIDENTIFIER PK,
[Entity_Id] UNIQUEIDENTIFIER NOT NULL FK [Entity] ([Id]),
[DataLanguage_Id] UNIQUEIDENTIFIER NOT NULL FK [Language] ([Id]),
[Code] NVARCHAR (250) NOT NULL
3.[EntityLevel]
[Id] UNIQUEIDENTIFIER PK,
[Sort] INT NOT NULL
There exists indexes As follows
[IX_Entity_EntityLevelId] ON [Entity] ([EntityLevel_Id])
[IX_EntityData_EntityId] ON [EntityData] ([Entity_Id])
[IX_EntityData_DataLanguageId_Code] ON [EntityData] ([DataLanguage_Id], [Code])
[IX_EntityLevel_Sort] ON [EntityLevel] ([Sort])
To eliminate the possibility that the slowness is because of the selected columns, I only select a fixed value
The following query runs very fast (less than 1 second):
SELECT TOP 20
1
FROM
[Entity]
INNER JOIN [EntityData] ON [Entity].[Id] = [EntityData].[Entity_Id]
INNER JOIN [EntityLevel] ON [Entity].[EntityLevel_Id] = [EntityLevel].[Id]
WHERE
[EntityData].[DataLanguage_Id] = 'B6930015-F177-4ED3-97B0-AAEF401F9265'
The following query also runs very fast:
SELECT TOP 20
1
FROM
[Entity]
INNER JOIN [EntityData] ON [Entity].[Id] = [EntityData].[Entity_Id]
INNER JOIN [EntityLevel] ON [Entity].[EntityLevel_Id] = [EntityLevel].[Id]
WHERE
[EntityData].[DataLanguage_Id] = 'B6930015-F177-4ED3-97B0-AAEF401F9265'
ORDER BY
[EntityData].[Code] ASC
And the following as well runs fast:
SELECT TOP 20
1
FROM
[Entity]
INNER JOIN [EntityData] ON [Entity].[Id] = [EntityData].[Entity_Id]
INNER JOIN [EntityLevel] ON [Entity].[EntityLevel_Id] = [EntityLevel].[Id]
WHERE
[EntityLevel].[Sort] = 1
BUT, the following query runs VERY SLOW (roughly 10 seconds):
SELECT TOP 20
1
FROM
[Entity]
INNER JOIN [EntityData] ON [Entity].[Id] = [EntityData].[Entity_Id]
INNER JOIN [EntityLevel] ON [Entity].[EntityLevel_Id] = [EntityLevel].[Id]
WHERE
[EntityData].[DataLanguage_Id] = 'B6930015-F177-4ED3-97B0-AAEF401F9265'
AND
[EntityLevel].[Sort] = 1
ORDER BY
[EntityData].[Code]
I don't know the reason, and I don't find any way to apply any more indexes so the query runs faster
Any help is appreciated!
Edit: the following queries also runs fast:
SELECT TOP 20
1
FROM
[Entity]
INNER JOIN [EntityData] ON [Entity].[Id] = [EntityData].[Entity_Id]
INNER JOIN [EntityLevel] ON [Entity].[EntityLevel_Id] = [EntityLevel].[Id]
WHERE
[EntityData].[DataLanguage_Id] = 'B6930015-F177-4ED3-97B0-AAEF401F9265'
AND
[EntityLevel].[Sort] = 1
And
SELECT TOP 20
1
FROM
[Entity]
INNER JOIN [EntityData] ON [Entity].[Id] = [EntityData].[Entity_Id]
INNER JOIN [EntityLevel] ON [Entity].[EntityLevel_Id] = [EntityLevel].[Id]
WHERE
[EntityLevel].[Sort] = 1
ORDER BY
[EntityData].[Code]
The problem appears only with order by and the two filters
For anyone who is concerned, this is the solution to the problem:
It turned out that with that much rows, what was taking time is to make key lookups and hash matches due to different indexes, so the solution was to add the following index to allow using of a single index on the table for all filter (and order by) values:
CREATE NONCLUSTERED INDEX [IX_EntityData_EntityId_DataLanguageId_IncCode] ON [EntityData] ([Entity_Id], [DataLanguage_Id]) INCLUDE ([Code])
But since I may also filter by code in some situations and not only use order by, so I modified the index to be like this:
CREATE NONCLUSTERED INDEX [IX_EntityData_EntityId_DataLanguageId_Code] ON [EntityData] ([Entity_Id], [DataLanguage_Id], [Code])