sql sql-server linq entity-framework-4 geonames

Slow performance on LINQ query against SQL server

I want to find geographic names that starts with characters entered in a search box. Some geographic names have alternative names in other languages. These alternative names are stored in a separate table.

GN_Name 1 - 0:N GN_AlternateName

(PK)GN_Name.GeoNameId == (FK)GN_AlternateName.GeoNameId

I want to search the name in GN_AlternateName.AlternateName first and if that doesn't exist, use the corresponding GN_Name.Name.

I wrote following LINQ query:

return (from name in db.GN_Name
        where name.CountryCode == "se"
        join alt in db.GN_AlternateName 
        on name.GeoNameId equals alt.GeoNameId into outer
        from alt in outer.DefaultIfEmpty()
        where ((alt.IsoLanguage == "sv" && 
                alt.AlternateName.StartsWith(query)) || 
                name.Name.StartsWith(query))
        select new GeoNameModel { 
            Language = alt.IsoLanguage, 
            Name = (alt == null ? name.Name : alt.AlternateName),
            FeatureClass = name.FeatureClass, 
            FeatureCode = name.FeatureCode, 
            GeoNameId = name.GeoNameId, 
            UniqueName = name.UniqueName,
            UniqueCount = name.UniqueCount}).Take(HB.AutoCompleteCount);

That translates into the following SQL:

exec sp_executesql N'SELECT 
[Limit1].[GeoNameId] AS [GeoNameId], 
[Limit1].[IsoLanguage] AS [IsoLanguage], 
[Limit1].[C1] AS [C1], 
[Limit1].[FeatureClass] AS [FeatureClass], 
[Limit1].[FeatureCode] AS [FeatureCode], 
[Limit1].[UniqueName] AS [UniqueName], 
[Limit1].[UniqueCount] AS [UniqueCount]
FROM ( SELECT TOP (5) 
    [Extent1].[GeoNameId] AS [GeoNameId], 
    [Extent1].[FeatureClass] AS [FeatureClass], 
    [Extent1].[FeatureCode] AS [FeatureCode], 
    [Extent1].[UniqueName] AS [UniqueName], 
    [Extent1].[UniqueCount] AS [UniqueCount], 
    CASE WHEN ([Extent2].[AlternateNameId] IS NULL) THEN [Extent1].[Name] ELSE [Extent2].[AlternateName] END AS [C1], 
    [Extent2].[IsoLanguage] AS [IsoLanguage]
    FROM  [dbo].[GN_Name] AS [Extent1]
    LEFT OUTER JOIN [dbo].[GN_AlternateName] AS [Extent2] ON [Extent1].[GeoNameId] = [Extent2].[GeoNameId]
    WHERE (''se'' = [Extent1].[CountryCode]) AND (((''sv'' = [Extent2].[IsoLanguage]) AND ([Extent2].[AlternateName] LIKE @p__linq__0 ESCAPE N''~'')) OR ([Extent1].[Name] LIKE @p__linq__1 ESCAPE N''~''))
)  AS [Limit1]',N'@p__linq__0 nvarchar(4000),@p__linq__1 nvarchar(4000)',@p__linq__0=N'ja%',@p__linq__1=N'ja%'

I can't really see whats wrong with it, but it takes around 5 seconds to complete.

Should i add some index? Maybe set up an indexed view? My SQL server knowledge is limited and i would love to get back to some real coding ;)

Any suggestions warmly appreciated!

UPDATE I'm using SQL server 2008. Following the instructions of taylonr i got the following results. Client statistics Execution plan Execution plan details

There are 3 "parts" that make up 100% of the total. I, however, don't have a clue on how to use these statistics.

UPDATE 2

SSMS Execution Plan recommended the following index:

CREATE NONCLUSTERED INDEX IX_GN_Name_CountryCode
ON [dbo].[GN_Name] ([CountryCode])
INCLUDE ([GeoNameId],[Name],[FeatureClass],[FeatureCode],[UniqueName],[UniqueCount])

I added it and the query now runs much better!

UPDATE 3 taylonr suggests using only one LIKE clause. I'm not sure how to accomplish this. Anyone up for the challenge?

Solution

First, I'd be careful about calling SQL not "real coding" since it looks like improvement there could help you out ;) (I'm a C# guy, and not a SQL expert, just sayin...)

Go in to your SSMS, and take the query that is generated.

Copy that into a new Query Window.

Now do 2 things before your run it. 1. Go into the Query menu and click "Include Client Statistics" 2. Go into the Query menu and click "Include Actual Execution plan"

Now run your query.

When the query is done, check the Client statistics for the item labeled "Wait time on server replies" this is the amount of time (in ms) that the server is executing for this query.

The "Total Execution time" is the amount of time it took the client & server to communicate the data.

That will give you an idea of what the time is like on the server. For example, if that's 10ms and it takes 5s to execute from your code, Sql might not be the problem.

Next, open your execution plan tab. This will show you how SQL generated this data. For example, if it spent 100% of the time doing a table scan (as opposed to an index scan) then you might want to add some indexes.

Take a look at the execution plan and see what has the highest percentage. This will give you an idea of where you might be able to optimize your query.

I would guess that having the two separate 'like' statements probably isn't helping much. Like statements aren't as performant as an equality, e.g.

WHERE name = 'taylonr'

is quicker than

WHERE name like 'taylo%'