I have a stored procedure which is building a dynamic sql query and then running it via exec(@sql).
The stored proc is joining about 12 tables. As it was, it was running relatively quickly. But then i needed to added in an additional field. To do this, i created a scalar function, which looks like this:
SELECT @weight = @weight +COUNT(*) FROM dbo.UserPDMedication WHERE UserID = @userid
SELECT @weight = @weight +COUNT(*) FROM dbo.[User] WHERE UserID = @userid AND HoehnYarhID IS NOT null
SELECT @weight = @weight +COUNT(*) FROM dbo.[User] WHERE UserID = @userid AND DateOfBirth IS NOT NULL
SELECT @weight = @weight +COUNT(*) FROM dbo.[User] WHERE UserID = @userid AND GenderID IS NOT NULL
SELECT @weight = @weight +COUNT(*) FROM dbo.[User] WHERE UserID = @userid AND DateDiagnosed IS NOT null
It's basically just a function that will return an int based on how many questions a user has filled out. So for each user in the stored proc, this function gets called. The stored proc looks like this:
SELECT DISTINCT u.UserID, u.Healthy, u.DateOfBirth, u.City, st.StateCode AS State, u.GenderID, g.Gender, u.Latitude, u.Longitude, u.PDConditionID, u.Zip, u.Distance,
(SELECT TOP 1 EmailID FROM Messages m WHERE TrialID = ' + @trialID + ' AND ToUserID = u.userid AND LocationID = ' + @locationID + ') AS MessageID, dbo.UserWeightedValue(u.UserID) as wt
FROM [User] u
INNER JOIN aspnet_UsersInRoles uir ON u.AspnetUserID = uir.UserId
INNER JOIN aspnet_Roles r ON uir.RoleId = r.RoleId
FULL JOIN UserHealthCondition uhc ON u.UserID = uhc.UserID
FULL JOIN UserMotorSymptom ums ON u.UserID = ums.UserID
FULL JOIN UserNonMotorSymptom unms ON u.UserID = unms.UserID
FULL JOIN UserPDMedication updm ON u.UserID = updm.UserID
FULL JOIN UserPDTreatment updt ON u.UserID = updt.UserID
FULL JOIN UserSupplement us ON u.UserID = us.UserID
FULL JOIN UserPDGeneticMarker updgm ON u.UserID = updgm.UserID
FULL JOIN UserFamilyMember ufm ON u.UserID = ufm.UserID
FULL JOIN State st ON u.StateID = st.ID
FULL JOIN Gender g ON u.GenderID = g.ID
WHERE u.UserID IS NOT NULL
(i removed some chunks to try and keep this short). This get's executed as a dynamic string in the stored proc. Any tips on how i can optimize this to speed things up?
Thanks
EDIT: i got this working using a combination of suggestions here. I kept my function as is although i combined the multiple select statements into 2 statements.I then took the original stored proc and changed the select to a select into ##temp. And then i ran my function against that temp table. Execution time dropped down to 3-4 seconds. I think I will have to give credit to grant for this question since it was his pointing out distinct that put me on the right trail. But thank you to everyone.
The DISTINCT is absolutely going to cause a performance hit as it does aggregations. Do you really need it? Frequently when you see DISTINCT it's an indication of a data or structural issue that is getting papered over by the ability to eliminate duplicates that the structure should elminate on it's own.
After that, instead of a correlated query in the SELECT list, I'd look to move that as a JOIN. It's not a sure fire win, but frequently the optimizer is better able to work that into the plan.
Based on the complexity of what you're presenting, I'd also look at the execution plan. First thing to check, do you have a full optimization or did it timeout. If it timed out, then you're dealing with a best guess, not a fully calculated "good enough" plan. If that's so, you need to look at simpllifying this query. If you have a good enough plan, see where the bottlenecks are within it.