I have the following code that runs a stored procedure repeatedly. It works pretty well when I run the SQL statement literally, so I created a stored procedure that encapsulated what I was doing.
foreach (string worker in workers)
{
    _gzClasses.ExecuteCommand("EXEC dbo.Session_Aggregate @workerId = {0}, @timeThresh = {1}", worker, SecondThreshold);
    Console.WriteLine("Inserted sessions for {0}", worker);
}
Then I wanted to know how many rows each call was generating, so I changed the SP slightly to return @@ROWCOUNT as an output parameter. I can't use the DataContext to execute commands with output parameters, so I had to change the code inside the foreach loop to the following:
using (var cn = new SqlConnection(CnStr))
{
    cn.Open();
    using (var cmd = new SqlCommand("Session_Aggregate", cn) {CommandTimeout = 300})
    {
        cmd.CommandType = CommandType.StoredProcedure;
        cmd.Parameters.AddWithValue("@workerId", worker);
        cmd.Parameters.AddWithValue("@timeThresh", SecondThreshold);
        SqlParameter sessions = cmd.Parameters.Add("@sessions", SqlDbType.Int);
        sessions.Direction = ParameterDirection.Output;
        cmd.ExecuteNonQuery();
        Console.WriteLine("Inserted {1} sessions for {0}", worker, sessions.Value);
    }
}
This works, but it runs MUCH slower than the other query. I thought it might be a case of parameter sniffing, so I changed it to CommandType.Text and used the string EXEC Session_Aggregate ... WITH RECOMPILE. But in that case, I keep getting an error saying that the output parameter @sessions is not defined. In any case, the query barely runs now, even though the SQL command runs in < 1 second in SSMS.
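A likely reason for the "not defined" error: with CommandType.Text, the SqlParameter objects are only variables available to the batch, so the EXEC text has to pass each one to the procedure itself and mark the output one with OUTPUT. A minimal sketch of the command text, assuming the same three parameters (@workerId, @timeThresh, and @sessions with Direction = Output) are still added to the command exactly as in the code above:

EXEC dbo.Session_Aggregate
    @workerId   = @workerId,           -- left side: procedure parameter, right side: SqlParameter from the client
    @timeThresh = @timeThresh,
    @sessions   = @sessions OUTPUT     -- OUTPUT is required here, or the value never flows back to the client
WITH RECOMPILE;                        -- compile a fresh plan for this call only

After ExecuteNonQuery returns, sessions.Value should be readable the same way as in the CommandType.StoredProcedure version.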
Here's the stored procedure, in case anyone can help figure out what is going on, or can figure out a way to speed things up. I would also take pointers on how to properly profile what is going on here. With CommandType.StoredProcedure I can't even see the actual command that is sent to SQL by VS.
PROCEDURE [dbo].[Session_Aggregate]
    -- Add the parameters for the stored procedure here
    @workerId varchar(64) = 0,
    @timeThresh dateTime = '13 July 2007 11:27:46',
    @sessions INT OUTPUT
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;
    -- Insert statements for procedure here
    INSERT INTO e_activeSessions
    SELECT *
    FROM (
        SELECT workerId, startTime, COUNT(*) as totalTasks, MAX(timeInSession) as totalTime,
            MIN(dwellTime) as minDwell, MAX(dwellTime) as maxDwell, AVG(dwellTime) as avgDwell, STDEV(dwellTime) as stdevDwell,
            SUM(CAST(wrong80 as INT)) + SUM(CAST(correct80 as INT)) as total80, SUM(CAST(correct80 as INT)) as correct80,
            SUM(CAST(correct80 as FLOAT)) / NULLIF(SUM(CAST(wrong80 as INT)) + SUM(CAST(correct80 as INT)), 0 ) as percent80
        FROM (
            SELECT *, (SELECT MAX(timeStamp)
                       FROM workerLog w where dwellTime is null AND timeInSession = 0 AND workerId = @workerId AND w.timeStamp <= workerLog.timeStamp
                       AND w.timeStamp >= @timeThresh) as startTime
            FROM workerLog where workerId = @workerId) t
        GROUP BY startTime, workerId) f
    WHERE startTime is NOT NULL AND f.totalTasks > 1 AND totalTime > 0;
    SET @sessions = @@ROWCOUNT;
END
EDIT: Regardless of the execution plan for the original query, it was sped up significantly by creating a temporary table. I thought that SQL Server would have done this itself by analyzing the query, but I was probably wrong. I also found out about the OPTIMIZE FOR UNKNOWN hint, which in newer versions of SQL Server mitigates the effect of parameter sniffing when a cached execution plan would otherwise be reused for massively different sizes of data.
PROCEDURE [dbo].[Session_Aggregate]
    -- Add the parameters for the stored procedure here
    @workerId varchar(64) = 0,
    @timeThresh dateTime = '13 July 2007 11:27:46',
    @sessions INT OUTPUT
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;
    -- Insert statements for procedure here
    CREATE TABLE #startTimes
    (
        startTime DATETIME
    );
    CREATE INDEX Idx_startTime ON #startTimes(startTime);
    INSERT INTO #startTimes
    SELECT timeStamp FROM workerLog
    WHERE dwellTime is null AND timeInSession = 0
        AND workerId = @workerId AND timeStamp >= @timeThresh;
    INSERT INTO e_activeSessions
    SELECT *
    FROM (
        SELECT workerId, startTime, COUNT(*) as totalTasks, MAX(timeInSession) as totalTime,
            MIN(dwellTime) as minDwell, MAX(dwellTime) as maxDwell, AVG(dwellTime) as avgDwell, STDEV(dwellTime) as stdevDwell,
            SUM(CAST(wrong80 as INT)) + SUM(CAST(correct80 as INT)) as total80, SUM(CAST(correct80 as INT)) as correct80,
            SUM(CAST(correct80 as FLOAT)) / NULLIF(SUM(CAST(wrong80 as INT)) + SUM(CAST(correct80 as INT)), 0 ) as percent80
        FROM (
            SELECT *, (SELECT MAX(startTime) FROM #startTimes where startTime <= workerLog.timeStamp) as startTime
            FROM workerLog where workerId = @workerId) t
        GROUP BY startTime, workerId) f
    WHERE startTime is NOT NULL AND f.totalTasks > 1 AND totalTime > 0
    OPTION (OPTIMIZE FOR UNKNOWN);
    SET @sessions = @@ROWCOUNT;
END;
Additional simplification: drag the SP to your DBML file and you can do the following:
foreach (string worker in workers)
{
    int? rows = 0;
    _gzClasses.Session_Aggregate(worker, SecondThreshold, ref rows);
    Console.WriteLine("Inserted {1} sessions for {0}", worker, rows);
}
Fire up SQL Server Profiler; it can show you the difference between your single query and the way you are running it now.
http://www.techrepublic.com/article/step-by-step-an-introduction-to-sql-server-profiler/5054787
But more importantly, you should probably look at the query execution plan, which you can turn on in SSMS from the Query menu by selecting Include Actual Execution Plan.
If you are really new to SSMS, I would read a couple of articles on top of the one I linked, but the query execution plan will show you where your query is lagging. (A basic rule of thumb: you don't want full table scans to occur; you want seeks, which means the query is searching on indexes and/or primary keys.) I am no DBA, but that is the route you would probably want to take when debugging your query.
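To get numbers alongside the plan, one option is to call the procedure from an SSMS query window with I/O and timing statistics switched on. This is only a sketch: the worker id is a placeholder, the transaction is there because the procedure inserts into e_activeSessions, and the SET ARITHABORT OFF line assumes the application uses the SqlClient default (SSMS defaults to ON, which can give SSMS a different cached plan than the application gets).

SET ARITHABORT OFF;        -- assumption: match the SqlClient default so SSMS reuses the application's plan
SET STATISTICS IO ON;      -- logical/physical reads per table appear in the Messages tab
SET STATISTICS TIME ON;    -- CPU and elapsed time per statement

BEGIN TRANSACTION;         -- the procedure inserts rows; roll back so the test leaves nothing behind

DECLARE @sessions INT;

EXEC dbo.Session_Aggregate
    @workerId   = 'EXAMPLE_WORKER',        -- placeholder, substitute a real worker id
    @timeThresh = '2007-07-13T11:27:46',
    @sessions   = @sessions OUTPUT;

SELECT @sessions AS sessionsInserted;

ROLLBACK TRANSACTION;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;

If this run is as slow as the application, the plan it shows is probably the one worth studying; if it is still fast, the difference more likely lies in how the application calls the procedure.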
After reviewing it, though, I am not so sure the problem is your query, as it looks pretty straightforward. It may have more to do with the number of times you are calling it. You may want to figure out a way to pass all of your workers' data into the query so that you run it once instead of workers.count times. HTH