Search code examples
sql-servercachingwindows-server-2012nosql

10 millions updates per day for stats - What DB / Cache system should I use?


I'm currently facing a performance issue on my website. The situation can be summarized by the following:

  • It's a .Net Mvc website using SQL Server 2012 Express. It's hosted on Windows Server 2012.
  • There are around 10 millions UPDATE queries per day for the core system (5 million on 2 different tables) and another 10 million for custom statistics purpose (again, 2*5 million).
  • There are far fewer INSERT queries.
  • The MVC website has no performance issue, most of the requests are handled by 1 method returning plain Content (not html).

Since statistics are not as important as the core system, and I see SQL Server struggling a lot, I thought it may be good to move these statistics tables somewhere else.

The main question is: What is the best way to handle Stats where updates are predominant? The idea is also to keep only one server.

I tried to have a look of what can be achieved to improve the situation:

  • Having a separate SQL Server database for stats on another hard disk? Maybe SQL Server can breathe better, but I'm not sure.
  • Using a NoSQL database? I just have a minor experience with MongoDb (but not for millions of requests) and I'm tempted to try RavenDB.
  • Using a cache system? Redis looks great but I'm not sure that it's a good idea to run it on Windows. Is AppFabric a viable option?

Any relevant ideas for the current situation will be appreciated.
Thank you

Here are more info about a statistic table I have:

TABLE [dbo].[UserStat](
[Id] [int] IDENTITY(1,1) NOT NULL,
[UserId] [int] NOT NULL,
[Hits] [int] NOT NULL,
[Points] [decimal](38, 6) NOT NULL,
[Date] [datetime] NOT NULL,
[LastHit] [datetime] NOT NULL,
[Ip] [varchar](256) NULL,
)

I update the stat like this:

UPDATE [UserStat] SET Hits = Hits + 1, Points = Points + @Points, LastHit = @Last WHERE UserId = @Id AND [Ip] = @Ip AND [Date] = @Date

If the row for the current user and date does not exists, I create it:

INSERT INTO [UserStat] ([UserId],[Hits],[Points],[Date],[LastHit],[Ip]) VALUES (@UserId,@Hits,@Points,@Date,@LastHit,@Ip)

There are two indices:

  • 1 for the primary key
  • 1 for getting and aggregating the stats of an user

    INDEX [Select_UpdateUserStatNavigation] ON [dbo].[UserStat](
    [UserId] ASC,[Date] ASC) INCLUDE ([Id],[Hits],[Points], [LastHit],[Ip])
    

The server is a VPS. Log and data files are on the same disk. There is no foreign key involved on the table.

Here are all the SELECT queries I found that query this table:

SELECT Points, Hits, [Date] FROM [UserStat] WHERE UserId = @UId AND [Date] >= @date

SELECT Ip FROM [UserStat] WHERE UserId = @UId AND LastHit >= DATEADD(MINUTE,-15,getdate())

SELECT COUNT(Id) FROM [UserStat] WHERE [LastHit] >= DATEADD(MINUTE,-15,getdate())

But I'm not really worried about the SELECT, more about the number of UPDATEs ^^.


Solution

  • Can you confirm that ID is your primary key? If so, then that will be fine as it is a monotonically increasing value and good for Inserts. I think your other index (for the purposes of the updates) should be

    INDEX [Select_UpdateUserStatNavigation] ON [dbo].[UserStat](
    [UserId] ASC,[IP] ASC, [Date] ASC). 
    

    Make sure the columns are ordered from the most selective to the least selective in the index. This should speed up the updates in the sense that rows can be located quicker. We can look at indexes for the SELECT later.

    BY VPS, do you mean it is a virtual server? I would take a look at your IO stats to check that IO is not a bottleneck. How much memory is allocated to SQL? That could be another issue. Insufficient memory could result in paging to disk - the slowest part of your IO subsystem.

    I would take a look at splitting your log and data disks onto separate disks if possible. Having them on the same disk can cause disk contention - again on the slowest part of your IO subsystem.

    Can you post the select queries that are used? I can then give suggested indexes if required.

    In addition, you may want to replace your separate insert and update procs with a MERGE such as below.

    MERGE UserStat AS TargetTable
    USING (SELECT @UserId UserID,@Hits Hits,@Points Points,@Date [Date],@LastHit LastHit,@Ip Ip) AS SourceData
    ON SourceData.UserID = TargetTable.UserID 
        AND SourceData.IP = TargetTable.IP 
        AND SourceData.[Date] = TargetTable.[Date])
    WHEN MATCHED THEN UPDATE SET Hits = Hits + 1, Points = Points + SourceData.Points, LastHit = SourceData.LastHit 
    WHEN NOT MATCHED THEN INSERT (UserID,Hits,Points,[Date],LastHit,Ip)
                     VALUES(SourceData.UserID,SourceData.Hits,SourceData.Points,SourceData.[Date],SourceData.LastHit,SourceData.Ip)