sql · logging · database-performance

What is the correct way to structure a high-volume log record?


I have a requirement to log application events in a SQL 2012 database. The basic record structure requirement is pretty simple:

CREATE TABLE [dbo].[EventLog]
(
   [ProcessId] INT NOT NULL,
   [ApplicationId] INT NOT NULL,
   [Created] DateTime NOT NULL,
   CONSTRAINT [PK_EventLog] PRIMARY KEY CLUSTERED ([ProcessId], [ApplicationId], [Created] ASC)
)

The problem I am having is one of performance. Up to 1 million events per day can be generated, and as the number of rows increases, insert performance degrades - to the point where the logger will not be able to keep up with the events.

I am already writing batches of logs out to intermediary plain-text files and then processing these files using a service running separately from the main application logger.
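As a side note on that batching approach: when the service replays those files, loading each batch in a single bulk operation is usually far faster than issuing row-by-row inserts. A minimal sketch, assuming a comma-delimited batch file at a hypothetical path:

```sql
-- Hypothetical file path and delimiters; adjust to match your batch files.
BULK INSERT [dbo].[EventLog]
FROM 'C:\Logs\batch_0001.txt'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR   = '\n',
    TABLOCK                 -- take a table lock; enables minimal logging in many cases
);
```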

I suspect that the culprit may be the cost of maintaining the clustered index, and I would like some advice on how I can approach this problem more efficiently and effectively.

Any advice would be much appreciated.


Solution

  • The main cause of the performance problem is probably the choice of columns forming the clustered index.

    In a clustered index, the data is actually stored in the leaf-level pages of the index, in the order defined by the index key columns. Hence, in your table, the data is stored in the order ProcessID, ApplicationID, Created.

    Without seeing your data, I would assume that log entries are being created as time passes for a variety of ProcessIDs and ApplicationIDs. If this is the case, for every insert, SQL will actually be inserting each log entry at the appropriate point in the middle of your log table. This is more time-consuming for SQL Server to do than inserting records at the end of the table. Also, when an inserted record cannot fit on the appropriate page, a page split will occur which will result in the clustered index being fragmented - which will decrease the performance further.
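    To confirm that page splits and fragmentation are actually occurring, you could inspect the clustered index with the `sys.dm_db_index_physical_stats` DMV (available in SQL Server 2012). A sketch:

    ```sql
    -- Check fragmentation of the indexes on dbo.EventLog
    -- (the clustered index has index_id = 1).
    SELECT index_id,
           avg_fragmentation_in_percent,
           page_count
    FROM sys.dm_db_index_physical_stats(
             DB_ID(),                    -- current database
             OBJECT_ID('dbo.EventLog'),  -- the log table
             NULL, NULL, 'LIMITED');
    ```

    A high `avg_fragmentation_in_percent` on a table that only ever receives inserts is a strong hint that rows are landing mid-table and splitting pages.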

    Ideally, you should aim to have a clustering key that is as small as possible, unique, and ever-increasing, so that new rows are always appended at the end of the table. Therefore one approach would be to create a new ID column as an identity and create the clustered index on that. For example:

    CREATE TABLE [dbo].[EventLog]
    (
      [EventLogId] INT IDENTITY(1,1),
      [ProcessId] INT NOT NULL,
      [ApplicationId] INT NOT NULL,
      [Created] DateTime NOT NULL,
      CONSTRAINT [PK_EventLog] PRIMARY KEY CLUSTERED ([EventLogId])
    )
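    If you still need to query the log by process, application, or time range, the original columns can be covered by a nonclustered index instead; it adds some insert overhead, so only create it if those queries matter to you. A sketch:

    ```sql
    -- Supports lookups such as
    --   WHERE ProcessId = @p AND ApplicationId = @a AND Created >= @from
    -- without affecting the insert-friendly clustered key on EventLogId.
    CREATE NONCLUSTERED INDEX [IX_EventLog_Process_Application_Created]
        ON [dbo].[EventLog] ([ProcessId], [ApplicationId], [Created]);
    ```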