Tags: sql, sql-server, performance, data-storage

Performance consideration: Spread rows in multiple tables vs concentrate all rows in one table



Hi.

I need to log information about every step that goes on in the application to a SQL database. There are certain tables I want the log to be related to: Product (log when a product has been created, changed, etc.), Order (same as above), Shipping, and so on.

The data will need to be retrieved often.

I have a few ideas on how to do it:

  1. Have one log table that contains columns for all these tables; then when I want to present data in the UI for a certain product, I would do `SELECT * FROM Log WHERE LogId = Product.ProductId`. I know it might look odd to have that many columns, but I have a feeling performance will be better. On the other hand, there will be a huge number of rows in this table.
  2. Have a separate log table for each log type (ProductLogs, OrderLogs, etc.). I really don't like this idea, since it isn't consistent and having many tables with the same structure doesn't make sense, but (?) it might be quicker to search a table that has fewer rows (am I wrong?).
  3. Following on from option 1, I could add a second, many-to-one table with LogId, TableName, and RowId columns that relates one log row to many table rows in the DB, and then use a UDF to retrieve the data (e.g., log 234 belongs to table Customer at CustomerId 345 and to table Product where ProductId = RowId). I think this is the nicest way to do it, but again, there might be a huge number of rows; will that slow down the search, or is this how it should be done? What do you say?

Example of No. 3 in the above list:

CREATE TABLE [dbo].[Log](
    [LogId] [int] IDENTITY(1,1) NOT NULL,
    [UserId] [int] NULL,
    [Description] [varchar](1024) NOT NULL,
 CONSTRAINT [PK_Log] PRIMARY KEY CLUSTERED 
(
    [LogId] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

GO
-- The foreign key on UserId should point at the users table,
-- not at a TableId column.
ALTER TABLE [dbo].[Log]  WITH CHECK ADD  CONSTRAINT [FK_Log_User] FOREIGN KEY([UserId])
REFERENCES [dbo].[User] ([UserId])
GO
ALTER TABLE [dbo].[Log] CHECK CONSTRAINT [FK_Log_User]
---------------------------------------------------------------------
CREATE TABLE [dbo].[LogReference](
    [LogId] [int] NOT NULL,
    [TableName] [varchar](32) NOT NULL,
    [RowId] [int] NOT NULL,
 CONSTRAINT [PK_LogReference] PRIMARY KEY CLUSTERED 
(
    [LogId] ASC,
    [TableName] ASC,
    [RowId] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

GO
SET ANSI_PADDING OFF
GO
ALTER TABLE [dbo].[LogReference]  WITH CHECK ADD  CONSTRAINT [FK_LogReference_Log] FOREIGN KEY([LogId])
REFERENCES [dbo].[Log] ([LogId])
GO
ALTER TABLE [dbo].[LogReference] CHECK CONSTRAINT [FK_LogReference_Log]
---------------------------------------------------------------------
CREATE FUNCTION dbo.GetLog
(
    @TableName varchar(32),
    @RowId int
)
RETURNS @Log TABLE
(
    LogId int NOT NULL,
    UserId int NULL,  -- Log.UserId is nullable, so this column must allow NULL
    Description varchar(1024) NOT NULL
)
AS
BEGIN
    INSERT INTO @Log
    SELECT L.LogId, L.UserId, L.Description
    FROM [Log] AS L
    INNER JOIN LogReference AS LR
        ON L.LogId = LR.LogId
    WHERE LR.TableName = @TableName
      AND LR.RowId = @RowId;

    RETURN;
END
GO
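To make the retrieval path concrete, the UDF can be queried like any table-valued function. The table name and row id below are illustrative values, not from the question:

```sql
-- Fetch all log entries recorded against Product row 345
-- ('Product' and 345 are example values)
SELECT LogId, UserId, Description
FROM dbo.GetLog('Product', 345)
ORDER BY LogId;
```

Note that GetLog filters on TableName and RowId while the primary key of LogReference leads with LogId, so a nonclustered index on (TableName, RowId) would likely help this lookup.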

Solution

  • I would definitely go for option 3, for several reasons:

    Data should be in the fields of a table, not as a table name (option 2) or a field name (option 1). That way the database gets easier to work with and easier to maintain.

    Narrower tables generally perform better. The number of rows has less impact on performance than the number of fields.

    If you have a field for each table (option 1), you are likely to get a lot of empty fields when only a few of the tables are affected by an operation.
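
A key strength of option 3 is that one log entry can be linked to every row an operation touched. A minimal sketch of that write path, assuming the schema from the question (the user id, descriptions, and row ids here are made-up example values):

```sql
-- Illustrative write path for option 3: one Log row linked to
-- several affected rows, kept atomic with a transaction.
BEGIN TRANSACTION;

DECLARE @LogId int;

INSERT INTO [Log] (UserId, Description)
VALUES (12, 'Order 345 shipped; product stock updated');

-- Capture the identity value of the Log row just inserted
SET @LogId = SCOPE_IDENTITY();

-- One LogReference row per affected table row
INSERT INTO LogReference (LogId, TableName, RowId)
VALUES (@LogId, 'Order',   345),
       (@LogId, 'Product', 88);

COMMIT TRANSACTION;
```

Because the references live in rows rather than columns, an operation that touches two tables inserts two LogReference rows and leaves nothing empty, unlike option 1.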