Search code examples
sqlsql-serverdatabasedata-warehousefact-table

Is Id column required / recommended in fact table in the given scenario


I am using a fact table with the following structure in SQL Server 2012:

CREATE TABLE [dbo].[factTable] (
    [Id]            BIGINT      IDENTITY (1, 1) NOT NULL,
    [Date]          DATE        NOT NULL,
    [MinuteNumber]  SMALLINT    NOT NULL,
    [CityId]        INT         NOT NULL, /* Foreign key to dimCity */
    [Value]         DECIMAL(12, 4)  NULL
)

I have a clustered index on the Date column with a fill factor of 100. The data inserted into this table is almost always in the ascending order of Date and MinuteNumber.

  1. I want to know - if having the Id column is necessary in the given scenario? Does it have any performance implications? Or can I safely eliminate it.

  2. I also want to know if having clustered index on Date column is sufficient (there will be many records with the same date, even same date and same minute-number) or is it better to have a clustered index combining multiple columns; and what are the performance and storage implications for either approach?

I am new to this and any help will be highly appreciated.


Solution

  • In your case, I'd probably create a nonclustered primary key on the identity column, to allow for easier FK relationship management and for performance.

    The clustered key would be on the date column, to allow for faster range queries. The date column also fulfills the three basic requirements for a clustered index: it's narrow (to make nonclustered indexes smaller), it's stable (because a change on a CI column means reshuffling the NC indexes as well, this is to be avoided) and it's increasing (to avoid bad page splits, the ones not at the end of the table).

    WRT non-unique clustered index, SQL Server will add a uniquifier data to it if it's not unique.