sql sql-server database database-design data-warehouse

How do I add new dimensions to a fact table without having to rebuild the table?

I'm new to data warehousing so apologies if this is super basic, but I'm curious about this concept.

Example: say I have a table that stores aggregated analytics for a website, (e.g. the total pageviews for a url on a given date).

dbo.PageFacts

Date	Url_Id	Page_Views
2020-01-01	1	280
2020-01-01	2	50
2020-01-02	3	10

Now let's say I'm tasked with adding a new "Device_Id" column to this table (foreign keyed to its own dimension table), and now the table is split out like this...

Date	Url_Id	Device_Id	Page_Views
2020-01-01	1	101	180
2020-01-01	1	102	100
2020-01-01	2	101	50
2020-01-02	3	101	10

How do most people go about doing this for a fact table with millions of records? I'm assuming you don't want to rebuild it every time new dimensions are added in like, especially when it requires updating the aggregated value.

I was thinking of creating a surrogate key on the fact table and then creating a separate dimension table with percent breakdowns (e.g. 0.75, 0.25, etc) and then just building a VIEW that joins them together and computes the new values? Something like this...

CREATE TABLE PageFacts (
    PageSurKey INT PRIMARY KEY,
    Date DATE,
    Url_Id INT,
    Page_Views INT
);
CREATE TABLE Device_Pct (
    Id INT PRIMARY KEY,
    PageSurKey INT FOREIGN KEY REFERENCES PageFacts(PageSurKey),
    Device_Id INT,
    Percentage FLOAT
);


CREATE VIEW Device_PageFacts AS
SELECT
    pf.Date,
    pf.Url_Id,
    d.Device_Id,
    SUM(pf.Page_Views * d.Percentage) as Page_Views
FROM PageFacts pf
JOIN Device_Pct d on d.PageSurkey = pf.PageSurkey
GROUP BY pf.Date, pf.Url_Id, d.Device_Id;

This seems like the way to go (since adding in new dimensions which further slice the data only requires knowing the ratio to split by), but I don't know if there's a better practice. Any insight would help. Thanks in advance.

Solution

Normally you will rebuild the fact table if you change its grain. That's a pretty significant design change to the dimensional model, and I don't see any benefit to complicating the schema to avoid rebuilding the fact table.

If you want, you could introduce this as an additional fact table, but it wouldn't have "percentages" just a normal fact table:

CREATE TABLE PageDeviceFacts 
(
    Date DATE references DimDate,
    Url_Id INT references DimUrl,
    Device_ID INT references DimDevice,
    Page_Views INT,
    constraint pk_PageDeviceFacts
      primary key (Date,Url_Id,Device_ID)
);

And if it's a big table, make the PK non-clustered and create a clustered columnstore index

create clustered columnstore index cci_PageDeviceFacts on PageDeviceFacts

And the old PageFacts table becomes an aggregate fact in the model.