Our team is planning to add missing dimension keys (FK's) to our fact tables to allow for direct querying between our dimensions and fact tables rather than going through tables. Essentially, leaning more towards a star schema. This will improve querying performance (as less joins will be required).
I have read online that Foreign Key (FK's) constraints will impact performance of insert, updates and deletes.
Are there other things that we should consider?
The main disadvantage is the increase in size for the fact table records. You don't specify the sizes of the rows, but adding new foreign keys could significantly impact the size of the row.
In most databases, increased row size in the fact table would slow down queries. However, that would be less of an issue in a columnar database.
In fact, the increased size could outweigh the advantages of avoiding joins.
In general, I would not worry about joins on small- to medium- sized dimension tables that are on the primary key. That is, do a lot of testing on full-scale data before making such radical changes.
The affect on insert
s is simply that the keys need to be validated. However, this is checking against the primary keys of reference tables, so those checks are pretty fast. More are obviously more expensive than fewer, but the overhead will probably not be a show-stopper.
One thing that could be an advantage or disadvantage is that the dimensions are static at the time of input. Updating a fact table is generally quite expensive. However, dimensions do change. For instance, new countries are occasionally created. How is this represented in the historical data?
In a snowflake schema, your lowest level of geography would (probably) be more refined than a country, and only the dimensions need to be updated. And, if the lowest level needs to be updated in the fact table, that would affect relatively few rows.