Search code examples
sqlmodelingstar-schemafact-table

What can I improve in this dimensional model?


enter image description here

Hi, I am new to Kimball Dimensional Modeling. I'm wondering if someone could give me some feedback as to what I can improve in this star schema model? It's a model with a grain of one row per payment, per customer. It also has a customer dimension set up as a slowly-changing dimension type 2. Thanks!


Solution

  • Overall, it's a good start and it might work just fine, depending on your analytical needs. The only serious issue I see is that you are missing calendar dimensions in your model. At least two: one for Due Date and one for Payment date. These dates should be FK in the fact table.

    Less importantly, I would also consider moving some numeric attributes from Loan dimension into fact tables. For example, Downpayment amount could belong to Customer Payments table. To differentiate payments of different types, I would add a dimension "Payment Type", where you can have values like "Downpayment", "Installment Payment", "Pre-Payment", etc. One of the benefits of such design is that summarizing all relevant customer payments is easy.

    Sometimes it's hard to decide if a numeric value should be part of a fact table or a dimension, or both. A good decision-making rule is to think how you intend to use the number. If you need it for calculations (i.e, "show me all downpayments", or "calculate total amount a customer paid to the loan, including downpayment"), or time-series charting, then it belongs to a fact table. If you need it for filtering (i.e, "show me customer payments by the size of their downpayment"), then it should be a dimensional attribute.

    In your case, I think it's unlikely that you will filter by downpayment size, or interest rate, or specific ratio. Therefore, these numbers belong to fact tables - they are quantitative facts, not filters.

    Instead of keeping specific numbers in the dimensions, designers often group them and create dimensional attributes from the groups. For example, you can create an attribute "Downpayment Level", with values "0 - 20%", "20 - 50%", ">50%". Such attribute will be useful for classification analysis. On the other hand, a specific downpayment amount should belong to a fact table.

    Why it matters: if you model a number in a wrong place, you will find yourself writing increasingly complicated code for calculations. Correct designs lead to simple and elegant queries. So, if at some point you find that your queries are getting too complicated, it often is a symptom that a fact was incorrectly modeled as an attribute.