Search code examples
database-designbusiness-intelligencestar-schemamulti-databasedatamart

Difficulty modeling Star schema


I am creating a fact table which will measure the amount of resources related to schools:

Amount of Schools
Amount of Students
Amount of Employees
Amount of computers (for administrative use and students use)

FACT_SCHOOL_RESOURCES
    School (FK)
    Location (FK)
    Education Type (FK)
    Total_Schools (every line has value 1)
    Total_Students
    Total_Employees
    Total_Students_Computers
    Total_Administration_Computers

There are some infrastructure issues that are related to schools, which I can't actually put as one of its attributes (an attribute named infrastructure) because there are more than one type (Energy Source, Water Source, Waste Disposal...) and I can't use each type as an attribute, because for each type, I have a many-to-many relationship with my school dimension. For example:

Power Source
    Solar Panel
    Generator
    City Energy Supply
    ...   
Waste Disposal
    Landfill
    Incineration
    Recycling
    Composting
    Waste to Energy
    ...

Using these examples, a school can have both Landfill and Recycling as its waste disposal or make use of the City Energy Supply at the same time as a Generator or Solar Panels.

All of this is necessary because studies show that some of these issues (and many others) are related to teaching quality and there should be a way to analyse it all.

How do I solve this on a star schema?


Solution

  • I think you are looking for something that goes beyond star schema. If you want to do some multi-variant analysis on which factors are most strongly correlated with some metric (e.g. teaching quality) then you might need to consider a cube.

    Treat the presence of each of the types of power source and waste disposal as an attribute. Perhaps you can start with something as simple as binary flags for each factor. You can then test for correlation between your metric and your factors with different slices in the cube.