Search code examples
postgresqldatabase-designrelational-databaseyesod

Structuring a database


In my database I need the following relations:

  • Tournament
  • Tournament Participant (Tpart)
    • relates a User to a tournament
  • Round
    • This is a single match
  • Hole
    • relates a Tpart to a round
    • also holds a score

This is my current Persistent Entities:

Tournament
  name Text
  urlName Text
  location GolfCourseId
  startDate Day 
  endDate Day Maybe
  UniqueTName name
  UniqueTUrlName urlName

Tpart
  tournament TournamentId
  userId UserId  
  deriving Show
  deriving Eq

Round
  tourn TournamentId
  name Text
  UniqueRound tourn name
  deriving Show

Hole
  round RoundId
  part TpartId
  score Int 
  deriving Show

I don't know if this is the best structure given the kind of queries I need to do. I need to

  • Get the total score for a round for each Tpart
    This would be done by summing up the score of all Holes related to a specific round and Tpart

    Part | round 1 | round 2 | ...  
    p1   | 56      | 54  
    p2   | 60      | 57  
    
  • Get all the holes and tparts that relate to a round

    Part | hole 1 | hole 2| ...  
    p1   | 3      | 5  
    p2   | 5      | 6  
    

To get the data on the first table it would require summing all the hole scores for each user. Is this an efficient method? Or would it be better to have another entity RndScore, like this:

RndScore
  rnd RoundId
  tpart TpartId
  score Int

This entity could be updated every time a hole entity is updated. Either of those solutions seem rather robust though.


Solution

  • My advice is: You should always start with a clean, normalized logical relational database design without storing redundant data, and trust that the DBMS will derive your data (i.e., answer your queries) well enough. That is what a DBMS is there for. The next step should be to optimize your physical database design, e.g., choose your indexes, your table storage parameters, etc. Depending on your database, your can even materialize your views, so that their results are stored physically etc. Actually adding derived values in your logical database design (such as your your RndScore relation) should be the last resort, as you will have to ensure their consistency manually.

    In general, you should avoid pre-mature optimizations: Ensure that you actually need to optimize your database layout (e.g., by measuring runtimes, checking query execution plans, making estimations about the number of queries you will have to answer, etc.)