I am in the process of planning my DB in MySQL for the following scenario -
I am trying to build a play by play system such as NBA where, events such as
Play event[id, play_type_id, play_outcome_id, points]
Substitution event[id, player1_in_id, player2_out_id]
Foul event[id, foul_type_id, player_id]
In the main play_by_play scorecard a particular event should be linked with a particular time and match
Play_by_Play[id, match_id, time_id, play_event_id, substitution_event_id, foul_event_id]
However assuming at one particular minute only one of the three events takes place- only one event record out of three will have id, and others will be either 0 or NULL. I am left with the question is this good Normalisation for this design or is there any better way to do it ?
I do not think there is only one answer and it is highly dependent on how you use your data.
One way is to have Play_by_Play
table include a single reference to an event and thus avoiding the overhead of NULLs and 0s.
Play_by_Play[id, match_id, time_id, event_id]
You can define a generic event
Event_type [event_type_id, type_name]
Generic_Event[event_id, event_type_id]
and all the other event types can be 1:1 extensions for this generic event type:
Play event[id, event_id, play_type_id, play_outcome_id, points]
Substitution event[id, event_id, player1_in_id, player2_out_id]
Foul event[id, event_id, foul_type_id, player_id]
This design favors normalization and allows to quickly select information about the events regardless of their type (your design requires multiple joins for this).
However, aggregating information will require more joins (to the tables actually holding relevant data for events) and also Generic_Event
might grow very much with potential effects upon performance.
Your way favors faster aggregation of information, as you can query only Play_by_Play
table. E.g.: get number of substitutions and number of fouls within a time frame.
It might also more space and query execution efficient, as you use less JOIN
s and do not have a big event table.
It is unclear what is your actual database engine (you tagged both MySql and SQL Server), but for SQL Server there is a feature that helps you optimize space for tables with lots of NULL values: sparse columns.