I often find myself questioning whether I'm taking the right approach in trying to plan for future expansibility when creating databases and relations.
I have the following situation:
I have a Donor
table and a Recipient
table. Both tables share common information such as first_name
, last_name
, email_address
, date_of_birth
, etc. Both seem to, if you'll excuse my object-oriented language, share a common abstract type of Person
. It's possible that someone who is at one point a Recipient
may later become a Donor
by means of giving a donation, so it's important that information isn't duplicated across tables. Should I opt for an inheritance pattern, or should I just foreign key Donor
s and Recipient
s to a Person
table?
Initially, I was thinking of simply mapping properties like email_address
and street address properties directly into the things that need them, but then the possibility may arise that a person would have multiple email addresses or mailing addresses (ie: home, work, etc.). What that means is that we have a model somewhat like this:
create table person(id int primary key auto increment, ...,
default_email_address);
create table email_address(id int primary key auto increment,
email varchar(255), name varchar(255), is_default bool, person_id int);
This makes things a bit complicated, as you can imagine. The name
field also involves a list of default values as well as allowing custom input. I can't just make it an enum field, because the possibility exists that someone will have a lot of emails to add that could all be different... (this is the point at which I scream out "IS IT EVEN WORTH IT ANYMORE!?!?" and get frustrated with the project)
I guess what this really boils down to is the following: at what point does data normalization become ludicrous? My goal here is to create a really good as-forward-compatible-as-possible data model that I won't kick myself for creating later.
at what point does data normalization become ludicrous?
At the point that it stops modelling the actual requirements.
To take your examples:
With the Donor
and Recipient
tables, if it is highly likely that any one person will become both, then it does make sense to separate out to a Person
entity. If this is rare, it doesn't.
With the email_address
and street_address
situations, it depends whether you do need to store multiples or not (what is the expectation?). You may want to store separate versions per business unit (say shipping_address
vs billing_address
).