Search code examples
mysqlredundancy

Redundant data or two queries?


Take, for instance, a table that holds credit card charges, a table that holds credit cards, and a table that holds users.

Each charge is associated with exactly one card, and each card is associated with exactly one user. A user may have multiple cards on file.

If I were to hold this data in a MySQL database in three distinct tables, like so:

charges:

---------------------------------------------
id | card | amount | description | datestamp
---------------------------------------------
5  | 2    | 50.00  | Example     | 1369429422

cards:

------------------------------------------------------------------
id | user | name       | number           | cvv2 | exp_mm | exp_yy
------------------------------------------------------------------
2  | 1    | Joe Schmoe | 4321432143214321 | 555  | 1      | 16

users:

-------------------------------------------
id | first_name | last_name | email
-------------------------------------------
1  | Joe        | Schmoe    | joe@schmoe.co

Now, let's say that I want to access a user, given a charge. In order to get to the user, I would first have to look up the card associated with the charge, and then look up the user associated with the card. Obviously, in an example like this, speed would be negligible. But in other scenarios, I see this as being two queries.

However, if I stored the data like so:

charges

----------------------------------------------------
id | card | user | amount | description | datestamp
----------------------------------------------------
5  | 2    | 1    | 50.00  | Example     | 1369429422

then the charge would be associated directly with a user. That, however, is redundant information, since that same data is stored in the cards table.

Thoughts?


Solution

  • Your instinct not to include the user information in the charges table is correct; however, it's still only one query:

    select first_name, last_name, email
    from users, cards, charges
    where users.id = cards.user
    and cards.id = charges.card
    and charges.id = 5;
    

    That would give you the user info for the charge with id 5. This is the exact thing that relational databases are best at :) This kind of thing is called a "join" because it joins multiple tables together to give you the information you need. There are multiple ways to write this query.

    As an aside, perhaps this is a contrived example, but if this is an application you are writing from scratch, there are lots of reasons to avoid storing credit cards in your own database. Often a payment processor can handle the details for you while still allowing you to charge credit cards. More info.