Search code examples
mysqlsqldatabase-normalizationjoin

Is this good Database Normalization?


I am a beginner at using mysql and I am trying to learn the best practices. I have setup a similar structure as seen below.

(main table that contains all unique entries) TABLE = 'main_content'

+------------+---------------+------------------------------+-----------+
| content_id |  (deleted)    | title                        | member_id | 
+------------+---------------+------------------------------+-----------+
|          6 |               | This is a very spe?cal t|_st |      1    |
+------------+---------------+------------------------------+-----------+ 

(Provides the total of each difficulty and joins id --> actual name) TABLE = 'difficulty'

+---------------+-------------------+------------------+
| difficulty_id | difficulty_name   | difficulty_total |
+---------------+-------------------+------------------+
|             1 | Absolute Beginner |                1 |
|             2 | Beginner          |                1 | 
|             3 | Intermediate      |                0 |
|             4 | Advanced          |                0 |
|             5 | Expert            |                0 |
+---------------+-------------------+------------------+

(This table ensures that multiple values can be inserted for each entry. For example, this specific entry indicates that there are 2 difficulties associated with the submission) TABLE = 'lookup_difficulty'

+------------+---------------+
| content_id | difficulty_id |
+------------+---------------+  
|          6 |             1 |
|          6 |             2 |
+------------+---------------+

I am joining all of this into a readable query:

SELECT group_concat(difficulty.difficulty_name) as difficulty, member.member_name
FROM main_content
INNER JOIN difficulty ON difficulty.difficulty_id 
IN (SELECT difficulty_id FROM main_content, lookup_difficulty WHERE lookup_difficulty.content_id = main_content.content_id )
INNER JOIN member ON member.member_id = main_content.member_id

The above works fine, but I am wondering if this is good practice. I practically followed the structure laid out Wikipedia's Database Normalization example.

When I run the above query using EXPLAIN, it says: 'Using where; Using join buffer' and also that I am using 2 DEPENDENT SUBQUERY (s) . I don't see any way to NOT use sub-queries to achieve the same affect, but then again I'm a noob so perhaps there is a better way....


Solution

  • If the lookup_difficulty provides a link between content and difficulty I would suggest you take out the difficulty_id column from your main_content table. Since you can have multiple lookups for each content_id, you would need some extra business logic to determine which difficulty_id to put in your main_content table (or multiple entries in the main_content table for each difficulty_id, but that goes against normalization practices). For ex. the biggest value / smallest value / random value. In either case, it does not make much sense.

    Other than that the table looks fine.


    Update

    Saw you updated the table :)

    Just as a side-note. Using IN can slow down your query (IN can cause a table-scan). In any case, it used to be that way, but I'm sure that these days the SQL compiler optimizes it pretty well.