Search code examples
foreign-keysrelationshipnormalizationentity-relationshiptable-relationships

Whats the correct normalization of a relationship of three tables?


I have three tables:

  • teachers
  • classes
  • courses

The sentences is:

  • A teacher may teachs one or more courses.
  • A teacher may teachs one or more classes.
  • A teacher teachs a course for a class.

So I need a fourth table with PRIMARY KEY of each of the three tables composing the PRIMARY KEY or UNIQUE INDEX of the fourth table.

What is the correct normalization for this?

  1. The name of the table: "class_course_teacher" should be ok or I need to use a name like "assignments" for this?
  2. The primary key of the table: "class_id + course_id + teacher_id" should be ok or I need to create "id" for assignments table and this should have "class_id + course_id + teacher_id" as unique index?

Solution

  • Normalization starts with functional dependencies. It's a method of breaking a set of information into elementary facts without losing information. Think of it as logical refactoring.

    Your sentences are a start but not sufficient to determine the dependencies. Do courses have one or more teachers? Do classes have one or more teachers? Do courses have one or more classes? Do classes belong to one or more courses? Can teachers "teach" courses without classes (i.e. do you want to record which teachers can teach a course before any classes are assigned)? Do you want to define classes or courses before assigning teachers?

    Your two questions don't relate to normalization. assignments is a decent name, provided you won't be recording other assignments (like homework assignments), in which case teacher_assignments or class_assignments might be better. class_course_teacher could imply that there can only be one relationship involving those three entity sets, but it can and does happen that different relationships involve the same entity sets.

    I advise against using surrogate ids until you have a reason to use them. They increase the number of columns and indices required without adding useful information, and can increase the number of tables that need to be joined in a query (since you need to "dereference" the assignment_id to get to the class_id, course_id and teacher_id) unless you record redundant data (which has its own problems).