Search code examples
sqldjangosqlitedatabase-designforeign-key-relationship

Best database design for multiple entity types


I'm working on a web app and I have to design it's database. There's a part that didn't come very straightforward to me, so after some thinking and research I came with multiple ideas. Still neither seems completely suitable, so I'm not sure which one to implement and why.

The simplified problem looks as follows: I have a table Teacher. There are 2 types of teachers, according to the relations with their Fields and Subjects:

  1. A Teacher that's related to a Field, the Field is obligatory related to a Category
  2. A Teacher that's not related to a Field, but directly to a Category

My initial idea was to have two nullable foreign keys, one to the table Field, and the other to the table Category. But in this case, how can I make sure that exactly one is null, and the other one is not?

The other idea is to create a hierarchy, with two types of Teacher tables derived from the table Teacher (is-a relation), but I couldn't find any useful tutorial on this.

I'm developing the app using Django with SQLite db


Solution

  • OK, your comment made it much clearer:

    If a Teacher belongs to exactly one category, you should keep this in the Teacher's table directly:

    Secondly each teacher belongs to "one or zero" fields. If this is sure for ever you should use a nullable FieldID column. This is set or remains empty.

    Category (CategoryID, Name, ...)
    Field (FieldID,Name,...)
    Teacher (TeacherID,FieldID [NULL FK],CategoryID [NOT NULL FK], FirstName, Lastname, ...)
    

    Remark: This is almost the same as my mapping table of the last answer. The only difference is, that you'll have a strict limitation with your "exactly one" or "exactly none or one"... From my experience I'd still prefer the open approach. It is easy to enforce your rules with unique indexes including the TeacherID-column. Sooner or later you'll probably have to re-structure this...

    As you continue, one category is related to "zero or more" fields. There are two approaches:

    Add a CategoryID-column to the Field-table (NOT NULL FK). This way you define a field several times with differing CategoryIDs (combined unique index!). A category's fields list you'll get simply by asking the Field-table for all fields with the given CategoryID.

    Better in my eyes was a mapping table CategoryField. If you enforce a unique FieldID you'll get for sure, that no field is mapped twice. And add a unique index on the combination of CategoryID and FieldID...

    A SELECT could be something like this (SQL Server Syntax, untested):

    SELECT Teacher.TeacherID
          ,Teacher.FieldID --might be NULL
          ,Teacher.CategoryID --never NULL
          ,Teacher.[... Other columns ...]
          ,Field.Name --might be NULL
    
          --The following columns you pick from the right source, 
          --depending on the return value of the LEFT JOIN to Field and the related "catField"
          --the directly joined "Category" (which is never NULL) is the "default"
          ,ISNULL(catField.CategoryID,Category.CategoryID) AS ResolvedCategoryID
          ,ISNULL(catField.Name,Category.Name) AS ResolvedCategoryName
          ,[... Other columns ...]
    
    FROM Teacher
        INNER JOIN Category ON Teacher.CategoryID=Category.CategoryID --never NULL
        LEFT JOIN Field ON Teacher.FieldID=Field.FieldID --might be NULL
            LEFT JOIN Category AS catField ON Field.CategoryID=catField.CategoryID
    

    This was the answer before the EDIT: I try to help you even if the concept is not absolutely clear to me

    Teacher-Table: TeacherID, person's data (name, address...), ...
    Category-Table: CategoryID, category title, ... 
    Field-Tabls: FieldID, field title, ...
    

    You say, that fields are bound to a category in all cases. If this is the same category in all cases, you should set the category as a FK-column in the Field-Table. If there is the slightest chance, that a field's category could differ with the context, you should not...

    Same with teachers: If a teacher is ever bound to one single category set a FK-column within the Teacher-table, otherwise don't.

    The most flexible you'll be with at least one mapping table:

    (SQL Server Syntax)

    CREATE TABLE TeacherFieldCategory
    (
    --A primary key to identify this row. This is not needed actually, but it will serve as clustered key index as a lookup index...
     TeacherFieldCategoryID INT IDENTITY NOT NULL CONSTRAINT PK_TeacherFieldCategory PRIMARY KEY
    --Must be set
    ,TeacherID INT NOT NULL CONSTRAINT FK_TeacherFieldCategory_TeacherID FOREIGN KEY REFERENCES Teacher(TeacherID)
    --Field may be left NULL
    ,FieldID INT NULL CONSTRAINT FK_TeacherFieldCategory_FieldID FOREIGN KEY REFERENCES Field(FieldID)
    --Must be set. This makes sure, that a teacher ever has a category and - if the field is set - the field will have a category
    ,CategoryID INT NOT NULL CONSTRAINT FK_TeacherFieldCategory_CategoryID FOREING KEY REFERENCES Category(CategoryID)
    );
    --This unique index will ensure, that each combination will exist only once.
    CREATE UNIQUE INDEX IX_TeacherFieldCategory_UniqueCombination ON TeacherFieldCategory(TeacherID,FieldID,CategoryID); 
    

    It could be a better concept to have a mapping table FieldCategory and this table mapped to the mapping table above through a foreign key. Doing so you could avoid invalid field-category combinations.

    Hope this helps...