Search code examples
arrayspostgresqldatabase-designdenormalization

what is difference between creating another table and using arrays in postgres


In my university, I was given a task to create a simple university database with some tables like student, departments and etc. There was an interesting moment when I did relations between students and classes, one student may choose multiple classes, I was taught to create the third table with two FKs, and it should look like this Adam(id - 1) takes math course(id - 5) and in the third table, the record would be (1, 5), and here is a question why should we prefer third table instead of Arrays, for me it looks much easier to hold student's classes as an additional column in the student table. Here is an example, imagine student table (id, name, age, arrayOfClassesID) (1, Adam, 20, [1,8,9,6,7]).

P.S. This is not my homework, I have already done it, but it is really interesting for me


Solution

  • From a theoretical point of view, a problem is that such a design violates the first normal form. This has practical repercussions:

    • As a_horse_without_name commented, the number one problem is that you cannot have foreign key constraints on array elements.

    • Indexing for fast searches only works in a limited fashion using GIN indexes and the @> operator. Searches with LIKE or > cannot be optimized.

    • Searching all students for a class will be less efficient, even with a GIN index, and the query will be more complicated and less intuitive.

    • If you want to remove a class for the student, you'll have to rewrite the whole array. This is maybe no big problem here, but with longer arrays it can hurt.

    Using a mapping table is just the natural way to do this in a relational database.