Search code examples
sqldatabaseschemarelational-algebra

Is a separate database table required if it's in a many-to-many relationship and only has a single column?


Say I have a system that catalogues books and I want to group them into genres and there is a UNF tuple:

BookId (PK): number
BookTitle: string
Blurb: string
Genres: string[]
(... everything else)

When normalised, Genre needs to be made atomic, and also needs to be put into it's own relation in order to no cause duplication. My first instinct is to do the following:

BOOK
BookId (PK): number
BookTitle: string
Blurb: string

GENRE
GenreId (PK): number
GenreName: string

BOOKGENRE
BookId (CK): number
GenreId (CK): number

But another idea occurred to me but something felt.. wrong(?) about it, and like it was bad practise. But my idea was to get rid of the GENRE table by simply making the GenreName field the primary key and only making use of BOOKGENRE. Since GenreName would still be indexed as it's part of a composite key it would still be indexed and efficient to perform operations on, such as getting all books with Genre "X".

BOOK
BookId (PK): number
BookTitle: string
Blurb: string

BOOKGENRE
BookId (CK): number
GenreName (CK): string

Is this acceptable practice or is it required that I keep genre as a separate table with a key?


Solution

  • I would recommend sticking to the original design with three separate tables, and an integer primary key. One reason is that you want keys that are meaningless, ie that have no business signification.

    With the alternative design that you are thinking about, what if you want to rename a genre, say from Sci-Fi to Science Fiction? You would need to update every row in the bookgenre table, which is unecessarily complex.

    Primary keys should never be updated, so if there is a risk that you might need to ever change the value of a genre, you should go with surrogate keys: this will make your life easier on the long run.