Search code examples
sqldatabase-designmany-to-manyrelationship

Database design- Correct usage of many-to-many relationships between three tables?


Context

I have three tables, Users, Subreddits, and Keywords. The idea is that users can monitor multiple subreddits, and subreddits can keep track of multiple keywords.

Since one user can monitor multiple subreddits, and one subreddit can have multiple users monitoring it, I used a many-to-many relationship between Users and Subreddits.

On the same note, since one subreddit can keep track of multiple keywords, and one keyword can be tracked by multiple subreddits, I used a many-to-many relationship between Subreddits and Keywords.

Implementation

The following is how I implemented each table, and their many-to-many-relationships:

Users

users_subreddits = db.Table('users_subreddits',
    db.Column('user_id', db.Integer, db.ForeignKey('users.id', ondelete='CASCADE')),
    db.Column('subreddit_id', db.Integer, db.ForeignKey('subreddits.id',  ondelete='CASCADE'))
)

class User(db.Model, JsonSerializer):
    __tablename__ = 'users'

    id = db.Column(db.Integer, primary_key=True)
    username = db.Column(db.String(64), index=True, unique=True)
    email = db.Column(db.String(128), index=True, unique=True)
    password_hash = db.Column(db.String(128))
    phone_num = db.Column(db.String(64), index=True, unique=True)
    received_posts = db.Column(db.String(128), index=True, unique=True)

    subreddits = db.relationship('Subreddit', secondary=users_subreddits, backref='users', lazy='dynamic')

    //...

Subreddits

subreddits_keywords = db.Table('subreddits_keywords', db.Model.metadata,
    db.Column('subreddit_id', db.Integer, db.ForeignKey('subreddits.id', ondelete='CASCADE')),
    db.Column('keyword_id', db.Integer, db.ForeignKey('keywords.id',  ondelete='CASCADE')),
)

class Subreddit(db.Model, JsonSerializer):
    __tablename__ = 'subreddits'

    id = db.Column(db.Integer, primary_key=True)
    subreddit_name = db.Column(db.String(128), index=True)

    keywords = db.relationship('Keyword', secondary=subreddits_keywords, backref='subreddits', lazy='dynamic')

    // ...

Keywords

class Keyword(db.Model, JsonSerializer):
    """
    The keywords table.
    """

    __tablename__ = 'keywords'

    id = db.Column(db.Integer, primary_key=True)
    keyword = db.Column(db.String(128), index=True)

Problem

Since I want to keep track of how individual users are monitoring their subreddits and keywords, I append Keywords to Subreddits, and each Subreddit is appended to Users.

For example,

  • user1 monitors subreddit1, which is keeping track of keyword1 and keyword2.
  • keyword1 and keyword2 are appended to subreddit1.
  • subreddit1 is appended to user1.

Here is the problem. Since I am using many-to-many relationships, I would like to keep all data in each table unique. Is this a bad idea?

To illustrate what I mean, consider the following:

  • user1 monitors subreddit1, which is keeping track of keyword1 and keyword2.
  • user2 monitors subreddit1, which is keeping track of keyword3.

Since subreddit1 is unique in the database, keyword1 and keyword2 is appended to subreddit1 by user1, but keyword3 is also appended to subreddit1 by user2.

That means subreddit1 is keeping track of keyword1, keyword2, and keyword3. This is a problem when user2 is only monitoring keyword3 for subreddit1.

Is a many-to-many relationship here a bad idea?

Solutions I've Considered

I can keep different instances of subreddit1 for each user. And each of these subreddit1 will keep track of their respective keywords.

The result will be:

  • user1 monitors subreddit1, which is keeping track of keyword1 and keyword2.
  • user2 monitors subreddit1, which is keeping track of keyword3.

Since there are 2 subreddit1s, user1's subreddit1 is appended with keyword1 and keyword2, while user2's subreddit1 is appended with keyword3.

If I use this approach, what is the point of using a many-to-many relationship? Is there an alternative I can use? I thought the point of a many-to-many relationship is so that each entry in Subreddits can relate to different users and different keywords, without the need to create duplicate entries.


Solution

  • Since your design effectively allows any combination of user, subreddit and keyword (nothing inherently wrong with that) then one solution would be to create a “monitoring” table that holds foreign keys to the 3 tables and records each allowable combination of user/subreddit/keyword. Effectively you are creating a many-to-many table that links 3 tables rather than just 2