mysql  database-design  foreign-keys  constraints

MySQL with Soft-Deletion, Unique Key and Foreign Key Constraints


Say I have two tables, user and comment. They have table definitions that look like this:

CREATE TABLE `user` (
  `id`       INTEGER NOT NULL AUTO_INCREMENT,
  `username` VARCHAR(255) NOT NULL,
  `deleted`  TINYINT(1) NOT NULL DEFAULT 0,
  PRIMARY KEY (`id`),
  UNIQUE KEY (`username`)
) ENGINE=InnoDB;

CREATE TABLE `comment` (
  `id`      INTEGER NOT NULL AUTO_INCREMENT,
  `user_id` INTEGER NOT NULL,
  `comment` TEXT,
  `deleted` TINYINT(1) NOT NULL DEFAULT 0,
  PRIMARY KEY (`id`),
  CONSTRAINT `fk_comment_user_id` FOREIGN KEY (`user_id`)
    REFERENCES `user` (`id`)
    ON DELETE CASCADE
    ON UPDATE CASCADE
) ENGINE=InnoDB;

This is great for enforcing data integrity and all that, but I want to be able to "delete" a user and keep all its comments (for reference's sake).

To this end, I've added a deleted column so that I can SET deleted = 1 on a record. By listing only records with deleted = 0 by default, I can hide all the deleted records until I need them.
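
A minimal sketch of that soft-delete pattern against the tables above. The id value 42 and the view name `active_user` are placeholders chosen for illustration, not part of the original schema:

```sql
-- Soft-delete one user. The row stays in place, so the foreign key
-- fk_comment_user_id never fires its ON DELETE CASCADE and the
-- user's comments are kept.
UPDATE `user` SET `deleted` = 1 WHERE `id` = 42;  -- 42 is a placeholder id

-- Default listing: only rows that have not been soft-deleted.
SELECT `id`, `username` FROM `user` WHERE `deleted` = 0;

-- Optional convenience view (hypothetical name) so that application
-- queries do not have to repeat the deleted = 0 filter.
CREATE VIEW `active_user` AS
  SELECT `id`, `username` FROM `user` WHERE `deleted` = 0;
```

Administrative or statistics queries can still read the base table directly to see the deleted rows.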

So far so good.

The problem comes when:

  • A user signs up with a username (say, "Sam"),
  • I soft-delete that user (for unrelated reasons), and
  • Someone else comes along to sign up as Sam, and suddenly we've violated the UNIQUE constraint on user.
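
The three steps above can be reproduced with the following statements, assuming the `user` table exactly as defined earlier and no existing row named 'Sam':

```sql
-- Step 1: a user signs up as Sam.
INSERT INTO `user` (`username`) VALUES ('Sam');

-- Step 2: that user is soft-deleted; the row (and its username) remains.
UPDATE `user` SET `deleted` = 1 WHERE `username` = 'Sam';

-- Step 3: someone else tries to sign up as Sam.
-- This is rejected with a duplicate-key error (MySQL error 1062),
-- because UNIQUE KEY (`username`) also covers soft-deleted rows.
INSERT INTO `user` (`username`) VALUES ('Sam');
```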

I want users to be able to edit their own usernames, so I shouldn't make username the primary key; even if I did, we would still have the same problem when deleting users.

Any thoughts?

Edit for clarification: Added following RedFilter's answer and comments below.

I'm concerned with the case where the "deleted" users and comments are not visible to the public, but are visible only to administrators, or are kept for the purpose of calculating statistics.

This question is a thought experiment, with the user and comment tables just being examples. Still, username wasn't the best one to use; RedFilter makes valid points about user identity, particularly when the records are presented in a public context.

Regarding "Why isn't username the primary key?": this is just an example, but if I apply this to a real problem I'll be needing to work within the constraints of an existing system that assumes the existence of a surrogate primary key.


Solution

  • Add a unique constraint on the fields (username, deleted), and change the type of the 'deleted' field to INTEGER.

    During the delete operation (this can be done in a trigger, or in the part of the code where you actually need to delete the user), copy the value of the id field into the deleted field.

    This approach allows you to:

    • keep usernames unique among active users (deleted = 0)
    • delete users with the same username several times
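
A hedged sketch of that schema change in MySQL. It assumes the original unnamed UNIQUE KEY (`username`) was given the default index name `username` (MySQL names an unnamed index after its first column; check with SHOW CREATE TABLE). The names `uq_user_username_deleted` and `user_soft_delete` and the id 42 are made up for illustration:

```sql
-- Swap the single-column unique key for a composite one and widen
-- the deleted flag so it can hold a copy of the id.
ALTER TABLE `user`
  DROP INDEX `username`,                      -- assumed default index name
  MODIFY `deleted` INTEGER NOT NULL DEFAULT 0,
  ADD UNIQUE KEY `uq_user_username_deleted` (`username`, `deleted`);

-- Soft-delete in application code: copy id into deleted.
-- AUTO_INCREMENT ids start at 1, so a deleted row never ends up with 0.
UPDATE `user` SET `deleted` = `id` WHERE `id` = 42 AND `deleted` = 0;

-- Alternative: let a trigger do the copy, so callers can keep writing
-- SET deleted = 1. Any non-zero value is rewritten to the row's id;
-- 0 (undelete) is passed through unchanged.
CREATE TRIGGER `user_soft_delete` BEFORE UPDATE ON `user`
FOR EACH ROW
  SET NEW.`deleted` = IF(NEW.`deleted` <> 0, NEW.`id`, 0);
```

Only one of the two delete styles is needed; the trigger simply moves the "copy id" rule out of application code.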

    The 'deleted' field can't have only two values, because the following scenario would not work:

    1. You create user 'Sam'.
    2. User 'Sam' is deleted.
    3. You create a new user with username 'Sam'.
    4. You try to delete the user with username 'Sam' and it fails, because you already have a record with username = 'Sam' and deleted = 1.
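
The scenario can be walked through on a scratch table. This is only a sketch: `user_demo` is a hypothetical stand-in for the revised `user` table, with the composite unique key and an INTEGER deleted column as the answer describes.

```sql
-- Scratch copy of the revised table (hypothetical name).
CREATE TABLE `user_demo` (
  `id`       INTEGER NOT NULL AUTO_INCREMENT,
  `username` VARCHAR(255) NOT NULL,
  `deleted`  INTEGER NOT NULL DEFAULT 0,
  PRIMARY KEY (`id`),
  UNIQUE KEY (`username`, `deleted`)
) ENGINE=InnoDB;

-- 1. Create user 'Sam'                      -> row ('Sam', deleted = 0)
INSERT INTO `user_demo` (`username`) VALUES ('Sam');

-- 2. Delete Sam by copying id into deleted  -> row ('Sam', deleted = id)
UPDATE `user_demo` SET `deleted` = `id`
 WHERE `username` = 'Sam' AND `deleted` = 0;

-- 3. Create a new 'Sam'                     -> second row ('Sam', 0), accepted
INSERT INTO `user_demo` (`username`) VALUES ('Sam');

-- 4. Delete the new Sam. This succeeds because its id differs from the
--    first Sam's id, so the (username, deleted) pair stays unique.
UPDATE `user_demo` SET `deleted` = `id`
 WHERE `username` = 'Sam' AND `deleted` = 0;

-- If steps 2 and 4 used SET `deleted` = 1 instead, step 4 would try to
-- create a second ('Sam', 1) pair and be rejected as a duplicate key.
```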