Search code examples
sqldatabase-designdelete-row

Are there problems with this 'Soft Delete' solution using EAV tables?


I've read some information about the ugly side of just setting a deleted_at field in your tables to signify a row has been deleted.

Namely
http://richarddingwall.name/2009/11/20/the-trouble-with-soft-delete/

Are there any potential problems with taking a row from a table you want to delete and pivoting it into some EAV tables?

For instance.

Lets Say I have two tables deleted and deleted_row respectively described as follows.

    mysql> describe deleted;
    +------------+--------------+------+-----+---------+----------------+
    | Field      | Type         | Null | Key | Default | Extra          |
    +------------+--------------+------+-----+---------+----------------+
    | id         | int(11)      | NO   | PRI | NULL    | auto_increment | 
    | tablename  | varchar(255) | YES  |     | NULL    |                | 
    | deleted_at | timestamp    | YES  |     | NULL    |                | 
    +------------+--------------+------+-----+---------+----------------+

    mysql> describe deleted_rows;
    +--------+--------------+------+-----+---------+----------------+
    | Field  | Type         | Null | Key | Default | Extra          |
    +--------+--------------+------+-----+---------+----------------+
    | id     | int(11)      | NO   | PRI | NULL    | auto_increment | 
    | entity | int(11)      | YES  | MUL | NULL    |                | 
    | name   | varchar(255) | YES  |     | NULL    |                | 
    | value  | blob         | YES  |     | NULL    |                | 
    +--------+--------------+------+-----+---------+----------------+

Now when you wanted to delete a row from any table you would delete it from the table then insert it into these tables as such.

    deleted
    +----+-----------+---------------------+
    | id | tablename | deleted_at          |
    +----+-----------+---------------------+
    |  1 | products  | 2011-03-23 00:00:00 | 
    +----+-----------+---------------------+

    deleted_row
    +----+--------+-------------+-------------------------------+
    | id | entity | name        | value                         |
    +----+--------+-------------+-------------------------------+
    |  1 |      1 | Title       | A Great Product               | 
    |  2 |      1 | Price       | 55.00                         | 
    |  3 |      1 | Description | You guessed it... it's great. | 
    +----+--------+-------------+-------------------------------+

A few things I see off the bat.

  1. You'll need to use application logic to do the pivot (Ruby, PHP, Python, etc)
  2. The table could grow pretty big because I'm using blob to handle the unknown size of the row value

Do you see any other glaring problems with this type of soft delete?


Solution

  • Why not mirror your tables with archive tables?

    create table mytable(
       col_1 int
      ,col_2 varchar(100)
      ,col_3 date 
      ,primary key(col_1)
    )
    
    create table mytable_deleted(
       delete_id  int      not null auto_increment
      ,delete_dtm datetime not null
    -- All of the original columns
      ,col_1 int
      ,col_2 varchar(100)
      ,col_3 date 
      ,index(col_1)
      ,primary key(delete_id)
    )
    

    And then simply add on-delete-triggers on your tables that inserts the current row in the mirrored table before the deletion? That would provide you with dead-simple and very performant solution.

    You could actually generate the tables and trigger code using the data dictionary.

    Note that I might not want to have a unique index on the original primary key (col_1) in the archive table, because you may actually end up deleting the same row twice over time if you are using natural keys. Unless you plan to hook up the archive tables in your application (for undo purposes) you can drop the index entirely. Also, I added the time of delete (deleted_dtm) and a surrogate key that can be used to delete the deleted (hehe) rows.

    You may also consider range partitioning the archive table on deleted_dtm. This makes it pretty much effortless to purge data from the tables.