Search code examples
mysqldatabaseimplementationtemporal-database

How to implement temporal data in MySQL


I currently have a non-temporal MySQL DB and need to change it to a temporal MySQL DB. In other words, I need to be able to retain a history of changes that have been made to a record over time for reporting purposes.

My first thought for implementing this was to simply do inserts into the tables instead of updates, and when I need to select the data, simply doing a GROUP BY on some column and ordering by the timestamp DESC.

However, after thinking about things a bit, I realized that that will really mess things up because the primary key for each insert (which would really just be simulating a number of updates on a single record) will be different and thus mess up any linkage that uses the primary key to link to other records in the DB.

As such, my next thought was to continue updating the main tables in the DB, but also create a new insert into an "audit table" that is simply a copy of the full record after the update, and then when I needed to report on temporal data, I could use the audit table for querying purposes.

Can someone please give me some guidance or links on how to properly do this?
Thank you.


Solution

  • Make the given table R temporal(ie, to maintain the history).

    One design is to leave the table R as it is and create a new table R_Hist with valid_start_time and valid_end_time. Valid time is the time when the fact is true.

    The CRUD operations can be given as:

    INSERT

    • Insert into both R
    • Insert into R_Hist with valid_end_time as infinity

    UPDATE

    • Update in R
    • Insert into R_Hist with valid_end_time as infinity
    • Update valid_end_time with the current time for the “latest” tuple

    DELETE

    • Delete from R
    • Update valid_end_time with the current time for the “latest” tuple

    SELECT

    • Select from R for ‘snapshot’ queries (implicitly ‘latest’ timestamp)
    • Select from R_Hist for temporal operations

    Instead, you can choose to design new table for every attribute of table R. By this particular design you can capture attribute level temporal data as opposed to entity level in the previous design. The CRUD operations are almost similar.