Tracking actual changes in rows when updating a table

I have a table that I keep updating with values from another source in my code. The value I write may or may not be the same as the value already in the row.

I need some kind of algorithm, maybe via MySQL (the database) or otherwise (in application code), so that I can later identify which rows have a changed value.

There is a date-modified column which I change, but that is not a true indicator because it is always updated. I want a way to determine whether some predefined columns have changed values.

One solution is to run a SELECT query, compare the values, and then update a changed flag in the table. But that seems complex and unsuitable for me, as the table has a lot of records.

Another solution might be to save an MD5 checksum of the values in a column and, while updating, compare the previous checksum with the current one.

I want to know the best solution.

Solution

There's a fairly simple way to handle this problem. Let's think of it as managing when a row's timestamp gets updated.

First of all, as I'm sure you know, your table needs a timestamp column with default settings for INSERT and UPDATE. That looks like this if the column is called ts.

ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP

Second, you can use an UPDATE query like this to change the values in a row.

UPDATE stock
   SET val1 = 'newval1',
       val2 = 'newval2',
       changed_by = 'current_user_id'
 WHERE id = 'id_to_change'
   AND NOT (val1 = 'newval1' AND val2 = 'newval2')

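The AND NOT clause in the WHERE prevents the update from taking place unless 'newval1' or 'newval2' would actually provide new values. This works because, when both values already match, no rows satisfy the WHERE clause, so the UPDATE modifies nothing. (If either column can be NULL, use MySQL's null-safe <=> operator for the comparisons; an ordinary comparison against NULL yields NULL, and the row could be skipped even though its value would change.)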

When the update is prevented from taking place, your automatically set ts column will not change. Neither will the changed_by column be set to the present user's user_id. So you keep the time and user of the most recent genuine change.
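
To make that behaviour concrete, here is a minimal sketch in Python. It is not MySQL: it uses the standard library's sqlite3 module as a stand-in so it runs without a server, and because SQLite has no ON UPDATE CURRENT_TIMESTAMP, the ts value is passed in explicitly. The helper names (make_db, guarded_update, audit) and the sample data are hypothetical.

```python
import sqlite3

def make_db():
    # In-memory SQLite table standing in for the MySQL "stock" table (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE stock ("
        " id INTEGER PRIMARY KEY,"
        " val1 TEXT NOT NULL,"
        " val2 TEXT NOT NULL,"
        " changed_by TEXT,"
        " ts TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT INTO stock (id, val1, val2, changed_by, ts)"
        " VALUES (1, 'a', 'b', 'alice', '2024-01-01 00:00:00')"
    )
    return conn

def guarded_update(conn, row_id, val1, val2, user, now):
    # The AND NOT clause makes the statement match no rows when nothing would change.
    # ts is set by hand here only because SQLite lacks ON UPDATE CURRENT_TIMESTAMP.
    conn.execute(
        "UPDATE stock SET val1 = ?, val2 = ?, changed_by = ?, ts = ?"
        " WHERE id = ? AND NOT (val1 = ? AND val2 = ?)",
        (val1, val2, user, now, row_id, val1, val2),
    )

def audit(conn, row_id):
    # Return who last genuinely changed the row, and when.
    return conn.execute(
        "SELECT changed_by, ts FROM stock WHERE id = ?", (row_id,)
    ).fetchone()

conn = make_db()
guarded_update(conn, 1, "a", "b", "bob", "2024-02-01 00:00:00")  # same values
print(audit(conn, 1))  # ('alice', '2024-01-01 00:00:00')
guarded_update(conn, 1, "a", "c", "bob", "2024-03-01 00:00:00")  # val2 differs
print(audit(conn, 1))  # ('bob', '2024-03-01 00:00:00')
```

The first call leaves both changed_by and ts alone because the values match; only the second call, which really changes val2, records a new user and time.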

Also, many host language interfaces to MySQL have a method call to determine how many rows were affected by a recent UPDATE operation. With this technique, you'll get back zero rows when the row is not updated. That might be convenient for your user interface.
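
As a rough illustration of checking the affected-row count from a host language, the sketch below uses Python's built-in sqlite3 module in place of a MySQL driver, purely so that it is runnable as-is. The function name update_if_changed and the sample row are hypothetical; a MySQL driver that follows the Python DB-API should expose a comparable rowcount attribute on its cursor, but check your driver's documentation.

```python
import sqlite3

def update_if_changed(conn, row_id, val1, val2):
    """Return True only if the row was actually rewritten."""
    cur = conn.execute(
        "UPDATE stock SET val1 = ?, val2 = ?"
        " WHERE id = ? AND NOT (val1 = ? AND val2 = ?)",
        (val1, val2, row_id, val1, val2),
    )
    # rowcount is the number of rows the UPDATE modified: 0 when nothing matched.
    return cur.rowcount == 1

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stock (id INTEGER PRIMARY KEY, val1 TEXT NOT NULL, val2 TEXT NOT NULL)"
)
conn.execute("INSERT INTO stock (id, val1, val2) VALUES (1, 'a', 'b')")

print(update_if_changed(conn, 1, "a", "b"))  # False: values are already the same
print(update_if_changed(conn, 1, "x", "b"))  # True: val1 really changed
print(update_if_changed(conn, 1, "x", "b"))  # False: repeating it is a no-op
```

Note that this helper also returns False when no row has the given id, so it cannot by itself distinguish "unchanged" from "not found".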

This technique also uses a single query, so it's safe if more than one user is trying to update the same row at the same time. There's no need to use a transaction to guarantee that.

(Note that tracking the changed_by user is optional, and will only work if your application can provide the current user's actual id.)

This is reasonably efficient as long as the database search for WHERE id = 'id_to_change' works quickly, which it will if id is the primary key or is otherwise indexed.

It does require reworking your application's UPDATE queries. But so would any other approach to this problem.