I have a periodic check of a certain query (which by the way includes multiple tables) to add informational messages to the user if something has changed since the last check (once a day).
I tried to make it work with checksum_agg(binary_checksum(*))
, but it does not help, so this question doesn't help much, because I have a following case (oversimplified):
select checksum_agg(binary_checksum(*))
from
(
select 1 as id,
1 as status
union all
select 2 as id,
0 as status
) data
and
select checksum_agg(binary_checksum(*))
from
(
select 1 as id,
0 as status
union all
select 2 as id,
1 as status
) data
Both of the above cases result in the same check-sum, 49, and it is clear that the data has been changed.
This doesn't have to be a simple function or a simple solution, but I need some way to uniquely identify the difference like these in SQL server 2000.
checksum_agg appears to simply add the results of binary_checksum together for all rows. Although each row has changed, the sum of the two checksums has not (i.e. 17+32 = 16+33). This is not really the norm for checking for updates, but the recommendations I can come up with are as follows:
checksum_agg
, concatenate the checksums into a delimited string, and compare strings, along the lines of SELECT binary_checksum(*) + ',' FROM MyTable FOR XML PATH('')
. Much longer string to check and to store, but there will be much less chance of a false positive comparison.Every approach, including the one you started with, has pros and cons, with varying degrees of data size and processing requirements against accuracy. Depending on what level of accuracy you require, use the appropriate option. The only way to get 100% accuracy is to store all of the table data.
Alternatively, you can add a date_modified field to each table, which is set to GetDate() using after insert and update triggers. You can do SELECT COUNT(*) FROM #test WHERE date_modified > @date_last_checked
. This is a more common way of checking for updates. The downside of this one is that deletions cannot be tracked.
Another approach is to create a modified table, with table_name (VARCHAR) and is_modified (BIT) fields, containing one row for each table you wish to track. Using insert, update and delete triggers, the flag against the relevant table is set to True. When you run your schedule, you check and reset the is_modified flag (in the same transaction) - along the lines of SELECT @is_modified = is_modified, is_modified = 0 FROM tblModified
The following script generates three result sets, each corresponding with the numbered list earlier in this response. I have commented which output correspond with which option, just before the SELECT statement. To see how the output was derived, you can work backwards through the code.
-- Create the test table and populate it
CREATE TABLE #Test (
f1 INT,
f2 INT
)
INSERT INTO #Test VALUES(1, 1)
INSERT INTO #Test VALUES(2, 0)
INSERT INTO #Test VALUES(2, 1)
/*******************
OPTION 1
*******************/
SELECT CAST(binary_checksum(*) AS VARCHAR) + ',' FROM #test FOR XML PATH('')
-- Declaration: Input and output MD5 checksums (@in and @out), input string (@input), and counter (@i)
DECLARE @in VARBINARY(16), @out VARBINARY(16), @input VARCHAR(MAX), @i INT
-- Initialize @input string as the XML dump of the table
-- Use this as your comparison string if you choose to not use the MD5 checksum
SET @input = (SELECT * FROM #Test FOR XML RAW)
/*******************
OPTION 3
*******************/
SELECT @input
-- Initialise counter and output MD5.
SET @i = 1
SET @out = 0x00000000000000000000000000000000
WHILE @i <= LEN(@input)
BEGIN
-- calculate MD5 for this batch
SET @in = HASHBYTES('MD5', SUBSTRING(@input, @i, CASE WHEN LEN(@input) - @i > 8000 THEN 8000 ELSE LEN(@input) - @i END))
-- xor the results with the output
SET @out = CAST(CAST(SUBSTRING(@in, 1, 4) AS INT) ^ CAST(SUBSTRING(@out, 1, 4) AS INT) AS VARBINARY(4)) +
CAST(CAST(SUBSTRING(@in, 5, 4) AS INT) ^ CAST(SUBSTRING(@out, 5, 4) AS INT) AS VARBINARY(4)) +
CAST(CAST(SUBSTRING(@in, 9, 4) AS INT) ^ CAST(SUBSTRING(@out, 9, 4) AS INT) AS VARBINARY(4)) +
CAST(CAST(SUBSTRING(@in, 13, 4) AS INT) ^ CAST(SUBSTRING(@out, 13, 4) AS INT) AS VARBINARY(4))
SET @i = @i + 8000
END
/*******************
OPTION 2
*******************/
SELECT @out