Folowing CDC results set for update operation. Only area
field has been updated.
On previous sreenshot only part of gields that contains in table. Fields are much more. Time to time some from them updates some not. In following query i try to show statistic of changes by fields in usable view.
with History AS (
SELECT
cz.GUID as Id,
cz.category,
isnull(cz.area, 0) as area,
isnull(cz.oilwidthmin,0) as oilwidthmin,
isnull(cz.oilwidthmax,0) as oilwidthmax,
isnull(cz.efectivwidthmin,0) as efectivwidthmin,
isnull(cz.efectivwidthmax,0) as efectivwidthmax,
isnull(cz.koafporistmin,0) as koafporistmin,
isnull(cz.koafporistmax,0) as koafporistmax,
CASE cz.__$operation
WHEN 1 THEN 'DELETE'
WHEN 2 THEN 'INSERT'
WHEN 3 THEN 'Before UPDATE'
WHEN 4 THEN 'After UPDATE'
END operation,
map.tran_begin_time as beginT,
map.tran_end_time as endT
FROM cdc.fn_cdc_get_all_changes_dbo_EXT_GeolObject_KategZalezh(sys.fn_cdc_get_min_lsn('dbo_EXT_GeolObject_KategZalezh'), sys.fn_cdc_get_max_lsn(), 'all') AS cz
INNER JOIN [cdc].[lsn_time_mapping] map
ON cz.[__$start_lsn] = map.start_lsn
)
SELECT field, val, operation, beginT, endT FROM History
unpivot ( [val] for field in
(
--category,
area,
oilwidthmin,
oilwidthmax,
efectivwidthmin,
efectivwidthmax,
koafporistmin,
koafporistmax))t where id = '2D166098-7CBD-4622-9EB0-000070506FE6'
Result of the query is following:
But previous result contains extra data. Expected result must be following:
I know that the CDC tracks changes by rows. Or maybe i'm wrong? If no, how can i doing some comparer for val
field in SQL. I have not deep knowledge in t-sql, and everything that comes to my mind is somehow used by cursors. Any ideas?
Maybe somehow use CT (change tracking)? Maybe somehow using group by
?
Almost correct answer. Folowing query return expected result:
WITH History AS (
SELECT
*,
CASE cz.__$operation
WHEN 1 THEN 'DELETE'
WHEN 2 THEN 'INSERT'
WHEN 3 THEN 'Before UPDATE'
WHEN 4 THEN 'After UPDATE'
END operation,
map.tran_begin_time as beginT,
map.tran_end_time as endT
FROM cdc.fn_cdc_get_all_changes_dbo_EXT_GeolObject_KategZalezh(sys.fn_cdc_get_min_lsn('dbo_EXT_GeolObject_KategZalezh'), sys.fn_cdc_get_max_lsn(), 'all') AS cz
INNER JOIN [cdc].[lsn_time_mapping] map
ON cz.[__$start_lsn] = map.start_lsn
where cz.GUID = '2D166098-7CBD-4622-9EB0-000070506FE6'
),
UnpivotedValues AS(
SELECT guid, field, val, operation, beginT, endT
FROM History
UNPIVOT ( [val] FOR field IN
(
area,
oilwidthmin,
oilwidthmax,
efectivwidthmin,
efectivwidthmax,
koafporistmin,
koafporistmax
))t
),
UnpivotedWithLastValue AS (
SELECT
*,
--Use LAG() to get the last value for the same field
LAG(val, 1) OVER (PARTITION BY field ORDER BY BeginT) LastVal
FROM UnpivotedValues
)
--Filter out record where the value equals the last value for the same field
SELECT * FROM UnpivotedWithLastValue WHERE val <> LastVal OR LastVal IS NULL ORDER BY guid
Result of this query looks like this:
But when WHERE cz.GUID =
is absent or if query using several GUIDs in WHERE
predicate i get folowing result:
This result for two GUIDs. Value for LastVal
in first row must be 16691. Like val
from row 4.
You cannot set up CDC to only track values for the columns that have changed. However, you can quite easily filter out the unchanged values in your query.
Consider the following query, which is a simplified copy of your original query:
WITH History AS (
SELECT
*,
CASE cz.__$operation
WHEN 1 THEN 'DELETE'
WHEN 2 THEN 'INSERT'
WHEN 3 THEN 'Before UPDATE'
WHEN 4 THEN 'After UPDATE'
END operation,
map.tran_begin_time as beginT,
map.tran_end_time as endT
FROM cdc.fn_cdc_get_all_changes_Dbo_YourTable(sys.fn_cdc_get_min_lsn('Dbo_YourTable'), sys.fn_cdc_get_max_lsn(), 'all') AS cz
INNER JOIN [cdc].[lsn_time_mapping] map
ON cz.[__$start_lsn] = map.start_lsn
),
UnpivotedValues AS(
SELECT id, field, val, operation, beginT, endT, t.tran_id
FROM History
UNPIVOT ( [val] FOR field IN
(Column1, Column2, Column3))t
),
UnpivotedWithLastValue AS (
SELECT
*,
--Use LAG() to get the last value for the same field
LAG(val, 1) OVER (PARTITION BY id, field ORDER BY BeginT) LastVal
FROM UnpivotedValues
)
--Filter out record where the value equals the last value for the same field
SELECT * FROM UnpivotedWithLastValue WHERE val <> LastVal OR LastVal IS NULL
ORDER BY Id, beginT
In this query I have used the LAG() function to get the last value for each field. Based on this value you can filter out the unchanged records in your final query, and shown above.