I have a table where multiple rows are duplicates because of two date columns which their value are different.
I want to know if it is accepted to use FIRST_VALUE in both columns like this, to remove duplicate on specified columns:
SELECT
EmployeeName,
FIRST_VALUE(StartDateTime) OVER(ORDER BY UpdatedDateTime DESC) AS StartDateTime,
FIRST_VALUE(UpdatedDateTime) OVER(ORDER BY UpdatedDateTime DESC) AS UpdatedDateTime
FROM @Employees;
If you need to remove duplicates over some field, you need to use ROW_NUMBER()
and CTE:
-- Sample data: dates duplicates
declare @t table (id int, dt date);
insert into @t values
(1, '2018-01-14'),
(1, '2018-01-14'),
(1, '2018-01-15'),
(1, '2018-01-15');
with cte as (
select *,
-- assign row number for each partition consisting of same date
row_number() over (partition by dt order by dt) as cnt
from @t
)
-- we're interested only in one row (i.e. first)
select id, dt from cte where cnt = 1;
/*
Output:
+-------+---------+
| id | dt |
+----+------------+
| 1 | 2018-01-14 |
|----|------------|
| 1 | 2018-01-15 |
+----+------------+
*/