I have SQL server query to fetch data from SQL Server to Python data frame as below. I need to keep only unique rows based on two columns date & counter
SELECT
CAST([A2_TIRE_A_CO_D300_TIMESTAMP] AS DATE) AS date,
[A2_TIRE_A_CO_D362_VALUE] AS counter,
[A2_TIRE_A_CO_D400_VALUE] AS D400,
[A2_TIRE_A_CO_D402_VALUE] AS D402,
[A2_TIRE_A_CO_D412_VALUE] AS D412,
[A2_TIRE_A_CO_D414_VALUE] AS D414,
[A2_TIRE_A_CO_D416_VALUE] AS D416,
[A2_TIRE_A_CO_D420_VALUE] AS D420,
[A2_TIRE_A_CO_D422_VALUE] AS D422,
[A2_TIRE_A_CO_D432_VALUE] AS D432
FROM
[aaaa2_tttt_a].[dbo].[tttt_a]
WHERE
[A2_TIRE_A_CO_D300_TIMESTAMP] >= DATEADD(Hour, -1, GETDATE())
AND [A2_TIRE_A_CO_L102_VALUE] = 1
AND [A2_TIRE_A_CO_L100_VALUE] = 1
| date | counter | D400 | D402 | D412 |
|------------|---------|-------------|------------------------------|-------------|
| 2022-04-23 | 2434 | 12848 14393 | 20808 8257 12339 13360 12601 | 16705 16948 |
| 2022-04-23 | 2434 | 12848 14393 | 20808 8257 12339 13360 12601 | 16705 16948 |
| 2022-04-23 | 2434 | 12848 14393 | 20808 8257 12339 13360 12601 | 16705 16948 |
| 2022-04-23 | 2435 | 12848 14393 | 20808 8257 12339 13360 12601 | 16705 16948 |
| 2022-04-23 | 2435 | 12848 14393 | 20808 8257 12339 13360 12601 | 16705 16948 |
| 2022-04-23 | 2435 | 12848 14393 | 20808 8257 12339 13360 12601 | 16705 16948 |
| 2022-04-23 | 2435 | 12848 14393 | 20808 8257 12339 13360 12601 | 16705 16948 |
Requirement
| date | counter | D400 | D402 | D412 |
|------------|---------|-------------|------------------------------|-------------|
| 2022-04-23 | 2434 | 12848 14393 | 20808 8257 12339 13360 12601 | 16705 16948 |
| 2022-04-23 | 2435 | 12848 14393 | 20808 8257 12339 13360 12601 | 16705 16948 |
I tried distinct method and other sources. How can I do it?
I tried to create new column by concatenating date and counter and applying distinct:
DISTINCT(CONCAT(CAST([A2_TIRE_A_CO_D300_TIMESTAMP] AS DATE),[A2_TIRE_A_CO_D362_VALUE]) AS unique_column)
but I think it is not correct way
You can try to use and ORDER BY
window function of ROW_NUMBER
.
partition by
might need to put which column you want to represent unique
row.
SELECT TOP 1 WITH
CAST([A2_TIRE_A_CO_D300_TIMESTAMP] AS DATE) as date,
[A2_TIRE_A_CO_D362_VALUE] as counter,
[A2_TIRE_A_CO_D400_VALUE] as D400,
[A2_TIRE_A_CO_D402_VALUE] as D402,
[A2_TIRE_A_CO_D412_VALUE] as D412,
[A2_TIRE_A_CO_D414_VALUE] as D414,
[A2_TIRE_A_CO_D416_VALUE] as D416,
[A2_TIRE_A_CO_D420_VALUE] as D420,
[A2_TIRE_A_CO_D422_VALUE] as D422,
[A2_TIRE_A_CO_D432_VALUE] as D432
FROM [aaaa2_tttt_a].[dbo].[tttt_a]
where
[A2_TIRE_A_CO_D300_TIMESTAMP] >= DATEADD(Hour, -1, GETDATE())
and
[A2_TIRE_A_CO_L102_VALUE] =1
and
[A2_TIRE_A_CO_L100_VALUE] =1
order by row_number() over (partition by CAST([A2_TIRE_A_CO_D300_TIMESTAMP] AS DATE),[A2_TIRE_A_CO_D362_VALUE] order by date)
but I think you can use DISTINCT
directly from your sample data and expect result.
SELECT DISTINCT CAST([A2_TIRE_A_CO_D300_TIMESTAMP] AS DATE) as date,
[A2_TIRE_A_CO_D362_VALUE] as counter,
[A2_TIRE_A_CO_D400_VALUE] as D400,
[A2_TIRE_A_CO_D402_VALUE] as D402,
[A2_TIRE_A_CO_D412_VALUE] as D412,
[A2_TIRE_A_CO_D414_VALUE] as D414,
[A2_TIRE_A_CO_D416_VALUE] as D416,
[A2_TIRE_A_CO_D420_VALUE] as D420,
[A2_TIRE_A_CO_D422_VALUE] as D422,
[A2_TIRE_A_CO_D432_VALUE] as D432
FROM [aaaa2_tttt_a].[dbo].[tttt_a]
where
[A2_TIRE_A_CO_D300_TIMESTAMP] >= DATEADD(Hour, -1, GETDATE())
and
[A2_TIRE_A_CO_L102_VALUE] =1
and
[A2_TIRE_A_CO_L100_VALUE] =1