Search code examples
sqlsql-serverdistinct

Get Distinct rows in query


I have SQL server query to fetch data from SQL Server to Python data frame as below. I need to keep only unique rows based on two columns date & counter

SELECT 
    CAST([A2_TIRE_A_CO_D300_TIMESTAMP] AS DATE) AS date,
    [A2_TIRE_A_CO_D362_VALUE] AS counter,
    [A2_TIRE_A_CO_D400_VALUE] AS D400,
    [A2_TIRE_A_CO_D402_VALUE] AS D402,
    [A2_TIRE_A_CO_D412_VALUE] AS D412,
    [A2_TIRE_A_CO_D414_VALUE] AS D414,
    [A2_TIRE_A_CO_D416_VALUE] AS D416,
    [A2_TIRE_A_CO_D420_VALUE] AS D420,
    [A2_TIRE_A_CO_D422_VALUE] AS D422,
    [A2_TIRE_A_CO_D432_VALUE] AS D432
FROM 
    [aaaa2_tttt_a].[dbo].[tttt_a] 
WHERE
    [A2_TIRE_A_CO_D300_TIMESTAMP] >= DATEADD(Hour, -1, GETDATE()) 
    AND [A2_TIRE_A_CO_L102_VALUE] = 1 
    AND [A2_TIRE_A_CO_L100_VALUE] = 1
| date       | counter | D400        | D402                         | D412        |
|------------|---------|-------------|------------------------------|-------------|
| 2022-04-23 | 2434    | 12848 14393 | 20808 8257 12339 13360 12601 | 16705 16948 |
| 2022-04-23 | 2434    | 12848 14393 | 20808 8257 12339 13360 12601 | 16705 16948 |
| 2022-04-23 | 2434    | 12848 14393 | 20808 8257 12339 13360 12601 | 16705 16948 |
| 2022-04-23 | 2435    | 12848 14393 | 20808 8257 12339 13360 12601 | 16705 16948 |
| 2022-04-23 | 2435    | 12848 14393 | 20808 8257 12339 13360 12601 | 16705 16948 |
| 2022-04-23 | 2435    | 12848 14393 | 20808 8257 12339 13360 12601 | 16705 16948 |
| 2022-04-23 | 2435    | 12848 14393 | 20808 8257 12339 13360 12601 | 16705 16948 |

Requirement

| date       | counter | D400        | D402                         | D412        |
|------------|---------|-------------|------------------------------|-------------|
| 2022-04-23 | 2434    | 12848 14393 | 20808 8257 12339 13360 12601 | 16705 16948 |
| 2022-04-23 | 2435    | 12848 14393 | 20808 8257 12339 13360 12601 | 16705 16948 |

I tried distinct method and other sources. How can I do it?

I tried to create new column by concatenating date and counter and applying distinct:

DISTINCT(CONCAT(CAST([A2_TIRE_A_CO_D300_TIMESTAMP] AS DATE),[A2_TIRE_A_CO_D362_VALUE]) AS unique_column)

but I think it is not correct way


Solution

  • You can try to use and ORDER BY window function of ROW_NUMBER.

    partition by might need to put which column you want to represent unique row.

    SELECT TOP 1 WITH 
        CAST([A2_TIRE_A_CO_D300_TIMESTAMP] AS DATE) as date,
        [A2_TIRE_A_CO_D362_VALUE] as counter,
        [A2_TIRE_A_CO_D400_VALUE] as D400,
        [A2_TIRE_A_CO_D402_VALUE] as D402,
        [A2_TIRE_A_CO_D412_VALUE] as D412,
        [A2_TIRE_A_CO_D414_VALUE] as D414,
        [A2_TIRE_A_CO_D416_VALUE] as D416,
        [A2_TIRE_A_CO_D420_VALUE] as D420,
        [A2_TIRE_A_CO_D422_VALUE] as D422,
        [A2_TIRE_A_CO_D432_VALUE] as D432
    FROM [aaaa2_tttt_a].[dbo].[tttt_a] 
    where 
        [A2_TIRE_A_CO_D300_TIMESTAMP] >= DATEADD(Hour,  -1,  GETDATE()) 
    and 
        [A2_TIRE_A_CO_L102_VALUE] =1 
    and 
        [A2_TIRE_A_CO_L100_VALUE] =1
    order by row_number() over (partition by CAST([A2_TIRE_A_CO_D300_TIMESTAMP] AS DATE),[A2_TIRE_A_CO_D362_VALUE] order by date)
    

    but I think you can use DISTINCT directly from your sample data and expect result.

    SELECT DISTINCT CAST([A2_TIRE_A_CO_D300_TIMESTAMP] AS DATE) as date,
        [A2_TIRE_A_CO_D362_VALUE] as counter,
        [A2_TIRE_A_CO_D400_VALUE] as D400,
        [A2_TIRE_A_CO_D402_VALUE] as D402,
        [A2_TIRE_A_CO_D412_VALUE] as D412,
        [A2_TIRE_A_CO_D414_VALUE] as D414,
        [A2_TIRE_A_CO_D416_VALUE] as D416,
        [A2_TIRE_A_CO_D420_VALUE] as D420,
        [A2_TIRE_A_CO_D422_VALUE] as D422,
        [A2_TIRE_A_CO_D432_VALUE] as D432
    FROM [aaaa2_tttt_a].[dbo].[tttt_a] 
    where 
        [A2_TIRE_A_CO_D300_TIMESTAMP] >= DATEADD(Hour,  -1,  GETDATE()) 
    and 
        [A2_TIRE_A_CO_L102_VALUE] =1 
    and 
        [A2_TIRE_A_CO_L100_VALUE] =1