Search code examples
sqlsql-serveruniondistinct

Select distinct record from union query


I have multiple tables with OS_Data from different application databases and scan engines, which I've given the data a priority based on the clarity of the data. I want to get a list of unique hostnames with the lowest priority and the relevant OS of that priority

Table1

Hostname Operating_System OS_Priority
ServerA Windows Server 2016 1
ServerB Windows 2
ServerC Windows Server 2016 1
ServerE Windows Server 2016 1

Table2

Hostname Operating_System OS_Priority
ServerA Windows 2
ServerB Windows Server 2012 R2 1
ServerC Windows Server 2016 1809 1
ServerD Unknown 3
ServerE Unknown 3

Table3

Hostname Operating_System OS_Priority
ServerA Windows Server 2016 1809 1
ServerC Windows 2
ServerE Windows Server 2016 1809 1

Expected Result

Hostname Operating_System
ServerA Windows Server 2016
ServerB Windows Server 2012 R2
ServerC Windows Server 2016 1809
ServerD Unknown
ServerE Windows Server 2016 1809

I just need the top distinct Hostname along with it's Operating_System. If there is a tie, then it doesn't matter which result I get, I just need 1 record

Potentially I could be adding five or six tables together but just using three for the example.

This will give me a list of distinct records

SELECT Hostname, Operating_System, OS_Priority FROM [Table1] 
UNION
SELECT Hostname, Operating_System, OS_Priority FROM[Table2]
UNION
SELECT Hostname, Operating_System, OS_Priority FROM[Table3]
WHERE Hostname is not null
ORDER BY Hostname, OS_Priority

I've tried using MIN(OS_Priority) and GROUP BY Hostname but I can't get results to include Operating_System.

Any pointers in the right direction would be greatly appreciated. Thanks in advance


Solution

  • This is a job for ROW_NUMBER.

    Wrap your union a subquery and add a ROW_NUMBER() OVER (PARTITION BY HostName ORDER BY OS_priority), and finally only get the top row:

    SELECT *
    FROM (
      SELECT *
        , ROW_NUMBER() OVER (PARTITION BY HostName ORDER BY OS_priority) AS sort
      FROM (
        SELECT Hostname, Operating_System, OS_Priority
        FROM [Table1] 
        UNION ALL
        SELECT Hostname, Operating_System, OS_Priority
        FROM [Table2]
        UNION ALL
        SELECT Hostname, Operating_System, OS_Priority
        FROM [Table3]
        WHERE Hostname IS NOT NULL
      ) x
    ) x
    WHERE x.sort = 1