Search code examples
sqlgroup-byduplicatesconcatenation

SQL de-duplication of flight path combinations: Same Origin + '-' + Destination & Destination + '-' + Origin must be treated as duplicate


I'm trying to figure out how to crack this requirement. Here is the Flight table

Origin Destination
LAX PDX
NYC SEA
PDX LAX
SEA NYC

I need to group the data on LAX-PDX route with PDX-LAX route (no preference to which Origin or Destination is first in the string) and same for the data associated with SEA-NYC and NYC-SEA routes. Can you help me figure out how to treat those routes that the Origin and Destination are swapped as duplicates so I can do a GROUP BY on single route (No preference on which route survives as long as the outcome is either LAX-PDX or PDX-LAX)?


Solution

  • you would group by the concatenation logic

    select case when origin> destination then 
                concat(origin,'-',destination)
                else 
                concat(destination,'-',origin)
            end as grp_col
           ,count(*) as no_of_routes
       from table
    group by case when origin> destination then 
                concat(origin,'-',destination)
                else 
                concat(destination,'-',origin)
            end
    having count(*)>1