Search code examples
sql-servert-sqljoinunique

How to join two Tables which have no unique ID


I am trying to Left join Table 1 to table 2

Table1          Table2  
ID  Data        ID  Data2
1   r           1   q
2   t           1   a
3   z           2   x
1   u           3   c

After i have left joined this two Tables i would like get something like this

  Table1+2      
ID  Data    Data2
1   r        a
2   t        x
3   z        c
1   u        q

and NOT

  Table1+2      
ID  Data    Data2
1   r        q
2   t        x
3   z        c
1   u        q

and my question is: is there any possibility to tell table 2 that if u have used something for table 1, dont use it and give me next value. do i have to make it im T-SQL or to and new column where i can list if this id exists then write 2 if not 1(Number data). How can i solve this problem? Ty u in advance.


Solution

  • Since there's no uniqueness in the ID on both tables, lets add some uniqueness to it.
    So it can be used to join on.
    The window function ROW_NUMBER can be used for that.

    An example solution that gives the expected result:

    DECLARE @TestTable1 TABLE (ID INT, Data VARCHAR(1));
    DECLARE @TestTable2 TABLE (ID INT, Data VARCHAR(1));
    
    INSERT INTO @TestTable1 VALUES (1,'r'),(2,'t'),(3,'z'),(1,'u');         
    INSERT INTO @TestTable2 VALUES (1,'q'),(1,'a'),(2,'x'),(3,'c');
    
    select 
     t1.ID, t1.Data,
     t2.Data as Data2
    from (
      select ID, Data, 
      row_number() over (partition by ID order by Data) as rn
      from @TestTable1
    ) t1
    left join (
      select ID, Data, 
      row_number() over (partition by ID order by Data) as rn
      from @TestTable2
    ) t2 on t1.ID = t2.ID and t1.rn = t2.rn;
    

    Note: Because of the LEFT JOIN, this does assume the amount of same ID's in table2 are equal or lower can those on table1. But you can change that to a FULL JOIN if that's not the case.

    Returns :

    ID  Data Data2
    1   r    a
    1   u    q
    2   t    x
    3   z    c
    

    To get the other result could have been achieved in different ways.
    This is actually a more common situation.
    Where one wants all from Table 1, but only get one value from Table 2 for each record of Table 1.

    1) A top 1 with ties in combination with a order by rownumber()

    select top 1 with ties 
     t1.ID, t1.Data,
     t2.Data as Data2
    from @TestTable1 t1
    left join @TestTable2 t2 on t1.ID = t2.ID
    order by row_number() over (partition by t1.ID, t1.Data order by t2.Data desc);
    

    The top 1 with ties will only show those where the row_number() = 1

    2) Using the row_number in a subquery:

    select ID, Data, Data2
    from (
        select 
         t1.ID, t1.Data,
         t2.Data as Data2,
         row_number() over (partition by t1.ID, t1.Data order by t2.Data desc) as rn
        from @TestTable1 t1
        left join @TestTable2 t2 on t1.ID = t2.ID
    ) q
    where rn = 1;
    

    3) just a simple group by and a max :

    select t1.ID, t1.Data, max(t2.Data) as Data2
    from @TestTable1 t1
    left join @TestTable2 t2 on t1.ID = t2.ID
    group by t1.ID, t1.Data
    order by 1,2;
    

    All 3 give the same result:

    ID Data Data2
    1  r    q
    1  u    q
    2  t    x
    3  z    c