Search code examples
sqlinner-joinlimitsybase

Limit result of inner join to 1


I have this schema:

create table ord(id int);
create table orderpos(id int, orderid int, descr varchar(255));

insert into ord(id) values (1);
insert into ord(id) values (2);
insert into orderpos(id, orderid, descr) values(1,1,'huba');
insert into orderpos(id, orderid, descr) values(2,1,'blub');
insert into orderpos(id, orderid, descr) values(3,2,'foo');

So the data in ord looks like:

id
1
2

and in orderpos:

id  orderid descr
1   1       huba
2   1       blub
3   2       foo

I'd like to have:

oId opId    orderid   descr
1   1       1         huba
2   3       2         foo

but with this query:

select o.id as oId, op.id as opId, op.orderid, op.descr from ord o
inner join orderpos op on op.orderid = o.id;

I get:

oId opId    orderid   descr
1   1       1         huba
1   2       1         blub
2   3       2         foo

So somehow I have to limit the innerjoin, but how? I tried severel approaches found via google, but nothing seems to work.

This seems to work in SQL-Editor of W3-Schools:

select o.id as oId, op.id as opId, op.orderid, op.descr 
from ord o
inner join (select * from orderpos op2 group by op2.orderid) op on op.orderid = o.id;

And it seems logical to me that it doesn't work as long as you have different values in 'descr'. No clue why this works in the SQL-Editor on W3Schools.

Sybase version: Adaptive Server Enterprise/15.5/EBF 19902 SMP ESD#5.1/P/x86_64/Enterprise Linux/asear155/2594/64-bit/FBO/Wed Jun 6 01:20:27 2012


Solution

  • Since no logic has been provided for determining which row is the 'first record' (see @HoneyBadger's comments), and you've commented that ordering is not important for your business case, we'll use max() to extract a single row per unique orderid:

    select  o.id        as oId,
            op.id       as opId,
            op.orderid,
            op.descr
    
    from    ord      o,
            orderpos op,
    
            (select op2.orderid,
                    max(op2.descr) as descr
            from    orderpos op2
            group by op2.orderid
            ) dt
    
    where   op.orderid = o.id
    and     op.orderid = dt.orderid
    and     op.descr   = dt.descr
    order by 1,2
    go
    
     oId     opId    orderid  descr
     ------- ------- -------- ----------
           1       1        1 huba
           2       3        2 foo
    

    Your join criteria between ord(o) and orderpos(op) remains the same; the addition of the derived table (dt) allows us to further limit the rows that are of interest from orderpos(op).

    In this case it's just a coincidence that our use of max() generated the output you're looking for. [HINT: replace max() with min() to display '2/blub' instead of '1/huba'.]


    Same idea but using a correlated sub-query instead of a derived table:

    select  o.id        as oId,
            op.id       as opId,
            op.orderid,
            op.descr
    
    from    ord      o,
            orderpos op
    
    where   op.orderid = o.id
    and     op.descr = (select      max(op2.descr)
                            from    orderpos op2
                            where   op2.orderid = op.orderid)
    order by 1,2
    go
    
     oId     opId    orderid  descr
     ------- ------- -------- ----------
           1       1        1 huba
           2       3        2 foo
    

    Alternatively we could have replaced max(op2.descr) with max(op2.id).

    The key issue being to pick some method ... any method in this case ... that allows us to pick a single row from op2 for a given orderid (via the group by op2.orderid).

    NOTE: The proposed solutions work as long a given descr value is unique for a given orderid (eg, you can't have 2x rows in orderpos with the same orderid and descr). If this is an invalid assumption then we'll need more sample data and/or a better description of the data in the orderpos table (eg, pk contraint, unique index, etc).