Search code examples
databasepostgresqljoinrecursive-queryrow-number

Write a conditional SQL query with certain condition to find an item not available in one but can be found in other


Table t1 contains data of those product which is not available to those retailer for a given company while table t2 contains master data of retailer-item and its availability status.

I want to write a sql query to find out- those items which is not available to any retailer (i.e. in table t1) , we have to find is there any other retailer who can provide the same item for that same company(i.e. in table t2).

Note:- For a given company , there are many retailers and if any item which we are finding is available to more than 1 retailer then we will choose those retailer who has most quantity of that item available

Ex:-

  1. In t1- retailer a and product a3 - it is available in both b and c in t2 but the output should contain retailer c as it has more quantity available.

2.In t1- retailer b and product b4 - it is available in both a and c in t2 but for retailer- a currently its availability is not there, so, the output should contain retailer c.

** Insert Sql Script**

t1:-

CREATE TABLE t1 
(
    company VARCHAR(512),
    retailer    VARCHAR(512),
    product VARCHAR(512),
    availability    VARCHAR(512)
);

INSERT INTO t1 (company, retailer, product, availability) VALUES
    ('1', 'a', 'a1', '0'),
    ('1', 'a', 'a2', '0'),
    ('1', 'a', 'a3', '0'),
    ('1', 'b', 'b1', '0'),
    ('1', 'b', 'b2', '0'),
    ('1', 'b', 'b3', '0'),
    ('1', 'b', 'b4', '0'),
    ('1', 'c', 'c1', '0'),
    ('1', 'c', 'c2', '0');

t2:-

CREATE TABLE t2 
(
    company VARCHAR(512),
    retailer    VARCHAR(512),
    product VARCHAR(512),
    availability    VARCHAR(512),
    quantity_available VARCHAR(512)
);

INSERT INTO t2 (company, retailer, product, availability, quantity_available) VALUES
    ('1', 'a', 'a1', '0', '0'),
    ('1', 'a', 'a2', '0', '0'),
    ('1', 'a', 'a3', '0', '10'),
    ('1', 'a', 'b2', '1', '30'),
    ('1', 'a', 'b4', '0', '0'),
    ('1', 'a', 'c1', '1', '35'),
    ('1', 'a', 'b3', '0', '0'),
    ('1', 'b', 'b1', '0', '0'),
    ('1', 'b', 'b2', '0', '0'),
    ('1', 'b', 'b3', '0', '0'),
    ('1', 'b', 'b4', '0', '0'),
    ('1', 'b', 'a3', '1', '25'),
    ('1', 'b', 'c2', '1', '30'),
    ('1', 'b', 'c1', '1', '50'),
    ('1', 'c', 'c1', '0', '0'),
    ('1', 'c', 'c2', '0', '0'),
    ('1', 'c', 'a1', '1', '40'),
    ('1', 'c', 'b2', '1', '5'),
    ('1', 'c', 'b1', '1', '20'),
    ('1', 'c', 'b4', '1', '10'),
    ('1', 'c', 'a3', '1', '15');

** Sample Data**

t1:-

company retailer product availability
1 a a1 0
1 a a2 0
1 a a3 0
1 b b1 0
1 b b2 0
1 b b3 0
1 b b4 0
1 c c1 0
1 c c2 0

t2:-

company retailer product availability Quantity_available
1 a a1 0 0
1 a a2 0 0
1 a a3 0 10
1 a b2 1 30
1 a b4 0 0
1 a c1 1 35
1 a b3 0 0
1 b b1 0 0
1 b b2 0 0
1 b b3 0 0
1 b b4 0 0
1 b a3 1 25
1 b c2 1 30
1 b c1 1 50
1 c c1 0 0
1 c c2 0 0
1 c a1 1 40
1 c b2 1 5
1 c b1 1 20
1 c b4 1 10
1 c a3 1 15

expected output

Company Retailer Product Corresponding Retailer
1 a a1 c
1 a a2 na
1 a a3 c
1 b b1 c
1 b b2 a
1 b b3 na
1 b b4 c
1 c c1 b
1 c c2 b

** what I have tried**

with cte1 as (select * from t2 where availability=1)
select * from cte1 left join t1 on  cte1.product=t1.product

I am getting issue when same product is available in multiple retailer , so basically not able to iterate over one retailer and predict the ranking according to quantity at a time then move to other if not found and so.

FYI- My database support PostgreSQL.

Please help me out, Thanks.


Solution

  • The following does what you want (at least with your sample data):

    WITH cte AS ( 
        SELECT DISTINCT ON (product) product, retailer, company
        FROM t2
        WHERE availability = '1'
        ORDER BY product, quantity_available DESC
        )
    SELECT t.company, t.retailer, t.product, 
    COALESCE(c.retailer, 'na') as "Corresponding_Retailer" 
    FROM t1 t
    LEFT JOIN cte c ON c.company = t.company AND t.product = c.product
    WHERE t.availability = '0';
    

    Using your data, it gives these results:

    company retailer product Corresponding_Retailer
    1 a a1 c
    1 a a2 na
    1 a a3 b
    1 b b1 c
    1 b b2 c
    1 b b3 na
    1 b b4 c
    1 c c1 b
    1 c c2 b

    Note that this is the same as your expected outcome with the exception of retailer a, product a3, where the corresponding retailer is b not c. However, I consider this to be correct, since in your sample data, b has 25 of a3, whereas c has 15.

    By way of explanation, the key element here is the combination of DISTINCT ON and ORDER BY. This effectively returns for each product the retailer with the most quantity_available - given ORDER BY DESC. Note the column with the DISTINCT ON must be the first column in the ORDER BY. It is then a simple matter to LEFT JOIN on these results.

    By the way, please do not use varchars for every column! Quantities should be integers and your availability columns should be booleans.

    With a full set of data you may need to use SELECT DISTINCT ON (product, company) with ORDER BY product, copmany, quantity_available, depending on whether the same product is in more than one company.