I have two tables, one with sales information and the other with shipping information. Both are at the item level for each order. In other words, each record, in both tables, will only have a quantity of 1 even if there is multiple of the item within the order.
Table 1
order_number | product_code | cost | currency |
---|---|---|---|
100 | aa | $10 | USD |
100 | aa | $10 | USD |
101 | bb | $15 | USD |
Table 2
order_number | product_code | carrier_service | shipment_cost |
---|---|---|---|
100 | aa | Carrier A | $7 |
100 | aa | Carrier B | $7 |
101 | bb | Carrier C | $13 |
I need to join the tables in order to pull in aspects from each. The challenge is there is not a unique identifier for each record. I initially tried to join the two tables on order_number and product_code however this led to duplicated results (because the lack of uniqueness).
I then tried to add ROW_NUMBER() OVER (PARTITION... to assign a row number and tried to add that condition to the join but I am still failing. The values from the second CTE are not pulling through to the final results but if you were to run it separately, the values populate.
WITH ss AS (
SELECT
ROW_NUMBER() OVER (PARTITION BY product_code) AS row_id,
order_number,
product_code,
cost,
currency,
FROM sales
ORDER BY order_number, product_code, ROW_NUMBER() OVER (PARTITION BY product_code)),
sis AS (
SELECT
ROW_NUMBER() OVER (PARTITION BY product_code) AS row_id,
order_number,
product_code,
carrier_service,
shipment_cost
FROM items
GROUP BY 1,2,3,4,5,6
ORDER BY order_number, product_code, ROW_NUMBER() OVER (PARTITION BY product_code))
Select
ss.order_number,
ss.product_code,
ss.cost
ss.currency,
sis.carrier_service,
sis.shipment_cost,
FROM ss
LEFT JOIN sis
ON ss.order_number=sis.order_number AND ss.product_code=sis.product_code and ss.row_id=sis.row_id
ORDER BY ss.order_number, ss.product_code
Desired Result
order_number | product_code | cost | currency | carrier_service | shipment_cost |
---|---|---|---|---|---|
100 | aa | $10 | USD | Carrier A | $7 |
100 | aa | $10 | USD | Carrier B | $7 |
101 | bb | $15 | USD | Carrier C | $13 |
Actual Result
order_number | product_code | cost | currency | carrier_service | shipment_cost |
---|---|---|---|---|---|
100 | aa | $10 | USD | Null | Null |
100 | aa | $10 | USD | Null | Null |
101 | bb | $15 | USD | Null | Null |
Am I unable to join on row numbers? Or is there a better approach to this problem?
consider below
select order_number, product_code, cost, currency, carrier_service, shipment_cost
from (select *, row_number() over(partition by product_code) rn from table1) t1
left join (select *, row_number() over(partition by product_code) rn from table2) t2
using(order_number, product_code, rn)
order by order_number, product_code
if applied to sample data in your question - output is
Please note: even though I used over(partition by product_code)
as it is in your code - I feel that it should rather be over(partition by order_number)
- but it is not clear based on your particular question - so I am leaving it up to you while addressing the issue of left join that is the focus of your question