I'm trying to match two tables where one of the tables stores multiple values as a string.
In the example below I need to classify each product ordered from the #Orders table with a #NewProduct.NewProductId.
The issue I'm having is sometimes we launch a new product like "Black Shirt", then later we launch an adaption to that product like "Black Shirt Vneck".
I need to match both changes correctly to the #Orders table. So if the order has Black and Shirt, but not Vneck, it's considered a "Black Shirt", but if the order has Black and Shirt and Vneck, it's considered a "Black Vneck Shirt."
The code below is an example - the current logic I'm using returns duplicates with the Left Join. Also, assume we can modify the format of #NewProducts but not #Orders.
IF OBJECT_ID('tempdb.dbo.#NewProducts') IS NOT NULL DROP TABLE #NewProducts
CREATE TABLE #NewProducts
(
ProductType VARCHAR(MAX)
, Attribute_1 VARCHAR(MAX)
, Attribute_2 VARCHAR(MAX)
, NewProductId INT
)
INSERT #NewProducts
VALUES
('shirt', 'black', 'NULL', 1),
('shirt', 'black', 'vneck', 2),
('shirt', 'white', 'NULL', 3)
IF OBJECT_ID('tempdb.dbo.#Orders') IS NOT NULL DROP TABLE #Orders
CREATE TABLE #Orders
(
OrderId INT
, ProductType VARCHAR(MAX)
, Attributes VARCHAR(MAX)
)
INSERT #Orders
VALUES
(1, 'shirt', 'black small circleneck'),
(2, 'shirt', 'black large circleneck'),
(3, 'shirt', 'black small vneck'),
(4, 'shirt', 'black small vneck'),
(5, 'shirt', 'white large circleneck'),
(6, 'shirt', 'white small vneck')
SELECT *
FROM #Orders o
LEFT JOIN #NewProducts np
ON o.ProductType = np.ProductType
AND CHARINDEX(np.Attribute_1, o.Attributes) > 0
AND (
CHARINDEX(np.Attribute_2, o.Attributes) > 0
OR np.Attribute_2 = 'NULL'
)
You seem to want the longest overlap:
SELECT *
FROM #Orders o OUTER APPLY
(SELECT Top (1) np.*
FROM #NewProducts np
WHERE o.ProductType = np.ProductType AND
CHARINDEX(np.Attribute_1, o.Attributes) > 0
ORDER BY ((CASE WHEN CHARINDEX(np.Attribute_1, o.Attributes) > 0 THEN 1 ELSE 0 END) +
(CASE WHEN CHARINDEX(np.Attribute_2, o.Attributes) > 0 THEN 1 ELSE 0 END)
) DESC
) np;
I can't say I'm thrilled with the need to do this. It seems like the Orders
should contain numeric ids that reference the actual product. However, I can see how something like this is sometimes necessary.