i have a recursive table of category and a company table with fields like:
category(id, name, parent) // parent is foreign key to category id :)
company(id, category_1, category_2, category_3) // category_* is foreign key to category id
category tree is at max depth=3 ;
category cx -> category cy -> category cz
with knowledge of company categories are always linked to last category (c3), i want all categories that a company is linked to (c1z, c2z, c3z, c1y, c2y, c3y, c1x, c2x, c3x) for my search engine. //c1y is the parent of category_1 and c1x is parent of parent of category 1...
the best query i came up with is :
SELECT
ID,
NAME
FROM category c3
WHERE ID IN (
select category_1 from company where id=:companyId
union
select category_2 from company where id=:companyId
union
select category_3 from company where id=:companyId
union
select parent from category where id in (
select category_1 from company where id=:companyId
union
select category_2 from company where id=:companyId
union
select category_3 from company where id=:companyId
)
union
select parent from category where id in (
select parent from category where id in (
select category_1 from company where id=:companyId
union
select category_2 from company where id=:companyId
union
select category_3 from company where id=:companyId
)
)
)
it has so much duplicate in it. one for category_* in company. and one for repeating it multiple times.
any way to remove all this duplicates ?
--update--
suppose we solve the category-* field with using two tables what about recursive problem with 3 levels of category?
for example if there is only one category it would look like
SELECT
ID,
NAME
FROM category
WHERE ID IN (
select category_1 from company where id=:companyId
union
select parent from category where id in (
select category_1 from company where id=:companyId
)
union
select parent from category where id in (
select parent from category where id in (
select category_1 from company where id=:companyId
)
)
);
If you want to join data, use something like this (SQL server example):
DECLARE @category TABLE (id INT IDENTITY(1,1), name VARCHAR(30), parent INT) -- parent is foreign key to category id :)
DECLARE @company TABLE (id INT IDENTITY(1,1), category_1 INT, category_2 INT, category_3 INT) --category_* is foreign key to category->id
INSERT INTO @category (name, parent )
VALUES('Top category', null), ('Cars', 1)
INSERT INTO @company (category_1, category_2 , category_3 )
VALUES(2, null, null), (2, 2, null), (2, 2, 2)
SELECT t1.*, t2.*
FROM @category AS t1 INNER JOIN @company AS t2 ON t1.id = t2.category_1 or t1.id = t2.category_2 or t1.id = t2.category_3
Above code produces:
id name parent id category_1 category_2 category_3
2 Cars 1 1 2 NULL NULL
2 Cars 1 2 2 2 NULL
2 Cars 1 3 2 2 2
But, such kind of database structure is wrong!
Instead of one table
company(id, category_1, category_2, category_3)
create two tables
company(id, name)
comp_cat(id, comp_id, cat_id)
Why? I don't want to answer directly, so i ask you: 1) what happens when company is related to more than 3 categories? 2) why to save nulls in case when second and third category is not set?
In case of SQL Server, you can use Common Table Expressions:
;WITH CTE AS
(
SELECT id, category_1 AS cat_id
FROM @company
WHERE NOT category_1 IS NULL
UNION ALL
SELECT id, category_2 AS cat_id
FROM @company
WHERE NOT category_2 IS NULL
UNION ALL
SELECT id, category_3 AS cat_id
FROM @company
WHERE NOT category_3 IS NULL
)
SELECT DISTINCT t1.*, t2.*
FROM CTE AS t1 INNER JOIN @category AS t2 ON t1.cat_id = t2.id
Cheers, Maciej