sql optimization with anti join

i have a recursive table of category and a company table with fields like:

category(id, name, parent) // parent is foreign key to category id :)
company(id, category_1, category_2, category_3) // category_* is foreign key to category id

category tree is at max depth=3 ;

category cx -> category cy -> category cz

with knowledge of company categories are always linked to last category (c3), i want all categories that a company is linked to (c1z, c2z, c3z, c1y, c2y, c3y, c1x, c2x, c3x) for my search engine. //c1y is the parent of category_1 and c1x is parent of parent of category 1...

the best query i came up with is :

SELECT
  ID,
  NAME
FROM category c3
WHERE ID IN (
    select category_1 from company where id=:companyId
  union
    select category_2 from company where id=:companyId
  union
    select category_3 from company where id=:companyId
  union
        select parent from category where id in (
          select category_1 from company where id=:companyId
          union
          select category_2 from company where id=:companyId
          union
          select category_3 from company where id=:companyId
        )
  union 
        select parent from category where id in (
          select parent from category where id in (
          select category_1 from company where id=:companyId
          union
          select category_2 from company where id=:companyId
          union
          select category_3 from company where id=:companyId
          )
        )
  )

it has so much duplicate in it. one for category_* in company. and one for repeating it multiple times.

any way to remove all this duplicates ?

--update--

suppose we solve the category-* field with using two tables what about recursive problem with 3 levels of category?

for example if there is only one category it would look like

SELECT
  ID,
  NAME
FROM category
WHERE ID IN (
  select category_1 from company where id=:companyId
  union
  select parent from category where id in (
    select category_1 from company where id=:companyId
  )
  union
  select parent from category where id in (
    select parent from category where id in (
      select category_1 from company where id=:companyId
    )
  )
);

Solution

If you want to join data, use something like this (SQL server example):

DECLARE @category TABLE (id INT IDENTITY(1,1), name VARCHAR(30), parent INT) -- parent is foreign key to category id :)
DECLARE @company TABLE (id INT IDENTITY(1,1), category_1 INT, category_2 INT, category_3 INT) --category_* is foreign key to category->id


INSERT INTO @category (name, parent )
VALUES('Top category', null), ('Cars', 1)

INSERT INTO @company (category_1, category_2 , category_3 )
VALUES(2, null, null), (2, 2, null), (2, 2, 2)


SELECT t1.*, t2.*
FROM @category AS t1 INNER JOIN @company AS t2 ON t1.id = t2.category_1 or t1.id = t2.category_2  or t1.id = t2.category_3

Above code produces:

id  name    parent  id  category_1  category_2  category_3
2   Cars    1   1   2   NULL    NULL
2   Cars    1   2   2   2   NULL
2   Cars    1   3   2   2   2

But, such kind of database structure is wrong!

Instead of one table

company(id, category_1, category_2, category_3)

create two tables

company(id, name)
comp_cat(id, comp_id, cat_id)

Why? I don't want to answer directly, so i ask you: 1) what happens when company is related to more than 3 categories? 2) why to save nulls in case when second and third category is not set?

In case of SQL Server, you can use Common Table Expressions:

;WITH CTE AS
(
    SELECT id, category_1 AS cat_id
    FROM @company 
    WHERE NOT category_1 IS NULL
    UNION ALL
    SELECT id, category_2 AS cat_id
    FROM @company 
    WHERE NOT category_2 IS NULL
    UNION ALL
    SELECT id, category_3 AS cat_id
    FROM @company 
    WHERE NOT category_3 IS NULL
)
SELECT DISTINCT t1.*, t2.*
FROM CTE AS t1 INNER JOIN @category AS t2 ON t1.cat_id = t2.id

Cheers, Maciej