Search code examples
mysqldatabaserelational-databaseone-to-many

mysql one-to-many query with negation and/or multiple criteria


I thought a query like this would be pretty easy because of the nature of relational databases but it seems to be giving me a fit. I also searched around but found nothing that really helped. Here's the situation:

Let's say I have a simple relationship for products and product tags. This is a one-to-many relationship, so we could have the following:

productid  |  tag
========================
1          |  Car
1          |  Black
1          |  Ford
2          |  Car
2          |  Red
2          |  Ford
3          |  Car
3          |  Black
3          |  Lexus
4          |  Motorcycle
4          |  Black
5          |  Skateboard
5          |  Black
6          |  Skateboard
6          |  Green

What's the most efficient way to query for all (Ford OR Black OR Skateboard) AND NOT (Motorcycles OR Green)? Another query I'm going to need to do is something like all (Car) or (Skateboard) or (Green AND Motorcycle) or (Red AND Motorcycle).

There are about 150k records in the products table and 600k records in the tags tables, so the query is going to need to be as efficient as possible. Here's one query that I've been messing around with (example #1), but it seems to be taking about 4 seconds or so. Any help would be much appreciated.

SELECT p.productid
FROM   products p
       JOIN producttags tag1 USING (productid)
WHERE  p.active = 1
       AND tag1.tag IN ( 'Ford', 'Black', 'Skatebaord' )
       AND p.productid NOT IN (SELECT productid
                               FROM   producttags
                               WHERE  tag IN ( 'Motorcycle', 'Green' ));

 

Update

The quickest query I've found so far is something like this. It's taking 100-200ms but it seems pretty inflexible and ugly. Basically I'm grabbing all products that match Ford, Black, or Skateboard. Them I'm concatenating all of the tags for those matched products into a colon-separated string and removing all products that match on :Green: AND :Motorcycle:. Any thoughts?

SELECT p.productid,
       Concat(':', Group_concat(alltags.tag SEPARATOR ':'), ':') AS taglist
FROM   products p
       JOIN producttags tag1 USING (productid)
       JOIN producttags alltags USING (productid)
WHERE  p.active = 1
       AND tag1.tag IN ( 'Ford', 'Black', 'Skateboard' )
GROUP  BY tag1.productid
HAVING ( taglist NOT LIKE '%:Motorcycle:%'
         AND taglist NOT LIKE '%:Green:%' ); 

Solution

  • I'd write the exclusion join with no subqueries:

    SELECT p.productid
    FROM   products p
    INNER JOIN producttags AS t ON p.productid = t.productid
    LEFT OUTER JOIN producttags AS x ON p.productid = x.productid 
           AND x.tag IN ('Motorcycle', 'Green')
    WHERE  p.active = 1
           AND t.tag IN ( 'Ford', 'Black', 'Skateboard' )
           AND x.productid IS NULL;
    

    Make sure you have an index on products over the two columns (active, productid) in that order.

    You should also have an index on producttags over the two columns (productid, tag) in that order.

    Another query I'm going to need to do is something like all (Car) or (Skateboard) or (Green AND Motorcycle) or (Red AND Motorcycle).

    Sometimes these complex conditions are hard for the MySQL optimizer. One common workaround is to use UNION to combine simpler queries:

    SELECT p.productid
    FROM   products p
    INNER JOIN producttags AS t1 ON p.productid = t1.productid
    WHERE  p.active = 1
       AND t1.tag IN ('Car', 'Skateboard')
    
    UNION ALL
    
    SELECT p.productid
    FROM   products p
    INNER JOIN producttags AS t1 ON p.productid = t1.productid
    INNER JOIN producttags AS t2 ON p.productid = t2.productid 
    WHERE  p.active = 1
       AND t1.tag IN ('Motorcycle')
       AND t2.tag IN ('Green', 'Red');
    

    PS: Your tagging table is not an Entity-Attribute-Value table.