Search code examples
sqliteselectfilterkey-valueentity-attribute-value

SQL query to find articles having some properties and missing others


In SQLite, I want to write a SELECT query to find articles, that have certain properties+value or that does NOT have certain properties, where properties are in another table.

This is how I've organized a database I've got:

Articles:

ID  Name
1   Paper clip
2   Water hose
3   Rubber duck

Features
ID    Articles_ID  Name    Value
1     1            Color   White
2     2            Color   Yellow
3     2            Length  1.4m
4     3            Color   Yellow

If I would like to find articles with the color Yellow I could do:

SELECT distinct a.Name from Articles a join Features f
             where a.ID = f.Articles_ID
               and f.Name = "Color"
               and f.Value = "Yellow";

But what if I would like to find articles with the color yellow, but without any Length feature. I.e. I want the rubber duck, since it has no Length, but I don't want the water hose.

In my UI one can choose:

Color: [ ] <empty>
       [x] Yellow
       [ ] White
       [ ] ...

Length: [x] <empty>
        [ ] 0.7m
        [ ] 1.4m
        [ ] ...

My articles table has ~20k rows and features ~200k.

Perhaps my database not suitable for this kind of queries? I could easily regenerate it if necessary.


Solution

  • (Note: you probably need the article IDs, in which case you don't need DISTINCT; and the syntax for joins should be tableA JOIN tableB ON condition.)

    There are three possibilities: You can get all yellow articles, and then exclude all articles that have a length:

    SELECT a.ID,
           a.Name
    FROM Articles a
    JOIN Features f ON a.ID = f.Articles_ID
    WHERE f.Name = 'Color'
      AND f.Value = 'Yellow'
    EXCEPT
    SELECT a.ID,
           a.Name
    FROM Articles a
    JOIN Features f ON a.ID = f.Articles_ID
    WHERE f.Name = 'Length'
    

    Alternatively, use a subquery to match records for which corresponding Length records do not exist:

    SELECT a.ID,
           a.Name
    FROM Articles a
    JOIN Features f ON a.ID = f.Articles_ID
    WHERE f.Name = 'Color'
      AND f.Value = 'Yellow'
      AND NOT EXISTS (SELECT ID
                      FROM Features f2
                      WHERE f2.Articles_ID = a.ID
                        AND f2.Name = 'Length')
    

    Alternatively, use a LEFT JOIN to join with all Length features, and match those records where such a join did not succeed. (With the Color condition in the first feature join, this query is the most regular.)

    SELECT a.ID,
           a.Name
    FROM Articles a
    JOIN      Features f1 ON  a.ID = f1.Articles_ID
                          AND f1.Name = 'Color'
    LEFT JOIN Features f2 ON  a.ID = f2.Articles_ID
                          AND f2.Name = 'Length'
    WHERE f1.Value = 'Yellow'
      AND f2.Value IS NULL
    

    Which query is most efficient depends on what indexes you have and whether SQLite decides to use them. Use EXPLAIN QUERY PLAN to check.