Search code examples
sqldatabasesqlitemultiple-conditions

SQL left join with n conditions


Hi i got this database with cars and every car has options

I got a cars table and a options table. Link one to many, a car has n options a option has exactly one car.

I use the database to filter for cars with specific options (combination of options).

I am working with SQLite.

My query looks like this

SELECT DISTINCT * FROM cars WHERE cars.id IN (SELECT cars.id 
                    FROM cars
                    LEFT JOIN options AS option1 ON cars.id = option1.car_id 
                    LEFT JOIN options AS option2 ON cars.id = option2.car_id 
                    LEFT JOIN options AS option3 ON cars.id = option3.car_id 
                    LEFT JOIN options AS option4 ON cars.id = option4.car_id 
                    LEFT JOIN options AS option5 ON cars.id = option5.car_id 
                    LEFT JOIN options AS option6 ON cars.id = option6.car_id 
                    WHERE cars.vin LIKE '%EA40%'
                        AND option1.options_number = 415
                        AND option2.options_number = 362
                        AND option3.options_number = 502
                        AND option4.options_description like "Bordcomputer"
                        AND option5.options_number = 494
                        AND option6.options_number = 530
                        ) ;

This is just a hobby project and i am not a professional, this solution while it is working looks and feels very wrong to me with all those joins.

I hope someone can give me a hint in a more streamlined shorter version.

My goal would be to make n options searchable.


Solution

  • The DISTINCT is superfluous. There must be no duplicate rows in a car table. One row per car. No duplicates. Whenever you use DISTINCT ask yourself what forces you to use it. DISTINCT is a keyword that you should need very rarely. (Often DISTINCT is an indicator even for a poorly written query.)

    You don't need outer joins. If you only want cars that have the options, it must be inner joins. With outer joins you'd get all cars, because the outer join tells the DBMS: "give me all cars regardless whether you find a match in the option table or not". Your "outer" joins however are obfuscated inner joins (this is why the query works for you). In a result row that had no match in the outer joined table, the outer joined table's columns are all null. A WHERE o1.options_number = 415 dismisses such rows, so you turn the outer join into an inner join. Don't do that. Only outer join when you want to outer join. And if you outer join a table, put the conditions for the join into the ON clause.

    Moreover WHERE options_description like "Bordcomputer" is invalid SQL unless you have a column called "Bordcomputer". String literals have single quotes: 'Bordcomputer'. And then, why LIKE? What is this supposed to do? LIKE is for pattern matching, e.g. WHERE options_description LIKE 'B_rdcom%'. You want equality, so use =, not LIKE.

    At last, why the subquery? Why select all car IDs from the car table in order to select the car rows from the car table again?

    Your query corrected:

    SELECT c.*
    FROM cars c
    INNER JOIN options AS o1 ON o1.car_id = c.id AND o1.options_number = 415
    INNER JOIN options AS o2 ON o2.car_id = c.id AND o2.options_number = 362
    INNER JOIN options AS o3 ON o3.car_id = c.id AND o3.options_number = 502
    INNER JOIN options AS o4 ON o4.car_id = c.id AND o4.options_description = 'Bordcomputer'
    INNER JOIN options AS o5 ON o5.car_id = c.id AND o5.options_number = 494
    INNER JOIN options AS o6 ON o6.car_id = c.id AND o6.options_number = 530
    WHERE c.vin LIKE '%EA40%';
    

    But you are right: It doesn't look good with all the joins. Why is that? Because we select data FROM tables, and you only want data from the car table, and we tell the DBMS which rows we want in the WHERE clause. In order to look up other tables, we use EXISTS or IN:

    SELECT *
    FROM cars
    WHERE vin LIKE '%EA40%'
    AND id IN (SELECT car_id FROM options WHERE options_number = 415)
    AND id IN (SELECT car_id FROM options WHERE options_number = 362)
    AND id IN (SELECT car_id FROM options WHERE options_number = 502)
    AND id IN (SELECT car_id FROM options WHERE options_description = 'Bordcomputer')
    AND id IN (SELECT car_id FROM options WHERE options_number = 494)
    AND id IN (SELECT car_id FROM options WHERE options_number = 530);
    

    At last you have the option to aggregate the options per car and only get the car IDs that have all the options. E.g.:

    SELECT *
    FROM cars
    WHERE vin LIKE '%EA40%'
    AND id IN 
    (
      SELECT car_id
      FROM options
      GROUP BY car_id
      HAVING SUM(options_number = 415) > 0
         AND SUM(options_number = 362) > 0
         AND SUM(options_number = 502) > 0
         AND SUM(options_description = 'Bordcomputer') > 0
         AND SUM(options_number = 494) > 0
         AND SUM(options_number = 530) > 0
    );
    

    (This works, because in SQLite true = 1, false = 0, so if we add up the matches, we get 0 when there is no match. In other DBMS you'd need SUM(CASE WHEN options_number = 415 THEN 1 ELSE 0 END) or COUNT(CASE WHEN options_number = 415 THEN 1 END). The SQL standard offers the most readable option in my opinion, but only few DBMS support it yet: COUNT(*) FILTER(WHERE options_number = 415).)

    So if the option_number is unique, what I suppose it is, then the latter subquery becomes a mere

    (
      SELECT car_id
      FROM options
      GROUP BY car_id
      HAVING SUM(options_number IN (362, 415, 494, 502, 530)) = 5
         AND SUM(options_description = 'Bordcomputer') > 0
    );