Search code examples
sqlsqlitequery-optimization

SQL IN in WHERE. How smart is the optimizer?


I have the following query to execute:

UPDATE scenario_group 
    SET 
        project_id = @projectId 
    WHERE
        scenario_group_id = @scenarioGroupId 
        AND @projectId IN (SELECT project_id FROM project);";

To me that looks like it would evaluate the @projectId IN (SELECT project_id FROM PROJECT) for every matching row. That would be bad. On the other hand, if the optimizer is smart it would see that the projects table never changes and therefore only do that check once.

This is the EXPLAIN. That looks like it would be doing the IN a lot. But I'm not sure how to interpret it.

id  parent  notused detail
3   0   0   SEARCH TABLE scenario_group USING INTEGER PRIMARY KEY (rowid=?)
8   0   0   USING ROWID SEARCH ON TABLE project FOR IN-OPERATOR

Do I need to rewrite this update? If so, how would I write this UPDATE to explicitly check for the existence of the project_id in advance?


Solution

  • EXISTS is often more efficient than IN. So I would just write the logic as:

    UPDATE scenario_group 
        SET project_id = @projectId 
        WHERE scenario_group_id = @scenarioGroupId AND
              EXISTS (SELECT 1 FROM project p WHERE p.project_id = @projectId);
    

    In particular, this will make use of an index on project(project_id) -- which seems quite likely given that project_id should be the primary key.