How to rewrite ON clause to avoid using OR


I have a very inefficient query, caused by an OR in the ON clause of a JOIN.

SELECT Meetings.SomeFields, Organisations.SomeFields
FROM Meetings
     LEFT JOIN Users
          ON (Meetings.AdministratorID = Users.UserID
              OR Meetings.ChairpersonID = Users.UserID)
     LEFT JOIN Organisations
          ON Organisations.OrganisationID = Users.OrganisationID;

(Note: This is a slight simplification to demonstrate the problem. In practice, this is a complex query with about 20 tables, but I have reduced the performance issue down to this small subset of the query.)

This attempts to implement the following logic for joining to an organisation: If there is a chairperson, then the chairperson's organisation should be used. Otherwise, the administrator's organisation should be used. Either or both of these fields will be filled in, and a subsequent WHERE clause ensures that no extraneous records (e.g. where both are filled in) are included in the output.

Using EXPLAIN, I can see that the OR operator prevents the indexes from being used, so every row combination has to be checked individually. With ~100k Meeting records and ~70k User records, that means ~7 billion combinations to check, which makes the query extremely slow.

Removing the OR Meetings.ChairpersonID = Users.UserID part of the expression results in an instant result as the query can use the indexes properly, but obviously doesn't give the result we need.

Can anyone suggest a way of rewriting this query to avoid the performance issues?

I am using MySQL 5.


Solution

  • If there is a chairperson, then the chairperson's organisation should be used. Otherwise, the administrator's organisation should be used.

    "If... otherwise..." is not OR, it's implication, which you can compactly express via COALESCE:

    SELECT Meetings.SomeFields, Organisations.SomeFields
    FROM Meetings
         LEFT JOIN Users
              ON COALESCE(Meetings.ChairpersonID, Meetings.AdministratorID) = Users.UserID
         LEFT JOIN Organisations
              ON Organisations.OrganisationID = Users.OrganisationID;
    

    This way, for each Meetings row, the corresponding Users row can be found by seeking the Users PK.
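
    To check this on your own data, you can prefix the query with EXPLAIN. This is only a sketch: it assumes Users.UserID is the primary key of Users, and SomeFields stands in for your real column lists, as in the question. If the optimizer does seek the PK, the Users row of the plan would typically show type eq_ref with key PRIMARY instead of a full scan.

    ```sql
    -- Assumption: Users.UserID is the PRIMARY KEY of Users.
    -- SomeFields is the question's placeholder for the real column lists.
    EXPLAIN
    SELECT Meetings.SomeFields, Organisations.SomeFields
    FROM Meetings
         LEFT JOIN Users
              -- One lookup value per Meetings row: chairperson first, administrator as fallback
              ON Users.UserID = COALESCE(Meetings.ChairpersonID, Meetings.AdministratorID)
         LEFT JOIN Organisations
              ON Organisations.OrganisationID = Users.OrganisationID;
    ```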


    One more thing... if ChairpersonID and AdministratorID cannot both be NULL, then you can inner-join to Users.
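
    A sketch of that inner-join variant, under the assumptions that at least one of the two ID columns is non-NULL on every Meetings row and that the chosen ID always matches an existing Users row (otherwise the INNER JOIN silently drops those meetings):

    ```sql
    -- Assumption: every Meetings row has a chairperson or an administrator,
    -- and that ID exists in Users. SomeFields is the question's placeholder.
    SELECT Meetings.SomeFields, Organisations.SomeFields
    FROM Meetings
         INNER JOIN Users
              ON Users.UserID = COALESCE(Meetings.ChairpersonID, Meetings.AdministratorID)
         -- Kept as LEFT JOIN so users without an organisation are still returned
         LEFT JOIN Organisations
              ON Organisations.OrganisationID = Users.OrganisationID;
    ```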