Search code examples
mysqlsqlquery-optimization

High performance query to count many to many data


Requests Table:

requests
    +id (INT) AUTO_INCREMENT
    +title (VARCHAR)

Statuses Table:

statuses
    +id (INT) AUTO_INCREMENT
    +title (VARCHAR)

Relationship Table:

request_status (MANY TO MANY)
    +id (INT) AUTO_INCREMENT
    +request_id (INT) Foreign Key
    +status_id (INT)

I want to count only those requests whose current status_id is 2. Current Request's Status (status_id) is the last one in the request_status table.

What will be the perfect high-performance query if data is around 1600k


Solution

  • Assuming that the latest status is the one with greatest id:

    SELECT COUNT(*)
    FROM request_status
    WHERE status_id = 2
    AND NOT EXISTS (
        SELECT 1
        FROM request_status AS x
        WHERE request_id = request_status.request_id
        AND id > request_status.id
    )
    

    Or this:

    SELECT COUNT(*)
    FROM (
        SELECT 1
        FROM request_status
        GROUP BY request_id
        HAVING MAX(CASE WHEN status_id = 2 THEN id END) = MAX(id)
    ) AS x
    

    You will need some indexes. I suggest creating these ones:

    KEY ix1 (request_id, status_id)
    KEY ix2 (status_id,  request_id)