I have a query like this:
SELECT DISTINCTROW
customers.id,
customers.customer_name
FROM customers
INNER JOIN rentals ON customers.id = rentals.id_customer
INNER JOIN rentals_billing ON rentals.id = rentals_billing.id_rental
INNER JOIN vehicles ON vehicles.id = rentals.id_vehicle
WHERE
(rentals_billing.customer_id_facsimile IS NULL AND rentals_billing.rental_payable = 1)
OR (
(rentals_billing.missing_rent_id_doc IS NULL OR missing_rent_id_doc <= 0)
AND missing_rent_fee > 0
)
AND customers.delete_date IS NULL
AND rentals.delete_date IS NULL
AND vehicles.delete_date IS NULL
AND (rentals.mode = 'MODEA' OR rentals.mode = 'MODEB' OR rentals.mode = 'MODEC')
Running on a MySQL server 5.6.13 (on Windows Server 2008 R2 SP1) the virtual server is setup with 4Gb of RAM and a 2.1GHz processor (I think single core).
At the present time:
customers
has around 3k records,rentals
has around 150k records,rentals_billing
has around 150k records,vehicles
has around 8k recordsThe query frequently goes in timeout (tooks more than 10-20 seconds).
I tried indexing some of the tables and keys involved in the query but without much luck. There's a way to optimize such query ?
Best regards, Mike
First of all, AND
has precedence over OR
, so you have
WHERE
(customer_id_facsimile IS NULL AND rentals_billing.rental_payable = 1)
OR
(
(missing_rent_id_doc IS NULL OR missing_rent_id_doc <= 0) AND missing_rent_fee > 0)
AND customers.delete_date IS NULL
AND ...
)
While this may be what you want, I guess you'll rather want
WHERE
(
(customer_id_facsimile IS NULL AND rentals_billing.rental_payable = 1)
OR
((missing_rent_id_doc IS NULL OR missing_rent_id_doc <= 0) AND missing_rent_fee > 0)
)
AND customers.delete_date IS NULL
AND ...
Then DISTINCTROW
(or more commonly just DISTINCT
) is very often a sign for a poorly written query where someone joins several tables, produces loads of duplicate rows and then must get rid of the duplicates they created themselves. Such unnecessary big intermediate results can slow down a query a lot. And this is what you are doing here.
You want to select from customers, but only those that match certain criteria. So why the joins? Select FROM customers
and filter in WHERE
.
SELECT *
FROM customers c
WHERE c.delete_date IS NULL
AND EXISTS
(
SELECT NULL
FROM rentals r
JOIN rentals_billing rb on rb.id_rental = r.id
JOIN vehicles v on v.id = r.id_vehicle AND v.delete_date IS NULL
WHERE r.id_customer = c.id
AND r.delete_date IS NULL
AND r.mode IN ('MODEA', 'MODEB', 'MODEC')
AND
(
(???.customer_id_facsimile IS NULL AND rb.rental_payable = 1)
or
((rb.missing_rent_id_doc IS NULL OR rb.missing_rent_id_doc <= 0) AND ???.missing_rent_fee > 0)
)
);
I've put ???
where I don't know which table the column belongs to.
To speed up the lookup you should have this index:
CREATE INDEX idx ON rentals (id_customer, mode, delete_date, id_vehicle);
Or, if most rentals are deleted:
CREATE INDEX idx ON rentals (id_customer, delete_date, mode, id_vehicle);
Well, important is that id_customer comes first, because this is the column for which the lookup shall be made. These indexes are called covering indexes, because they contain all the columns you are using in the query. The first three are for the quick lookup, the id_vehicle then is for the join to vehicles, because you want to ensure that the rental is on a non-deleted vehicle. The rental ID that you need for the join to rentals_billing is silently included in the indexes in MySQL. In other DBMS you would have to explicitely add it.