I'm working with PostgreSQL (I'm a rookie in the database world) and I'd like to know your opinion on the efficiency of this kind of queries I found in the code I'm working with. These queries have a lot of JOINs, and one of them (bold font) has many rows by request. This forces us to GROUP BY request.id in order to obtain a row by request and a field (bold font) with all this rows data.
I think this kind of queries has to lose lots of time looking for all these maximums, but I can't figure an alternative way of doing this. Any ideas on its efficiency and how to improve it?
SELECT
request.id AS id,
max(request_type.name) AS request_type,
to_char(max(request.timestamp),'DD/mm/YYYY HH24:mi') AS timestamp,
to_char(max(request.timestamp),'YYYY-mm-DD') AS timestamp_filtering,
max(state.name) AS request_state,
max(users.name || ' ' || COALESCE(users.surname,'')) AS create_user,
max(request.id_create_user) AS id_create_user,
max(enterprise.name) AS enterprise,
max(cause_issue.name) AS cause,
max(request_movements.id_request_state) AS id_state,
array_to_string(array_agg(DISTINCT act_code.name || '/' || req_res.act_code), ', ') AS act_code, /* here */
max(revised.code) AS state_revised,
max(request_shipment.warehouse) AS warehouse,
max(req_res.id_warehouse) AS id_warehouse
FROM
request
LEFT JOIN users
ON users.id=request.id_create_user
LEFT JOIN enterprise
ON users.id_enterprise=enterprise.id
LEFT JOIN request_movements
ON request_movements.id=request.id_request_movement
LEFT JOIN request_versions
ON request_versions.id = request_movements.id_version
LEFT JOIN state
ON request_movements.id_request_state=state.id
INNER JOIN request_type
ON request.id_request_type=request_type.id
LEFT JOIN cause_issue
ON request.id_cause_issue=cause_issue.id
LEFT JOIN request_reserve req_res
ON req_res.id_request = request.id /* here */
LEFT JOIN act_code
ON req_res.id_act_code=act_code.id
LEFT JOIN request_shipment
ON (request_shipment.id_request=request.id)
LEFT JOIN warehouse_enterprise
ON (warehouse_enterprise.id = request_shipment.id_warehouse_enterprise)
LEFT JOIN revised
ON (revised.id = request_shipment.id_revised)
WHERE
request.id_request_type = "any_type"
GROUP BY
request.id
The EXPLAIN returns this.
You can much simplify this query by aggregating values in request_reserve
and act_code
before you JOIN
to the big join. This avoids the need for aggregate functions on all the other columns and should generally be much faster for a larger number of rows.
SELECT r.id
,rt.name AS request_type
,to_char(r.timestamp, 'DD/mm/YYYY HH24:mi') AS timestamp
,to_char(r.timestamp, 'YYYY-mm-DD') AS timestamp_filtering
,s.name AS request_state
,u.name || COALESCE(' ' || u.surname, '') AS create_user
,r.id_create_user
,e.name AS enterprise
,c.name AS cause
,rm.id_request_state AS id_state
,rr.act_code
,rd.code AS state_revised
,rs.warehouse
,rr.id_warehouse
FROM request r
LEFT JOIN users u ON u.id = r.id_create_user
LEFT JOIN enterprise e ON e.id = u.id_enterprise
LEFT JOIN request_movements rm ON rm.id = r.id_request_movement
LEFT JOIN request_versions rv ON rv.id = rm.id_version
LEFT JOIN state s ON s.id = rm.id_request_state
JOIN request_type rt ON rt.id = r.id_request_type
LEFT JOIN cause_issue c ON c.id = r.id_cause_issue
LEFT JOIN request_shipment rs ON rs.id_request = r.id
LEFT JOIN warehouse_enterprise w ON w.id = rs.id_warehouse_enterprise
LEFT JOIN revised rd ON rd.id = rs.id_revised
LEFT JOIN (
SELECT rr.id_request, rr.id_warehouse
,array_to_string(array_agg(
DISTINCT a.name || '/' || rr.act_code), ', ') AS act_code
FROM request_reserve rr
LEFT JOIN act_code a ON r.id_act_code = a.id
GROUP BY rr.id_request, rr.id_warehouse
) rr ON rr.id_request = r.id
WHERE r.id_request_type = "any_type"; -- use single quotes for values!
For big queries it is essential that you have a format the human eye can easily parse. Therefore I reformatted before I improved the query. I use table aliases to avoid unwieldy identifiers as much as possible.
Minor improvement to create_user
: no trailing space. If either part of the name can be NULL
, I suggest this to avoid a dangling space:
COALESCE(u.name || ' ' || u.surname, u.name, u.surname)
In PostgreSQL 9.1 or later you could use concat_ws()
.