So I am trying to get some geographical data at the moment from the some 34 thousand records we have.
We stored a records of Person, which include the basics like firstname, surname, address and postcode.
I am wanting to get information about the postcode, and link it with their Longitue and Latitude co-ordinates. For this, i also have a table with australian postcodes, and their Long/Lat coords.
SELECT count(p.postcode) AS `count`, p.postcode, pctlt.Lat, pctlt.`Long`
FROM person AS p
INNER JOIN pcodetolonglat AS pctlt ON pctlt.Pcode = p.postcode
WHERE length(p.postcode)=4
GROUP BY p.postcode
currently, this script takes between one too two minutes to run,
Is there a more efficient way to get the same results?
This is your query:
SELECT count(p.postcode) AS `count`, p.postcode, pctlt.Lat, pctlt.`Long`
FROM person p INNER JOIN
pcodetolonglat pctlt
ON pctlt.Pcode = p.postcode
WHERE length(p.postcode) = 4
GROUP BY p.postcode;
You cannot do much with indexes on this query. An index on person(postcode)
should help. That is the first thing to try. The following is rewrite of your query that eliminates the outer aggregation, replacing it with a subquery:
SELECT pctlt.Pcode, pctlt.Lat, pctlt.`Long`,
(select count(*) from person p where pctlt.Pcode = p.postcode) as `count`
FROM pcodetolonglat pctlt
WHERE length(pctlt.Pcode) = 4 and
exists (select 1 from person p where pctlt.Pcode = p.postcode);
The exists
clause is to mimic the effect of the inner join
.
With an index on person(postcode)
this query may run faster than the original. I don't like replacing group by
with correlated subqueries, but in MySQL it sometimes has better performance.