Search code examples
mysqlcountinner-joinlarge-data-volumes

MySql search efficiency


So I am trying to get some geographical data at the moment from the some 34 thousand records we have.

We stored a records of Person, which include the basics like firstname, surname, address and postcode.

I am wanting to get information about the postcode, and link it with their Longitue and Latitude co-ordinates. For this, i also have a table with australian postcodes, and their Long/Lat coords.

SELECT count(p.postcode) AS `count`, p.postcode, pctlt.Lat, pctlt.`Long`
FROM person AS p
INNER JOIN pcodetolonglat AS pctlt ON pctlt.Pcode = p.postcode
WHERE length(p.postcode)=4
GROUP BY p.postcode

currently, this script takes between one too two minutes to run,

Is there a more efficient way to get the same results?


Solution

  • This is your query:

    SELECT count(p.postcode) AS `count`, p.postcode, pctlt.Lat, pctlt.`Long`
    FROM person p INNER JOIN
         pcodetolonglat pctlt
         ON pctlt.Pcode = p.postcode
    WHERE length(p.postcode) = 4
    GROUP BY p.postcode;
    

    You cannot do much with indexes on this query. An index on person(postcode) should help. That is the first thing to try. The following is rewrite of your query that eliminates the outer aggregation, replacing it with a subquery:

    SELECT pctlt.Pcode, pctlt.Lat, pctlt.`Long`,
           (select count(*) from person p where pctlt.Pcode = p.postcode) as `count`
    FROM pcodetolonglat pctlt 
    WHERE length(pctlt.Pcode) = 4 and
          exists (select 1 from person p where pctlt.Pcode = p.postcode);
    

    The exists clause is to mimic the effect of the inner join.

    With an index on person(postcode) this query may run faster than the original. I don't like replacing group by with correlated subqueries, but in MySQL it sometimes has better performance.