Search code examples
mysqlmysql-8.0spatial-queryspatial-index

MySQL 8.0.32 group by POINT() column with unused index slow compared to 5.7


We recently updated from MySQL 5.7 to 8.0.32 and had to rework a database POINT column that previously was very fast with a prefix-based (??) index, which I believe is not supported in MySQL 8.0.x.

I altered the column to be an "SRID 4326"-only column, and gave it a spatial index.

With MySQL 5.7 I never looked into whether the B-Tree index was being used or not, because the queries were all reasonably fast.

Now, running an equivalent query that used to be "fast" is taking ~18 seconds or so. My query is in the format:

SELECT
  ST_Latitude(granularity_30) as lat,
  ST_Longitude(granularity_30) as lng
FROM
  `GeoPriceCache`
WHERE
  MBRContains(ST_GeomFromText('Polygon((
    -64.36807421875 56.379500183529,
    -64.36807421875 18.086774255995,
    -132.79092578125 18.086774255995,
    -132.79092578125 56.379500183529,
    -64.36807421875 56.379500183529
    ))', 4326, 'axis-order=long-lat'), granularity_30)
GROUP BY granularity_30;

The table contains several hundred thousand records, and this should collect most of them, put into ~250 rows after grouping.

I'm not sure if I've unknowingly changed something about my system and this is fully expected, or if I'm missing something.

I'm puzzled why, if running EXPLAIN on the above query, it identifies my granularity_30 column as having a "possible_key", but does not use it.

If I reverse MBRContains() so that the point comes first and the Polygon() comes second, it claims it WILL use the key, but of course, none of my records' single key contains that polygon, so... I can't do that. :)

Edit: DDL for table:

CREATE TABLE `GeoPriceCache` (
  `item_id` bigint unsigned NOT NULL,
  `granularity_30` point NOT NULL /*!80003 SRID 4326 */ COMMENT 'This item''s cluster point when viewing at a moderate zoom level.',
  `granularity_40` point NOT NULL /*!80003 SRID 4326 */ COMMENT 'This item''s cluster point when zoomed in to a high zoom level.',
  `granularity_50` point NOT NULL /*!80003 SRID 4326 */ COMMENT 'This item\\s cluster point when zoomed in to a very high zoom level.',
  UNIQUE KEY `geopricecache_item_id_unique` (`item_id`),
  SPATIAL KEY `geopricecache_granularity_50_spatialindex` (`granularity_50`),
  SPATIAL KEY `geopricecache_granularity_40_spatialindex` (`granularity_40`),
  SPATIAL KEY `geopricecache_granularity_30_spatialindex` (`granularity_30`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb3 COLLATE=utf8mb3_bin

If I use FORCE INDEX (geopricecache_granularity_30_spatialindex) it DOES indeed use the index, but is not faster. (Actually, it might be maybe 1-2 seconds faster? Maybe just an anomaly?)


Solution

  • I ended up asking a [thought to be unrelated] question in dba.stackexchange.com, the solution for which wound up explaining this as well.

    The gist is this:

    It would appear that an index (...at least, this spatial index.) isn't inherently populated (or otherwise cannot be used for some reason), if there are not sufficient resources in the server to do so.

    I modified the following three global variables:

    • max_heap_table_size was set to 64M
    • tmp_table_size was set to 64M
    • innodb_buffer_pool_size was set to 4G (at least for now in my local Docker MySQL server)

    And at the end, the index could (and was) used!