mysql function caching mariadb deterministic

How to clear MySQL deterministic function result cache?


I have a huge geo database that I frequently need to compare with real-time geo data in order to determine the nearest location for each Latitude:Longitude pair. The location table holds a fair number of rows, but new records are rarely added. Determining the nearest location for millions of real-time data points is painfully slow, even after implementing a rectangular distance comparison algorithm (rather than actually comparing by Haversine distance).

I want to convert this comparison into a DETERMINISTIC function, which I expect to give a real performance boost by returning static results.

However, I want MySQL to reset/rebuild this deterministic result cache every week. That is, I want MySQL to return the same result for a given Latitude:Longitude pair compared against my location table, but only for 7 days. After 7 days there is a good chance I will have added new locations to that table, and I want MySQL to start rebuilding the deterministic function result cache so that it takes the new rows into account, preferably without restarting the MySQL server.

Note: A MariaDB-compatible solution would be really good to have :)

Correction: Please forgive me for using the term "cache" with MySQL. As far as I understand, the result of a deterministic function does not change when all inputs are the same. This led me to think that MySQL does not actually execute the instructions inside the function, but instead returns the previously calculated value for the same set of input values. So it must cache the values somewhere (I don't know where) and behave like a lookup through a list. I think I confused the OPTIMIZER with a CACHE here :(

======== EDIT FOR TECHNICAL CLARIFICATION ========

Table: data (around 4.5 billion records)
ID BIGINT(20) PK AI
Terminal BIGINT(20) NOT NULL <= Foreign key
Latitude FLOAT (8, 5) NULL (indexed)
Longitude FLOAT (8, 5) NULL (indexed)
Location <= Foreign key

Table: location (around 10k records)
ID BIGINT(20) PK AI
Name VARCHAR(250) NOT NULL UNIQUE
Latitude FLOAT (8, 5) NOT NULL (indexed)
Longitude FLOAT (8, 5) NOT NULL (indexed)

Incoming data for the 'data' table arrives in real time at about 1,500 records per second (we have a process that loops indefinitely, once per second). Around 85% of the records contain coordinates, and we are trying to determine the nearest location as the stream is captured.


Solution

  • I think you misunderstand what DETERMINISTIC means as an option to CREATE FUNCTION.

    It does not mean the result of the function is memoized. There is no cache of function results. There is no command to refresh this cache, because the results are not kept.
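
    To make that concrete, here is a minimal sketch, assuming the location table from your question. The function name nearest_location, the sample coordinates, and the squared-degree distance are hypothetical stand-ins, not your actual comparison. Strictly speaking, a function that reads a table is not deterministic at all, but the server does not check the declaration; it simply runs the body on every call.

    ```sql
    -- Hypothetical example. DELIMITER is a mysql client command, not SQL.
    DELIMITER //

    CREATE FUNCTION nearest_location(p_lat DOUBLE, p_lon DOUBLE)
    RETURNS BIGINT
    DETERMINISTIC      -- a declaration only; nothing is memoized because of it
    READS SQL DATA
    BEGIN
      DECLARE v_id BIGINT;
      -- Crude stand-in metric: squared planar distance in degrees.
      SELECT ID INTO v_id
      FROM location
      ORDER BY POW(Latitude - p_lat, 2) + POW(Longitude - p_lon, 2)
      LIMIT 1;
      RETURN v_id;
    END //

    DELIMITER ;

    -- First call: the body runs and scans the location table.
    SELECT nearest_location(23.81, 90.41);

    -- Add a row at exactly those coordinates ('New place' is made up).
    INSERT INTO location (Name, Latitude, Longitude)
    VALUES ('New place', 23.81, 90.41);

    -- Second call with the same arguments: the body runs again,
    -- so the newly inserted row is taken into account.
    SELECT nearest_location(23.81, 90.41);
    ```

    Since each statement executes the function afresh, there is nothing to clear after you add locations, and also nothing saved by declaring it DETERMINISTIC.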

    The meaning of DETERMINISTIC mainly affects binary logging:

    https://mariadb.com/kb/en/create-function/#not-deterministic

    "The [NOT] DETERMINISTIC clause also affects binary logging, because the STATEMENT format can not be used to store or replicate non-deterministic statements."

    That is, a non-deterministic function may return different results if it is executed on a replica, so if you use the function in an SQL statement that modifies data, the binary log format for that event must be ROW to ensure the same change is applied on the replica.
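
    As a rough illustration of the kind of statement this is about, here is a sketch with a hypothetical function jitter() and a hypothetical table samples; neither comes from your schema.

    ```sql
    -- Hypothetical table used only for this illustration.
    CREATE TABLE samples (
      id BIGINT AUTO_INCREMENT PRIMARY KEY,
      v  DOUBLE NOT NULL
    );

    -- Hypothetical function: its result differs from call to call,
    -- so it is declared NOT DETERMINISTIC.
    CREATE FUNCTION jitter() RETURNS DOUBLE
    NOT DETERMINISTIC
    NO SQL
    RETURN RAND();

    -- A data-modifying statement that calls the function. Re-executing
    -- this statement text on a replica could store a different value,
    -- which is why the change itself (ROW format) has to be logged.
    INSERT INTO samples (v) VALUES (jitter());

    -- Inspect which binary log format the session is using.
    SELECT @@binlog_format;
    ```

    None of this involves storing function results for later reuse; the declaration only tells the server whether replaying the statement is safe.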

    There's also a vague reference:

    "The optimizer may choose a faster execution plan if it [is] known that the function is deterministic."

    But no example is given for such a case. It is likely rare for this to make a significant performance difference.

    I don't think this will be an effective performance optimization for your use case.