Search code examples
gemfirespring-data-gemfire

Gemfire LIKE Query with Indexes


We have a scenario to evaluate LIKE search for a key (with limit 100) which is indexed (range). The query uses the index, but the performance of the query varies based on the number of matches the 'key' returns.

i.e if the search is more specific, the query takes longer and if the search is generic (and has more results) it returns faster (may be because it got first 100 faster). The results are ranging from 1ms to 5 minutes for a replicated-region with 400k records. e.g query select * from /REGION where field like '%SEARCH_STRING%' Interestingly, the % in the beginning causes the issue. If we just remove that, it returns in milli-seconds. In either case the 'indexesUsed' is returning the correct index.

It looks like we're missing something fundamental with the indexing or there is a weird behavior with the indexing.

Note: Gemfire version: 8.2.0, Spring-data: 1.8.5. This issue is reproducible with query directly in gfsh too. So not related to spring-data layer.


Solution

  • The prefix % is causing you to scan the entire index. When you remove the prefix %, you are actually using the index to jump to the starting point (of the range index) and scan from there.

    Another thing I noticed in your query is that you are using "select *". Do you really need to serialize the entire record back? You will get better performance if you enable PDX and just select the fields that you need.

    You reported that it's using the index even though it's effectively doing a scan. If you don't have PDX enabled and it's really not using the index, deserializing all the records could be a problem (since you have select *) or the JVM is constrained for memory due to the scan.

    Wes Williams

    P.S. An effective way to also get help directly from the devs is to post the question on the geode user list.