Substring search with ONGR Elastic Bundle for Symfony


I'm using the ONGR ElasticsearchBundle (https://github.com/ongr-io/ElasticsearchBundle) in my Symfony 3 project. I chose this bundle because my project uses Propel.

So far everything works quite well. Now I want to support searching for a substring of a word. For example, there are items named Test01, Test02, Test03, ..., and when I search for Test I get no results. I only get results when I type the whole word, such as Test01.

I've read about wildcard searches, but several answers said that using ngram or edge_ngram would be a better solution.

I've tried to specify it in the configuration as follows:

ongr_elasticsearch:
    analysis:
      filter:
        incremental_filter:
          type: edge_ngram
          min_gram: 3
          max_gram: 10
      analyzer:
        incrementalAnalyzer:
          type: custom
          tokenizer: standard
          filter:
              - lowercase
              - incremental_filter
    managers:
      default:
          index:
            hosts:
                - '%elastic_host%:%elastic_port%'
            index_name: index
            analysis:
              analyzer:
                  - incrementalAnalyzer
              filter:
                  - incremental_filter
          mappings:
              - AppBundle

But I didn't get the results I wanted. Can anyone help me with that? What is the difference between filters and analyzers? I'm using a MultiMatchQuery because I want to search across several fields of different types:

$multiMatchQuery = new MultiMatchQuery(
    [
        'name^12',
        'product_name^8',
        'itemno^18',
        'number^7',
        'category^6',
        'company^4',
        'motor^3',
        'chassis^13',
        'engine^14',
        'description',
    ],
    $term
);
$search->addQuery($multiMatchQuery);
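
For orientation, the multi_match request that this query corresponds to in Elasticsearch's query DSL can be sketched as a plain PHP array. This is an illustration rather than output captured from the bundle, and `$term = 'Test'` is an assumed example value.

```php
<?php
// Assumed example value; in the question the term comes from user input.
$term = 'Test';

// Each entry is "field^boost"; a field without a boost (description)
// uses the default boost of 1.
$body = [
    'query' => [
        'multi_match' => [
            'query'  => $term,
            'fields' => [
                'name^12', 'product_name^8', 'itemno^18', 'number^7',
                'category^6', 'company^4', 'motor^3', 'chassis^13',
                'engine^14', 'description',
            ],
        ],
    ],
];

echo json_encode($body, JSON_PRETTY_PRINT), PHP_EOL;
```

Unless an analyzer is set on the query itself, the search term is analyzed per field with that field's own analyzer, so the field mappings decide whether a partial term such as Test can match.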

I also tried defining the fields as "not_analyzed".

Hope for your help!

Thanks.


Solution

  • Okay, I found the solution. This article describes the problem (specifically for the German language): https://www.elastic.co/guide/en/elasticsearch/guide/current/ngrams-compound-words.html

    So the analyzer needs an ngram filter (it didn't work with an ngram tokenizer). I had also forgotten to assign the analyzer to the property. Now it works.

    ongr_elasticsearch:
        analysis:
          analyzer:
            my_ngram_analyzer:
              type: custom
              tokenizer: standard
              filter:
                - lowercase
                - my_ngram_filter
          filter:
            my_ngram_filter:
              type: ngram
              min_gram: 2
              max_gram: 8
        managers:
          default:
              index:
                hosts:
                    - '%elastic_host%:%elastic_port%'
                index_name: index
                analysis:
                  analyzer:
                      - my_ngram_analyzer
                  filter:
                      - my_ngram_filter
              mappings:
                  - AppBundle
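
    To make the filter settings concrete, here is a minimal sketch in plain PHP of which tokens an ngram filter (min_gram 2, max_gram 8, as above) and an edge_ngram filter (min_gram 3, max_gram 10, as in the question) would emit for the lowercased token test01. The function names `ngramTokens` and `edgeNgramTokens` are hypothetical, and the code is a simplified, ASCII-only model, not Elasticsearch's implementation.

    ```php
    <?php
    /**
     * Simplified model of an ngram token filter: every substring of $token
     * whose length is between $min and $max (byte-based, so ASCII only).
     */
    function ngramTokens(string $token, int $min, int $max): array
    {
        $grams = [];
        $length = strlen($token);
        for ($start = 0; $start < $length; $start++) {
            for ($size = $min; $size <= $max && $start + $size <= $length; $size++) {
                $grams[] = substr($token, $start, $size);
            }
        }
        return $grams;
    }

    /**
     * Simplified model of an edge_ngram token filter: only the prefixes
     * of $token whose length is between $min and $max.
     */
    function edgeNgramTokens(string $token, int $min, int $max): array
    {
        $grams = [];
        for ($size = $min; $size <= min($max, strlen($token)); $size++) {
            $grams[] = substr($token, 0, $size);
        }
        return $grams;
    }

    $ngrams = ngramTokens('test01', 2, 8);      // settings of my_ngram_filter
    $edges  = edgeNgramTokens('test01', 3, 10); // settings of incremental_filter

    echo implode(' ', $ngrams), PHP_EOL;
    echo implode(' ', $edges), PHP_EOL; // tes test test0 test01
    ```

    Both token lists contain test, so a prefix search can succeed with either filter. The ngram filter additionally indexes inner fragments such as st0, at the cost of many more tokens per word.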
    

    The property in the document also needs to be defined with the analyzer (do this for every property that should support substring search):

        /**
         * @var string
         *
         * @ES\Property(name="itemno", type="string", options={"analyzer":"my_ngram_analyzer"})
         */
        public $itemno;
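
    As for the difference between filters and analyzers: an analyzer is the whole chain (one tokenizer followed by any number of token filters), and it only takes effect on fields that reference it. The sketch below models that chain in plain PHP to show why the term test is found only when the field uses the ngram analyzer. All function names are hypothetical stand-ins, the tokenizer is a rough ASCII-only approximation of the standard tokenizer, and none of this is Elasticsearch's real code.

    ```php
    <?php
    /** Tokenizer stand-in: split on anything that is not a letter or digit. */
    function tokenize(string $text): array
    {
        return preg_split('/[^A-Za-z0-9]+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    }

    /** Token filter: lowercase every token. */
    function lowercaseFilter(array $tokens): array
    {
        return array_map('strtolower', $tokens);
    }

    /** Token filter: replace each token with its ngrams of length $min..$max. */
    function ngramFilter(array $tokens, int $min, int $max): array
    {
        $grams = [];
        foreach ($tokens as $token) {
            $length = strlen($token);
            for ($start = 0; $start < $length; $start++) {
                for ($size = $min; $size <= $max && $start + $size <= $length; $size++) {
                    $grams[] = substr($token, $start, $size);
                }
            }
        }
        return $grams;
    }

    /** Analyzer without ngrams: roughly what a field with no custom analyzer gets. */
    function plainAnalyzer(string $text): array
    {
        return lowercaseFilter(tokenize($text));
    }

    /** Analyzer modeled on my_ngram_analyzer: tokenizer, lowercase, ngram (2..8). */
    function ngramAnalyzer(string $text): array
    {
        return ngramFilter(lowercaseFilter(tokenize($text)), 2, 8);
    }

    $plainIndex = plainAnalyzer('Test01'); // tokens stored without the analyzer
    $ngramIndex = ngramAnalyzer('Test01'); // tokens stored with the analyzer

    var_dump(in_array('test', $plainIndex, true)); // bool(false)
    var_dump(in_array('test', $ngramIndex, true)); // bool(true)
    ```

    In this model the plain field stores only test01, which is why defining the filter and analyzer in the configuration alone changed nothing until the property referenced the analyzer.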