Elasticsearch not analyzed and lowercase


I'm trying to make a field lowercase and not analyzed in Elasticsearch 5+, so that I can search in lowercase for strings that contain spaces (the values are indexed in mixed case).
Before Elasticsearch v5 we could use an analyzer like this one to accomplish it:

  "settings":{
     "index":{
        "analysis":{
           "analyzer":{
              "analyzer_keyword":{
                 "tokenizer":"keyword",
                 "filter":"lowercase"
              }
           }
        }
     }
  }

This, however, doesn't work for me right now, and I believe the problem is that "string" is deprecated and automatically converted to either keyword or text.
Does anyone know how to accomplish this? I thought about adding a "fields" entry to my mapping along the lines of:

  "fields": {
    "lowercase": {
      "type": "string"
       **somehow convert to lowercase**
    }
  }

This would make working with it slightly more challenging and I have no idea how to convert it to lowercase either.

Below you'll find a test setup that reproduces my exact problem.

Create the index:

{
  "settings":{
     "index":{
        "analysis":{
           "analyzer":{
              "analyzer_keyword":{
                 "tokenizer":"keyword",
                 "filter":"lowercase"
              }
           }
        }
     }
  },
  "mappings":{
     "test":{
        "properties":{
           "name":{
              "analyzer":"analyzer_keyword",
              "type":"string"
           }
        }
     }
  }
}

Add a test record:

{
  "name": "city test"
}

Query that should match:

{
    "size": 20,
    "from": 0,
    "query": {
        "bool": {
            "must": [{
                "bool": {
                    "should": [{
                        "wildcard": {
                            "name": "*city t*"
                        }
                    }]
                }
            }]
        }
    }
}

Solution

  • When creating your index, make sure the analysis section sits directly under settings rather than inside settings > index; otherwise it won't work.

    You also need to use the text data type for your field instead of string. Delete your index, recreate it with the settings below, and it will work.

    {
      "settings": {
        "analysis": {
          "analyzer": {
            "analyzer_keyword": {
              "tokenizer": "keyword",
              "filter": "lowercase"
            }
          }
        }
      },
      "mappings": {
        "test": {
          "properties": {
            "name": {
              "analyzer": "analyzer_keyword",
              "type": "text"
            }
          }
        }
      }
    }
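
To see why a keyword tokenizer plus a lowercase filter lets the wildcard query match across the space, here is a rough Python sketch. It is not Elasticsearch code: the function names are made up for illustration, `standard_like` is only a crude stand-in for the default analyzer of a text field, and `fnmatchcase` only approximates wildcard syntax (it also treats `[...]` specially). It assumes the wildcard pattern is compared as-is, without analysis, against each indexed term.

```python
from fnmatch import fnmatchcase
from typing import List


def analyzer_keyword(value: str) -> List[str]:
    """Imitate the custom analyzer: keyword tokenizer, then lowercase filter."""
    tokens = [value]                      # keyword tokenizer: whole input is one token
    return [t.lower() for t in tokens]    # lowercase filter


def standard_like(value: str) -> List[str]:
    """Crude stand-in for a default text analyzer: lowercase and split on whitespace."""
    return value.lower().split()


def wildcard_matches(pattern: str, terms: List[str]) -> bool:
    """Case-sensitive wildcard match of the raw pattern against each indexed term."""
    return any(fnmatchcase(term, pattern) for term in terms)


if __name__ == "__main__":
    value = "City Test"
    # One lowercased term that still contains the space, so the pattern can span it.
    print(wildcard_matches("*city t*", analyzer_keyword(value)))
    # Two separate terms, neither of which contains "city t".
    print(wildcard_matches("*city t*", standard_like(value)))
```

Under these assumptions the query pattern itself has to be lowercase, since nothing lowercases it at search time.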