elasticsearch elasticsearch-aggregation elasticsearch-dsl elasticsearch-query

Elasticsearch terms aggregation and lowercase values


I am using the following search query to populate an autocomplete dropdown with values based on what the user types.

{
    _source: 'event',
    query: {
        simple_query_string: {
            query: ''+term+'*', // converts to string; adds * to match prefix
            fields: ['event'] 
        }
    },
    size:0,
    track_total_hits: false,
    aggs: {
        filterValues: {
            composite: {
                size: 100,
                sources: [
                    { "filterValue": { "terms": { "field": 'event', "missing_bucket": true } } }
                ],
                after: { 'filterValue': after } // key must match the source name above
            },
        }
    }
}

Field value used for indexing: UYB 4.9.0 AJF 5 Qnihsbm.

Currently, if the user types the first letter u or U, Elasticsearch returns the above value in lowercase: uyb 4.9.0 ajf 5 qnihsbm. How can I keep this case-insensitive matching but get the value back exactly as it was indexed, i.e. UYB 4.9.0 AJF 5 Qnihsbm?

Field mapping

"mappings": {
    "properties": {
        "event": {
            "type": "keyword",
            "normalizer": "normalizer_1"
        },
        .....
    }
}

ES Config

"settings": {
    "analysis": {
        "normalizer": {
            "normalizer_1": {
                "type": "custom",
                "char_filter": [],
                "filter": ["lowercase", "asciifolding"]
            }
        }
    }
},
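
For context, a keyword normalizer runs before the value is indexed, so the terms an aggregation reads from event are already lowercased. The snippet below is only a rough JavaScript approximation of what normalizer_1 does to the sample value. approximateNormalizer is a hypothetical name, and Elasticsearch's asciifolding filter covers far more characters than this sketch does.

```javascript
// Rough client-side approximation of normalizer_1 (lowercase + asciifolding).
// This is NOT Elasticsearch's implementation; it only strips combining accents.
function approximateNormalizer(value) {
    return String(value)
        .normalize('NFD')                 // split accented letters into base + combining mark
        .replace(/[\u0300-\u036f]/g, '')  // drop the combining marks
        .toLowerCase();
}

const indexedValue = 'UYB 4.9.0 AJF 5 Qnihsbm';
console.log(approximateNormalizer(indexedValue)); // uyb 4.9.0 ajf 5 qnihsbm
```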

Solution

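  • The normalizer is applied before the value is indexed, so an aggregation on event can only return the lowercased term. Keep event itself as a plain keyword (no normalizer) so that it holds the value exactly as indexed, and add a normalized sub-field that you search on.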

    "mappings": {
        "properties": {
            "event": {
                "type": "keyword",
                "fields": {
                    "search": {
                        "type": "keyword",
                        "normalizer": "normalizer_1"
                    }
                }
            },
            .....
        }
    }
    

    Your query would then need to run on event.search instead of event:

        simple_query_string: {
            query: ''+term+'*', // converts to string; adds * to match prefix
            fields: ['event.search'] 
        }                      ^
                               |
                           add this
    

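    Everything else can stay the same: the composite aggregation keeps using event, which now returns the value exactly as it was indexed.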
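
Putting the pieces together, the request could be built along these lines. This is a minimal sketch, not a definitive implementation: buildAutocompleteRequest and readSuggestions are hypothetical helper names, and the sketch assumes that undefined means "no paging cursor yet". The response shape (buckets[].key and after_key) follows the composite aggregation's documented output.

```javascript
// Hypothetical helper: builds the request body for one page of suggestions.
function buildAutocompleteRequest(term, after) {
    const composite = {
        size: 100,
        sources: [
            // Aggregate on `event`, which has no normalizer in the new mapping,
            // so bucket keys keep their original case.
            { filterValue: { terms: { field: 'event', missing_bucket: true } } }
        ]
    };
    // Only send `after` when paging; its key must match the source name.
    if (after !== undefined) {
        composite.after = { filterValue: after };
    }
    return {
        _source: 'event',
        query: {
            simple_query_string: {
                query: String(term) + '*',   // prefix match on the normalized sub-field
                fields: ['event.search']
            }
        },
        size: 0,
        track_total_hits: false,
        aggs: { filterValues: { composite } }
    };
}

// Hypothetical helper: extracts the values and the paging cursor from a response body.
function readSuggestions(responseBody) {
    const agg = responseBody.aggregations.filterValues;
    return {
        values: agg.buckets.map((bucket) => bucket.key.filterValue),
        after: agg.after_key ? agg.after_key.filterValue : undefined
    };
}
```

The first call would be buildAutocompleteRequest(term), and each following page would pass the after value returned by readSuggestions for the previous response.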