I have a question regarding the tradeoffs/performance considerations to keep in mind while mapping string fields as both text and keyword vs just one of those.
I have a use case where mapping around 25-30 string fields as both text and keyword would be nice to have, but if there are serious performance considerations, I would drill down and map each field only to the type it will be searched as most often.
I have not been able to find much information online about this, hence asking here.
Elasticsearch version 7.10. Thanks!
The default mappings provided by ES, which map a field as both text and keyword, usually do that because it's convenient: it allows the field to be used in different contexts without having to think too hard about it. It's also a good way of bootstrapping new projects without worrying too much about that aspect until later in the project.
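For reference, here is a minimal sketch of what that default looks like, written as a plain Python dict so it can be inspected without a cluster. The text field with a keyword sub-field capped by ignore_above: 256 is what dynamic mapping generates for a new string field; the field name "title" is only a placeholder for illustration.

```python
import json

# Shape of the mapping that dynamic mapping generates for a new string field:
# an analyzed text field plus a keyword sub-field for exact values.
DEFAULT_STRING_MAPPING = {
    "type": "text",
    "fields": {
        "keyword": {"type": "keyword", "ignore_above": 256},
    },
}

# "title" is a hypothetical field name, not taken from the question.
default_mappings = {"properties": {"title": DEFAULT_STRING_MAPPING}}

print(json.dumps(default_mappings, indent=2))
```

With this mapping, every value is indexed twice: once analyzed (title) and once as a single exact term (title.keyword).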
However, if you're truly serious about your mappings and the performance of your cluster, you should always give as much thought as possible to why you map a field in a certain way.
There are a few basic rules (but your mileage may always vary) in the following (non-exhaustive) list:
- [...] keyword only (and/or wildcard, depending on your search use cases).
- [...] text.
- [...] text, as there is a non-negligible overhead related to indexing text fields during the analysis process.

As said, the above list is non-exhaustive, but it gives you some pointers. The bottom line is that you need to think hard about your data and what you want to do with it. Once you know the use cases you need to support, you'll know how to map your fields. I would never leave a default text/keyword mapping in place if there's no reason for it.
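To make the "decide per field" idea concrete, here is a rough sketch that builds an explicit mapping from a table of use cases instead of relying on the default. The use-case labels ("exact", "full_text", "pattern", "both"), the helper build_mappings and the field names are all made up for illustration, and the assignment of use cases to types is my own assumption about a typical setup, so adapt it to your data.

```python
import copy
from typing import Dict

# Hypothetical use-case labels mapped to Elasticsearch field definitions.
MAPPING_BY_USE_CASE = {
    # exact matching, sorting, aggregations
    "exact": {"type": "keyword"},
    # analyzed full-text search only
    "full_text": {"type": "text"},
    # wildcard/regexp-style searches on exact values (wildcard field type)
    "pattern": {"type": "wildcard"},
    # only when a field really needs both kinds of access
    "both": {
        "type": "text",
        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
    },
}


def build_mappings(field_use_cases: Dict[str, str]) -> dict:
    """Return a mappings body with one explicit definition per field."""
    properties = {}
    for field, use_case in field_use_cases.items():
        if use_case not in MAPPING_BY_USE_CASE:
            raise ValueError(f"unknown use case {use_case!r} for field {field!r}")
        # Copy so that later edits to one field do not leak into the others.
        properties[field] = copy.deepcopy(MAPPING_BY_USE_CASE[use_case])
    return {"properties": properties}


# Hypothetical fields; in practice this table would list all 25-30 of them.
explicit_mappings = build_mappings(
    {"status": "exact", "description": "full_text", "title": "both"}
)
```

The resulting dict can be sent as the mappings section when creating the index. Keeping the table in one place also documents why each field is mapped the way it is.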