elasticsearch masking

How to do a masked query in Elasticsearch?


For example, I am storing user passport numbers in Elasticsearch. Each one is a string of consecutive letters and digits in the format AADDDDDDD: 2 letters followed by 7 digits.

Users want to search by specifying values at particular positions. For example, I want to find all passport numbers that have 'A' in the first position, '7' in the third position, and '0' in the last position. Something like this:

A-7-----0

How can I build an efficient query for this? Do I need to create a custom analyzer?

So far, I have inserted a space between the characters and then searched by index position, which seems like a costly operation to me.


Solution

  • How efficient does the query need to be? If your data set is not very large, you can try a regexp query: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-regexp-query.html

    Another suggestion is to store each document with an array of symbols and their positions, e.g.:

        {
            "code": [
                {"pos": 1, "symbol": "A"}, {"pos": 2, "symbol": "B"}, ...
            ]
        }
    

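    Then you can use a bool filter and make efficient use of the filter cache. Map the "code" field as a nested type so that each pos/symbol pair is matched together; with the default object mapping, Elasticsearch flattens the array and the pairing between a position and its symbol is lost.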
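
As a rough illustration of both suggestions, here is a minimal Python sketch that turns a mask into the two query bodies. None of this comes from the original answer: the field names "passport" and "code", the 9-character mask with '-' as the wildcard, and the mappings (a keyword field for the regexp query, a nested field with a keyword "symbol" for the bool filter) are all assumptions made for the example.

```python
MASK_LENGTH = 9   # assumed format AADDDDDDD: two letters, then seven digits
WILDCARD = "-"    # assumed placeholder for "any character at this position"


def _validate(mask):
    """Reject masks that do not fit the assumed 9-character format."""
    if len(mask) != MASK_LENGTH:
        raise ValueError(f"mask must be {MASK_LENGTH} characters long")
    for char in mask:
        if char != WILDCARD and not (char.isascii() and char.isalnum()):
            raise ValueError(f"unsupported mask character: {char!r}")


def mask_to_regexp_query(mask, field="passport"):
    """Build a regexp query body; `field` is a hypothetical keyword field."""
    _validate(mask)
    parts = []
    for index, char in enumerate(mask):
        if char == WILDCARD:
            # Unknown position: a letter in the first two slots, a digit after.
            parts.append("[A-Z]" if index < 2 else "[0-9]")
        else:
            parts.append(char)
    return {"regexp": {field: {"value": "".join(parts)}}}


def mask_to_nested_filter(mask, path="code"):
    """Build a bool filter with one nested clause per fixed position.

    `path` is a hypothetical nested field holding {"pos", "symbol"} objects.
    """
    _validate(mask)
    clauses = []
    for position, char in enumerate(mask, start=1):  # positions are 1-based
        if char == WILDCARD:
            continue
        clauses.append({
            "nested": {
                "path": path,
                "query": {
                    "bool": {
                        "filter": [
                            {"term": {f"{path}.pos": position}},
                            {"term": {f"{path}.symbol": char}},
                        ]
                    }
                },
            }
        })
    return {"bool": {"filter": clauses}}
```

Either result can be sent as the "query" part of a search request, for example mask_to_regexp_query("A-7-----0"). The regexp version needs no change to the documents but has to scan terms, while the nested version costs more at index time in exchange for simple term filters at query time.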