Elasticsearch 6.0 Removal of mapping types - Alternatives


Background

I am migrating my ES index to ES version 6. I am currently stuck because ES6 removed the use of the "_type" field.

Old Implementation (ES2)

My software has many users (>100K). Each user has at least one document in ES. So, the hierarchy looks like this:

INDEX  ->  TYPE      -> Document
myindex->  user-123  -> document-1

The key point here is that with this structure I can easily remove all the documents of a specific user.

DELETE /myindex/user-123

(This deletes all the documents of a specific user with a single command.)

The problem

Multiple "_type" values per index are no longer supported in ES6.

Possible solution

Instead of using _type, use the user ID as the index name. So my indices will look like:

"user-123" -> "static-name" -> document

Deleting a user is done by deleting the index (instead of deleting the type, as in the previous implementation).

Questions:

  • My first worry is about the number of indices and performance: is having around 1M indices acceptable in terms of performance? Keep in mind that I have to search them frequently.
  • Most of my users have only a small number of documents stored in ES. Does it make sense to hold a shard, which should be expensive, for < 10 documents?
  • Does my data architecture sound reasonable to you?

Any other tips are welcome! Thanks.


Solution

  • I would not have one index per user. Every index needs at least one shard of its own, so it's a waste of resources, especially if there are only 10 docs per user.

    What I would do instead is to use filtered aliases, one per user.

    So the index would be named users and the type would be a static name, e.g. doc. For user 123, all of that user's documents would be stored under users/doc (xyz below stands for a document id), and each document needs to carry the user id, e.g.

    PUT users/doc/xyz
    {
       ...
       "userId": 123,
       ...
    }
    

    Then you can define a filtered alias for all documents of user 123, like this:

    POST /_aliases
    {
        "actions" : [
            {
                "add" : {
                     "index" : "users",
                     "alias" : "user-123",
                     "filter" : { "term" : { "userId" : "123" } }
                }
            }
        ]
    }
    

    If you need to delete all documents of user 123, then you can simply do it like this:

    POST user-123/_delete_by_query?q=*
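
The alias-based approach above can be sketched as a small Python helper that only builds the request bodies, without sending anything to a cluster. The function names (alias_name, add_alias_action, delete_user_requests) are hypothetical and chosen for illustration. Removing the alias after deleting the documents is an assumption added here, not part of the answer itself. The delete request uses an explicit match_all body, which should be equivalent to the q=* form shown above.

```python
import json

INDEX = "users"  # index name taken from the answer above


def alias_name(user_id):
    """Alias naming convention from the answer: user-<id>."""
    return f"user-{user_id}"


def add_alias_action(user_id):
    """Build the body for POST /_aliases that adds a filtered alias for one user."""
    return {
        "actions": [
            {
                "add": {
                    "index": INDEX,
                    "alias": alias_name(user_id),
                    # Only documents whose userId matches are visible through the alias.
                    "filter": {"term": {"userId": user_id}},
                }
            }
        ]
    }


def delete_user_requests(user_id):
    """Return (method, path, body) tuples: delete the user's docs, then drop the alias.

    Dropping the alias is an assumed cleanup step, not something the answer prescribes.
    """
    alias = alias_name(user_id)
    return [
        ("POST", f"/{alias}/_delete_by_query", {"query": {"match_all": {}}}),
        ("POST", "/_aliases", {"actions": [{"remove": {"index": INDEX, "alias": alias}}]}),
    ]


if __name__ == "__main__":
    # Print the requests in a form that can be pasted into a REST console.
    print("POST /_aliases")
    print(json.dumps(add_alias_action(123), indent=2))
    for method, path, body in delete_user_requests(123):
        print(method, path)
        print(json.dumps(body, indent=2))
```

Keeping the alias name and the filter in one place means the search path and the delete path cannot drift apart: anything addressed to user-123 is restricted to documents with that userId.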