amazon-web-services elasticsearch routes

How does Elasticsearch determine shard selection when both a routing key and custom ID are provided for data storage?


In Elasticsearch, when both a routing key and a custom ID are provided while storing data, how does Elasticsearch decide which shard to store the document on? I'm curious about the underlying logic Elasticsearch uses to select the shard in this scenario. Does the custom document ID influence shard selection alongside the routing key, or is only the routing key used to decide the shard?


Solution

  • If you don't provide a routing value, then the ID you provide is used to create one dynamically, so that the target shard can be identified in a deterministic way.

    If you do provide a routing value, then that's the value that will be taken into account to determine the target shard on which to index the document.

    You can see in the source code below that if the routing value is not provided, the ID is used in its place:

    protected int shardId(String id, @Nullable String routing) {
        return hashToShardId(effectiveRoutingToHash(routing == null ? id : routing));
    }
    

    That's pretty much it, there's nothing more to read into this.
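
To make the fallback concrete, here is a minimal, self-contained sketch of the same decision. It is not Elasticsearch's implementation: the class name `ShardRoutingSketch`, the `numberOfShards` parameter, the CRC32 stand-in hash, and the plain modulo are all assumptions made for illustration. Only the `routing == null ? id : routing` choice comes from the snippet above.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical illustration class; not part of Elasticsearch.
final class ShardRoutingSketch {

    // Same fallback as in the snippet above: the ID is used only when no routing value is given.
    static String effectiveRouting(String id, String routing) {
        return routing == null ? id : routing;
    }

    // Assumption: CRC32 and a plain modulo stand in for Elasticsearch's own hashing and shard mapping.
    static int shardId(String id, String routing, int numberOfShards) {
        CRC32 crc = new CRC32();
        crc.update(effectiveRouting(id, routing).getBytes(StandardCharsets.UTF_8));
        // getValue() is a non-negative long, so the remainder is always in [0, numberOfShards).
        return (int) (crc.getValue() % numberOfShards);
    }

    public static void main(String[] args) {
        int shards = 5;
        // Two different IDs with the same routing value land on the same shard.
        System.out.println(shardId("doc-1", "user-42", shards));
        System.out.println(shardId("doc-2", "user-42", shards));
        // Without a routing value, the ID alone decides the shard.
        System.out.println(shardId("doc-1", null, shards));
    }
}
```

One practical consequence: because the ID plays no part in shard selection once a routing value is supplied, a later get, update, or delete by ID should pass the same routing value, otherwise the request may be sent to a different shard than the one holding the document.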