Understanding Akka cluster sharding

I'm learning Akka sharding module. There is something I don't understand abour Sharding. Let's imagine you want to shard an actor : you have many entites from the same actor distribued on many nodes. Each entity can have its own state, which may differ from another entity.

A client is making a request (sending a message) to your shard actor to get back its status value. This is message is going to be processed by an entity and giving back its value as a result. But if it were treated by another entity the result would be different. But it should be the same because all entites derive from the same actor, shouldn'it ?

Solution

It seems you misunderstand the concept of Akka cluster sharding, let me explain with an example.

Let's say your service is responsible for responding with user profiles to requests. And to gain extremely low latency, you decide to use Akka actors to cache the user profiles in memory rather than having to query DB per request.

If your website only has 10 users and each user profile is just a few KB, you can hold all 10 user profiles in a single actor without issue, and you won't need cluster sharding for sure. However, if you have 10 million users, probably the 10 million user profiles won't fit into a single actor's memory, also it'd expensive if the actor goes down, as it means you need a large DB query to gain these data back from persistence.

In this scenario, cluster sharding is a fit. You will have 10 million Akka actors, distributed across your cluster, and each actor stores only 1 user profile. So GetUserProfile(userProfileId = 123) won't give you different response - it will always be routed to THE actor that holds user profile for the user 123, thus the response will always be the same.

How does the routing work? Check extractShardId and extractEntityId in the doc