diff --git a/akka-docs/src/main/paradox/cluster-sharding.md b/akka-docs/src/main/paradox/cluster-sharding.md
index f0840c96a0..222d439df6 100644
--- a/akka-docs/src/main/paradox/cluster-sharding.md
+++ b/akka-docs/src/main/paradox/cluster-sharding.md
@@ -132,7 +132,7 @@ tutorial named [Akka Cluster Sharding with Scala!](https://github.com/typesafehu
 The `ShardRegion` actor is started on each node in the cluster, or group of nodes
 tagged with a specific role. The `ShardRegion` is created with two application specific
 functions to extract the entity identifier and the shard identifier from incoming messages.
-A shard is a group of entities that will be managed together. For the first message in a
+A `Shard` is a group of entities that will be managed together. For the first message in a
 specific shard the `ShardRegion` requests the location of the shard from a central coordinator,
 the `ShardCoordinator`.
 
@@ -148,22 +148,38 @@ shard incoming messages for that shard are buffered and later delivered when the
 shard home is known. Subsequent messages to the resolved shard can be delivered
 to the target destination immediately without involving the `ShardCoordinator`.
 
-Scenario 1:
+### Scenarios
 
- 1. Incoming message M1 to `ShardRegion` instance R1.
- 2. M1 is mapped to shard S1. R1 doesn't know about S1, so it asks the coordinator C for the location of S1.
- 3. C answers that the home of S1 is R1.
- 4. R1 creates child actor for the entity E1 and sends buffered messages for S1 to E1 child
- 5. All incoming messages for S1 which arrive at R1 can be handled by R1 without C. It creates entity children as needed, and forwards messages to them.
+Once a `Shard`'s location is known, `ShardRegion`s send messages to it directly. Here are the
+scenarios for getting to this state. In the scenarios the following notation is used:
 
-Scenario 2:
+* `SC` - ShardCoordinator
+* `M#` - Message 1, 2, 3, etc.
+* `SR#` - ShardRegion 1, 2, 3, etc.
+* `S#` - Shard 1, 2, 3, etc.
+* `E#` - Entity 1, 2, 3, etc. An entity refers to an Actor managed by Cluster Sharding.
 
- 1. Incoming message M2 to R1.
- 2. M2 is mapped to S2. R1 doesn't know about S2, so it asks C for the location of S2.
- 3. C answers that the home of S2 is R2.
- 4. R1 sends buffered messages for S2 to R2
- 5. All incoming messages for S2 which arrive at R1 can be handled by R1 without C. It forwards messages to R2.
- 6. R2 receives message for S2, ask C, which answers that the home of S2 is R2, and we are in Scenario 1 (but for R2).
+Here `#` is a number used to distinguish between instances, as there are multiple of each in the cluster.
+
+#### Scenario 1: Message to an unknown shard that belongs to the local ShardRegion
+
+ 1. Incoming message `M1` to `ShardRegion` instance `SR1`.
+ 2. `M1` is mapped to shard `S1`. `SR1` doesn't know about `S1`, so it asks `SC` for the location of `S1`.
+ 3. `SC` answers that the home of `S1` is `SR1`.
+ 4. `SR1` creates a child actor for the entity `E1` and sends buffered messages for `S1` to the `E1` child.
+ 5. All incoming messages for `S1` which arrive at `SR1` can be handled by `SR1` without `SC`. It creates entity children as needed, and forwards messages to them.
+
+#### Scenario 2: Message to an unknown shard that belongs to a remote ShardRegion
+
+ 1. Incoming message `M2` to `ShardRegion` instance `SR1`.
+ 2. `M2` is mapped to `S2`. `SR1` doesn't know about `S2`, so it asks `SC` for the location of `S2`.
+ 3. `SC` answers that the home of `S2` is `SR2`.
+ 4. `SR1` sends buffered messages for `S2` to `SR2`.
+ 5. All incoming messages for `S2` which arrive at `SR1` can be handled by `SR1` without `SC`. It forwards messages to `SR2`.
+ 6. `SR2` receives the message for `S2`, asks `SC`, which answers that the home of `S2` is `SR2`, and we are in Scenario 1 (but for `SR2`).
+
+
+### Shard Location
 
 To make sure that at most one instance of a specific entity actor is running somewhere
 in the cluster it is important that all nodes have the same view of where the shards
@@ -177,6 +193,8 @@ allocation strategy. The default implementation `ShardCoordinator.LeastShardAllo
 allocates new shards to the `ShardRegion` with least number of previously allocated shards.
 This strategy can be replaced by an application specific implementation.
 
+### Shard Rebalancing
+
 To be able to use newly added members in the cluster the coordinator facilitates rebalancing
 of shards, i.e. migrate entities from one node to another. In the rebalance process the
 coordinator first notifies all `ShardRegion` actors that a handoff for a shard has started.
@@ -202,6 +220,8 @@ i.e. new members in the cluster. There is a configurable threshold of how large
 must be to begin the rebalancing. This strategy can be replaced by an application specific
 implementation.
 
+### Shard Coordinator State
+
 The state of shard locations in the `ShardCoordinator` is persistent (durable) with
 @ref:[Distributed Data](distributed-data.md) or @ref:[Persistence](persistence.md) to survive failures. When a crashed or
 unreachable coordinator node has been removed (via down) from the cluster a new `ShardCoordinator` singleton
@@ -209,16 +229,21 @@ actor will take over and the state is recovered. During such a failure period shards
 with known location are still available, while messages for new (unknown) shards
 are buffered until the new `ShardCoordinator` becomes available.
 
+### Message Ordering
+
 As long as a sender uses the same `ShardRegion` actor to deliver messages to an entity
 actor the order of the messages is preserved. As long as the buffer limit is not reached
 messages are delivered on a best effort basis, with at-most once delivery semantics,
 in the same way as ordinary message sending. Reliable end-to-end messaging, with
 at-least-once semantics can be added by using `AtLeastOnceDelivery` in @ref:[Persistence](persistence.md).
 
+### Overhead
+
 Some additional latency is introduced for messages targeted to new or previously
 unused shards due to the round-trip to the coordinator. Rebalancing of shards may
 also add latency. This should be considered when designing the application specific
-shard resolution, e.g. to avoid too fine grained shards.
+shard resolution, e.g. to avoid too fine grained shards. Once a shard's location is known,
+the only overhead is sending a message via the `ShardRegion` rather than directly.
 
 ## Distributed Data vs. Persistence Mode
 
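For reference, here is a minimal sketch of the "two application specific functions" the `ShardRegion` is created with, and of starting the region on a node, using the classic Cluster Sharding API. The `Counter` entity, its message protocol, and `numberOfShards = 100` are illustrative assumptions, not something defined by this patch:

```scala
import akka.actor.{ Actor, ActorRef, ActorSystem, Props }
import akka.cluster.sharding.{ ClusterSharding, ClusterShardingSettings, ShardRegion }

// Illustrative entity protocol: a counter addressed by a numeric id.
final case class EntityEnvelope(id: Long, payload: Any)
final case class Get(id: Long)
case object Increment

// The entity actor managed by Cluster Sharding (at most one instance per id in the cluster).
class Counter extends Actor {
  private var count = 0
  def receive: Receive = {
    case Increment => count += 1
    case Get(_)    => sender() ! count
  }
}

object CounterShardingSetup {
  // Maps an incoming message to the entity id plus the message to deliver to that entity.
  val extractEntityId: ShardRegion.ExtractEntityId = {
    case EntityEnvelope(id, payload) => (id.toString, payload)
    case msg @ Get(id)               => (id.toString, msg)
  }

  // Shard resolution: a fixed, fairly coarse number of shards.
  val numberOfShards = 100

  // Maps an incoming message to the shard that owns the target entity.
  val extractShardId: ShardRegion.ExtractShardId = {
    case EntityEnvelope(id, _) => (id % numberOfShards).toString
    case Get(id)               => (id % numberOfShards).toString
  }

  // Started on each node (or each node with a given role) that should host entities of this type.
  def startCounterRegion(system: ActorSystem): ActorRef =
    ClusterSharding(system).start(
      typeName = "Counter",
      entityProps = Props[Counter],
      settings = ClusterShardingSettings(system),
      extractEntityId = extractEntityId,
      extractShardId = extractShardId)
}
```

A common rule of thumb is to pick a number of shards around ten times the maximum planned number of nodes, which keeps shard resolution coarse enough to limit coordinator round-trips and rebalancing overhead.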