Merge pull request #25405 from chbatey/cluster-sharding-docs

Break up cluster sharding docs into sections
Patrik Nordwall 2018-08-21 20:37:12 +02:00 committed by GitHub
commit 3fcfb8d699


@ -132,7 +132,7 @@ tutorial named [Akka Cluster Sharding with Scala!](https://github.com/typesafehu
The `ShardRegion` actor is started on each node in the cluster, or group of nodes
tagged with a specific role. The `ShardRegion` is created with two application specific
functions to extract the entity identifier and the shard identifier from incoming messages.
A `Shard` is a group of entities that will be managed together. For the first message in a
specific shard the `ShardRegion` requests the location of the shard from a central coordinator,
the `ShardCoordinator`.
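
As a minimal sketch of the two extraction functions using the classic Scala API: the `Counter` entity type, the `EntityEnvelope` wrapper, and the shard count of 100 are illustrative assumptions, not part of the library.

```scala
import akka.actor.{ActorRef, ActorSystem, Props}
import akka.cluster.sharding.{ClusterSharding, ClusterShardingSettings, ShardRegion}

object CounterSharding {
  // Hypothetical envelope carrying the entity identifier alongside the payload.
  final case class EntityEnvelope(id: Long, payload: Any)

  // Extracts the entity identifier and the message to forward to the entity actor.
  val extractEntityId: ShardRegion.ExtractEntityId = {
    case EntityEnvelope(id, payload) => (id.toString, payload)
  }

  val numberOfShards = 100 // illustrative value

  // Maps a message to a shard identifier; entities with the same shard id are managed together.
  val extractShardId: ShardRegion.ExtractShardId = {
    case EntityEnvelope(id, _) => (id % numberOfShards).toString
  }

  // Starts the ShardRegion for the hypothetical "Counter" entity type and returns its ActorRef.
  def startCounterRegion(system: ActorSystem, counterProps: Props): ActorRef =
    ClusterSharding(system).start(
      typeName = "Counter",
      entityProps = counterProps,
      settings = ClusterShardingSettings(system),
      extractEntityId = extractEntityId,
      extractShardId = extractShardId)
}
```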
@ -148,22 +148,38 @@ shard incoming messages for that shard are buffered and later delivered when the
shard home is known. Subsequent messages to the resolved shard can be delivered
to the target destination immediately without involving the `ShardCoordinator`.
### Scenarios
Once a `Shard`'s location is known, `ShardRegion`s send messages to it directly. The scenarios
below describe how that state is reached. In the scenarios the following notation is used:
* `SC` - ShardCoordinator
* `M#` - Message 1, 2, 3, etc
* `SR#` - ShardRegion 1, 2, 3, etc
* `S#` - Shard 1, 2, 3, etc
* `E#` - Entity 1, 2, 3, etc. An entity refers to an Actor managed by Cluster Sharding.
Where `#` is a number used to distinguish between instances, as there are multiple of each in the cluster.
#### Scenario 1: Message to an unknown shard that belongs to the local ShardRegion
1. Incoming message `M1` to `ShardRegion` instance `SR1`.
2. `M1` is mapped to shard `S1`. `SR1` doesn't know about `S1`, so it asks the `SC` for the location of `S1`.
3. `SC` answers that the home of `S1` is `SR1`.
4. `SR1` creates a child actor for the entity `E1` and sends the buffered messages for `S1` to the `E1` child.
5. All incoming messages for `S1` which arrive at `SR1` can be handled by `SR1` without `SC`. It creates entity children as needed, and forwards messages to them.
#### Scenario 2: Message to an unknown shard that belongs to a remote ShardRegion
1. Incoming message `M2` to `ShardRegion` instance `SR1`.
2. `M2` is mapped to `S2`. `SR1` doesn't know about `S2`, so it asks `SC` for the location of `S2`.
3. `SC` answers that the home of `S2` is `SR2`.
4. `SR1` sends buffered messages for `S2` to `SR2`.
5. All incoming messages for `S2` which arrive at `SR1` can be handled by `SR1` without `SC`. It forwards messages to `SR2`.
6. `SR2` receives the message for `S2` and asks `SC`, which answers that the home of `S2` is `SR2`, and we are in Scenario 1 (but for `SR2`).
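
To make the scenarios concrete: an application never talks to `SC` or to a remote `SR#` itself; it sends every message through its locally started `ShardRegion` reference, which does the buffering and routing described above. A minimal sketch, reusing the hypothetical `Counter` type and `EntityEnvelope` from the earlier example:

```scala
// The locally started ShardRegion (or proxy) for the "Counter" entity type.
val counterRegion: ActorRef = ClusterSharding(system).shardRegion("Counter")

// The first message to a shard may trigger a round-trip to the ShardCoordinator...
counterRegion ! EntityEnvelope(42L, "increment")
// ...subsequent messages to the same shard are routed directly, as in step 5 of the scenarios.
counterRegion ! EntityEnvelope(42L, "increment")
```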
### Shard location
To make sure that at most one instance of a specific entity actor is running somewhere
in the cluster, it is important that all nodes have the same view of where the shards
@ -177,6 +193,8 @@ allocation strategy. The default implementation `ShardCoordinator.LeastShardAllo
allocates new shards to the `ShardRegion` with the least number of previously allocated shards.
This strategy can be replaced by an application specific implementation.
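
As a sketch of passing an allocation strategy explicitly when starting the region (the threshold values and the `Counter` names are illustrative, and the exact `start` overload should be checked against the API version in use):

```scala
import akka.actor.PoisonPill
import akka.cluster.sharding.ShardCoordinator

// The default strategy, configured explicitly; the numbers here are examples only.
val allocationStrategy = new ShardCoordinator.LeastShardAllocationStrategy(
  rebalanceThreshold = 3,
  maxSimultaneousRebalance = 5)

ClusterSharding(system).start(
  typeName = "Counter",
  entityProps = counterProps,
  settings = ClusterShardingSettings(system),
  extractEntityId = extractEntityId,
  extractShardId = extractShardId,
  allocationStrategy = allocationStrategy,
  handOffStopMessage = PoisonPill)
```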
### Shard Rebalancing
To be able to use newly added members in the cluster, the coordinator facilitates rebalancing
of shards, i.e. migrating entities from one node to another. In the rebalance process the
coordinator first notifies all `ShardRegion` actors that a handoff for a shard has started.
@ -202,6 +220,8 @@ i.e. new members in the cluster. There is a configurable threshold of how large
must be to begin the rebalancing. This strategy can be replaced by an application specific
implementation.
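
The threshold of the default strategy is expected to be tunable via configuration; a sketch, assuming the classic keys under `akka.cluster.sharding.least-shard-allocation-strategy` (verify against the `reference.conf` of the Akka version in use):

```scala
import com.typesafe.config.ConfigFactory

// Assumed keys of the default LeastShardAllocationStrategy; values are illustrative.
val rebalanceTuning = ConfigFactory.parseString("""
  akka.cluster.sharding.least-shard-allocation-strategy {
    rebalance-threshold = 3
    max-simultaneous-rebalance = 5
  }
""")
```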
### Shard Coordinator State
The state of shard locations in the `ShardCoordinator` is persistent (durable) with
@ref:[Distributed Data](distributed-data.md) or @ref:[Persistence](persistence.md) to survive failures. When a crashed or
unreachable coordinator node has been removed (via down) from the cluster, a new `ShardCoordinator` singleton
@ -209,16 +229,21 @@ actor will take over and the state is recovered. During such a failure period sh
with known location are still available, while messages for new (unknown) shards
are buffered until the new `ShardCoordinator` becomes available.
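
Which of the two storage options is used is, assuming the classic `state-store-mode` setting, a single configuration switch, for example:

```scala
import com.typesafe.config.ConfigFactory

// "ddata" stores coordinator state with Distributed Data, "persistence" with the journal.
// The key is assumed from the classic sharding configuration.
val stateStoreMode = ConfigFactory.parseString(
  "akka.cluster.sharding.state-store-mode = ddata")
```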
### Message ordering
As long as a sender uses the same `ShardRegion` actor to deliver messages to an entity
actor the order of the messages is preserved. As long as the buffer limit is not reached
messages are delivered on a best-effort basis, with at-most-once delivery semantics,
in the same way as ordinary message sending. Reliable end-to-end messaging, with
at-least-once semantics, can be added by using `AtLeastOnceDelivery` in @ref:[Persistence](persistence.md).
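
A rough sketch of such an at-least-once sender in front of a `ShardRegion`; the command/event classes, the envelope shape, and the convention that the receiving entity replies with `Confirm(deliveryId)` are assumptions made for this example:

```scala
import akka.actor.ActorRef
import akka.persistence.{AtLeastOnceDelivery, PersistentActor}

// Hypothetical protocol; the receiving entity is expected to reply with Confirm(deliveryId).
final case class Send(entityId: Long, payload: Any)
final case class Confirm(deliveryId: Long)
final case class MsgSent(entityId: Long, payload: Any)
final case class MsgConfirmed(deliveryId: Long)
final case class EntityEnvelope(id: Long, payload: Any) // same shape as in the earlier sketch

class ReliableSender(shardRegion: ActorRef) extends PersistentActor with AtLeastOnceDelivery {
  override def persistenceId: String = "reliable-sender"

  override def receiveCommand: Receive = {
    case Send(id, payload) =>
      persist(MsgSent(id, payload)) { e =>
        // deliver() keeps redelivering until confirmDelivery() is called for the deliveryId.
        deliver(shardRegion.path)(deliveryId => EntityEnvelope(e.entityId, (deliveryId, e.payload)))
      }
    case Confirm(deliveryId) =>
      persist(MsgConfirmed(deliveryId))(e => confirmDelivery(e.deliveryId))
  }

  override def receiveRecover: Receive = {
    case MsgSent(id, payload) =>
      deliver(shardRegion.path)(deliveryId => EntityEnvelope(id, (deliveryId, payload)))
    case MsgConfirmed(deliveryId) =>
      confirmDelivery(deliveryId)
  }
}
```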
### Overhead
Some additional latency is introduced for messages targeted to new or previously
unused shards due to the round-trip to the coordinator. Rebalancing of shards may
also add latency. This should be considered when designing the application specific
shard resolution, e.g. to avoid too fine-grained shards. Once a shard's location is known,
the only overhead is sending a message via the `ShardRegion` rather than directly to the entity actor.
<a id="cluster-sharding-mode"></a>
## Distributed Data vs. Persistence Mode