Documentation for Sharding rolling update (#29666)
This commit is contained in:
parent 2caa560aab
commit 90b79144e5

6 changed files with 36 additions and 32 deletions
@@ -202,19 +202,6 @@ object ClusterEvent {
    def getAllDataCenters: java.util.Set[String] =
      scala.collection.JavaConverters.setAsJavaSetConverter(allDataCenters).asJava

    /**
     * INTERNAL API
     * @return `true` if more than one `Version` among the members, which
     *         indicates that a rolling update is in progress
     */
    @InternalApi private[akka] def hasMoreThanOneAppVersion: Boolean = {
      if (members.isEmpty) false
      else {
        val v = members.head.appVersion
        members.exists(_.appVersion != v)
      }
    }

    /**
     * Replace the set of unreachable datacenters with the given set
     */
@@ -72,7 +72,6 @@ abstract class JoinSeedNodeSpec extends MultiNodeSpec(JoinSeedNodeMultiJvmSpec)
      List(address(ordinary1), address(ordinary2)).foreach { a =>
        cluster.state.members.find(_.address == a).get.appVersion should ===(Version("2.0"))
      }
      cluster.state.hasMoreThanOneAppVersion should ===(true)

      enterBarrier("after-2")
    }
@@ -105,7 +105,6 @@ class ClusterSpec extends AkkaSpec(ClusterSpec.config) with ImplicitSender {
      awaitAssert(clusterView.status should ===(MemberStatus.Up))
      clusterView.self.appVersion should ===(Version("1.2.3"))
      clusterView.members.find(_.address == selfAddress).get.appVersion should ===(Version("1.2.3"))
      clusterView.state.hasMoreThanOneAppVersion should ===(false)
    }

    "reply with InitJoinAck for InitJoin after joining" in {
@@ -39,10 +39,34 @@ Additionally you can find advice on @ref:[Persistence - Schema Evolution](../per

## Cluster Sharding

During a rolling update, sharded entities receiving traffic may be moved during @ref:[shard rebalancing](../typed/cluster-sharding-concepts.md#shard-rebalancing),
to an old or new node in the cluster, based on the pluggable allocation strategy and settings.
When an old node is stopped the shards that were running on it are moved to one of the
other old nodes remaining in the cluster. The `ShardCoordinator` is itself a cluster singleton.
During a rolling update, sharded entities receiving traffic may be moved, based on the pluggable allocation
strategy and settings. When an old node is stopped the shards that were running on it are moved to one of the
other remaining nodes in the cluster when messages are sent to those shards.

To make rolling updates as smooth as possible there is a configuration property that defines the version of the
application. This is used by rolling update features to distinguish between old and new nodes. For example,
the default `LeastShardAllocationStrategy` avoids allocating shards to old nodes during a rolling update.
The `LeastShardAllocationStrategy` sees that there is a rolling update in progress when there are members with
different configured `app-version`.

To make use of this feature you need to define the `app-version` and increase it for each rolling update.

```
akka.cluster.app-version = 1.2.3
```
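
As a sketch of one way to supply the version at startup instead of hard-coding it in `application.conf` (the
`APP_VERSION` environment variable and the `"MyCluster"` system name below are illustrative, not part of Akka):

```scala
import akka.actor.ActorSystem
import com.typesafe.config.ConfigFactory

// Illustrative: take the application version from the deployment environment so that
// each rolling update ships with an increased akka.cluster.app-version.
val appVersion = sys.env.getOrElse("APP_VERSION", "1.2.3")

val config = ConfigFactory
  .parseString(s"akka.cluster.app-version = $appVersion")
  .withFallback(ConfigFactory.load())

val system = ActorSystem("MyCluster", config)
```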

To understand which nodes are old and which are new, it compares the version numbers using normal
version number conventions; see @apidoc[akka.util.Version] for more details.
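
For illustration, a few such comparisons using @apidoc[akka.util.Version] (a minimal sketch, assuming `Version`
is comparable as its API documentation describes):

```scala
import akka.util.Version

// Versions are compared using normal version number conventions.
assert(Version("1.2.3").compareTo(Version("1.2.4")) < 0)  // 1.2.4 is newer than 1.2.3
assert(Version("1.2.3").compareTo(Version("2.0.0")) < 0)  // 2.0.0 is newer than 1.2.3
assert(Version("1.2.3").compareTo(Version("1.2.3")) == 0) // same version
```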

Rebalance is also disabled during rolling updates, since shards from stopped nodes are supposed to be
started on the new nodes anyway. Messages to shards that were stopped on the old nodes will allocate the
corresponding shards on the new nodes, without waiting for rebalance actions.

You should also enable the @ref:[health check for Cluster Sharding](../typed/cluster-sharding.md#health-check) if
you use Akka Management. The readiness check will delay incoming traffic to the node until Sharding has been
initialized and can accept messages.
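
A rough sketch of wiring this up with Akka Management (the entity type names and the `"MyCluster"` system name
are illustrative; the `akka.cluster.sharding.healthcheck.names` setting is the one shown in the Cluster Sharding
health check documentation):

```scala
import akka.actor.ActorSystem
import akka.management.scaladsl.AkkaManagement
import com.typesafe.config.ConfigFactory

// Register the sharded entity type names that the readiness check should wait for.
val config = ConfigFactory
  .parseString("""akka.cluster.sharding.healthcheck.names = ["counter-1", "HelloWorld"]""")
  .withFallback(ConfigFactory.load())

val system = ActorSystem("MyCluster", config)

// Start the Akka Management HTTP endpoint that serves the readiness check used to
// delay incoming traffic until Sharding for the listed entity types has been
// initialized, as described above.
AkkaManagement(system).start()
```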

The `ShardCoordinator` is itself a cluster singleton.
To minimize downtime of the shard coordinator, see the strategies about @ref[ClusterSingleton](#cluster-singleton) rolling updates below.

A few specific changes to sharding configuration require @ref:[a full cluster restart](#cluster-sharding-configuration-change).
@@ -54,6 +78,9 @@ it is recommended to upgrade the oldest node last. This way cluster singletons a
Otherwise, in the worst case cluster singletons may be migrated from node to node which requires coordination and initialization
overhead several times.

[Kubernetes Deployments](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/) with `RollingUpdate`
strategy will roll out updates in this preferred order, from newest to oldest.

## Cluster Shutdown

### Graceful shutdown
@@ -82,20 +82,10 @@ persistent (durable), e.g. with @ref:[Persistence](persistence.md) (or see @ref:
location.

The logic that decides which shards to rebalance is defined in a pluggable shard
allocation strategy. The default implementation `ShardCoordinator.LeastShardAllocationStrategy`
picks shards for handoff from the `ShardRegion` with most number of previously allocated shards.
They will then be allocated to the `ShardRegion` with least number of previously allocated shards,
i.e. new members in the cluster.
allocation strategy. The default implementation `LeastShardAllocationStrategy` allocates new shards
to the `ShardRegion` (node) with least number of previously allocated shards.
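
As a rough illustration of that allocation rule (a simplified sketch, not the actual `LeastShardAllocationStrategy`
implementation or its API), a new shard goes to the region that currently hosts the fewest shards:

```scala
// Pick the region with the fewest previously allocated shards for a new shard.
def pickRegionForNewShard[Region](currentAllocations: Map[Region, Set[String]]): Region =
  currentAllocations.minBy { case (_, shards) => shards.size }._1

// Example: region "b" hosts fewer shards than region "a", so the new shard goes to "b".
val allocations = Map("a" -> Set("shard-1", "shard-2"), "b" -> Set("shard-3"))
assert(pickRegionForNewShard(allocations) == "b")
```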

For the `LeastShardAllocationStrategy` there is a configurable threshold (`rebalance-threshold`) of
how large the difference must be to begin the rebalancing. The difference between number of shards in
the region with most shards and the region with least shards must be greater than the `rebalance-threshold`
for the rebalance to occur.

A `rebalance-threshold` of 1 gives the best distribution and is therefore typically the best choice.
A higher threshold means that more shards can be rebalanced at the same time instead of one-by-one.
That has the advantage that the rebalance process can be quicker but has the drawback that the
number of shards (and therefore load) between different nodes may be significantly different.
See also @ref:[Shard allocation](cluster-sharding.md#shard-allocation).
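
As a small numeric illustration of that threshold rule (a simplified sketch, not the actual implementation):

```scala
// Rebalancing begins only when the difference between the region with the most shards
// and the region with the fewest shards is greater than rebalance-threshold.
def shouldRebalance(shardCounts: Iterable[Int], rebalanceThreshold: Int): Boolean =
  shardCounts.nonEmpty && (shardCounts.max - shardCounts.min) > rebalanceThreshold

// With rebalance-threshold = 1: a difference of 2 triggers a rebalance, a difference of 1 does not.
assert(shouldRebalance(List(3, 1), rebalanceThreshold = 1))
assert(!shouldRebalance(List(2, 1), rebalanceThreshold = 1))
```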

### ShardCoordinator state
@@ -486,6 +486,8 @@ Monitoring of each shard region is off by default. Add them by defining the enti
akka.cluster.sharding.healthcheck.names = ["counter-1", "HelloWorld"]
```

See also additional information about how to make @ref:[smooth rolling updates](../additional/rolling-updates.md#cluster-sharding).

## Inspecting cluster sharding state

Two requests to inspect the cluster state are available: