Documentation for Sharding rolling update (#29666)
parent 2caa560aab
commit 90b79144e5
6 changed files with 36 additions and 32 deletions

@@ -202,19 +202,6 @@ object ClusterEvent {
     def getAllDataCenters: java.util.Set[String] =
       scala.collection.JavaConverters.setAsJavaSetConverter(allDataCenters).asJava
 
-    /**
-     * INTERNAL API
-     * @return `true` if more than one `Version` among the members, which
-     *   indicates that a rolling update is in progress
-     */
-    @InternalApi private[akka] def hasMoreThanOneAppVersion: Boolean = {
-      if (members.isEmpty) false
-      else {
-        val v = members.head.appVersion
-        members.exists(_.appVersion != v)
-      }
-    }
-
     /**
      * Replace the set of unreachable datacenters with the given set
      */

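The check removed in the hunk above is internal API. If application code needs a similar signal, the same comparison can be expressed against the public cluster state. A minimal sketch (not part of Akka's API), assuming the classic `Cluster` extension and that `Member.appVersion` is accessible, as it is from the tests in this commit; `rollingUpdateInProgress` is a hypothetical helper name:

```scala
import akka.actor.ActorSystem
import akka.cluster.Cluster

object RollingUpdateCheck {
  // Hypothetical helper: true when the members report different app-versions,
  // which indicates that a rolling update is in progress.
  def rollingUpdateInProgress(system: ActorSystem): Boolean = {
    val members = Cluster(system).state.members
    members.nonEmpty && members.exists(_.appVersion != members.head.appVersion)
  }
}
```
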
@@ -72,7 +72,6 @@ abstract class JoinSeedNodeSpec extends MultiNodeSpec(JoinSeedNodeMultiJvmSpec)
       List(address(ordinary1), address(ordinary2)).foreach { a =>
         cluster.state.members.find(_.address == a).get.appVersion should ===(Version("2.0"))
       }
-      cluster.state.hasMoreThanOneAppVersion should ===(true)
 
       enterBarrier("after-2")
     }

@@ -105,7 +105,6 @@ class ClusterSpec extends AkkaSpec(ClusterSpec.config) with ImplicitSender {
       awaitAssert(clusterView.status should ===(MemberStatus.Up))
       clusterView.self.appVersion should ===(Version("1.2.3"))
       clusterView.members.find(_.address == selfAddress).get.appVersion should ===(Version("1.2.3"))
-      clusterView.state.hasMoreThanOneAppVersion should ===(false)
     }
 
     "reply with InitJoinAck for InitJoin after joining" in {

@@ -39,10 +39,34 @@ Additionally you can find advice on @ref:[Persistence - Schema Evolution](../per
 
 ## Cluster Sharding
 
-During a rolling update, sharded entities receiving traffic may be moved during @ref:[shard rebalancing](../typed/cluster-sharding-concepts.md#shard-rebalancing),
-to an old or new node in the cluster, based on the pluggable allocation strategy and settings.
-When an old node is stopped the shards that were running on it are moved to one of the
-other old nodes remaining in the cluster. The `ShardCoordinator` is itself a cluster singleton.
+During a rolling update, sharded entities receiving traffic may be moved, based on the pluggable allocation
+strategy and settings. When an old node is stopped, the shards that were running on it are moved to one of the
+other remaining nodes in the cluster when messages are sent to those shards.
+
+To make rolling updates as smooth as possible there is a configuration property that defines the version of the
+application. This is used by rolling update features to distinguish between old and new nodes. For example,
+the default `LeastShardAllocationStrategy` avoids allocating shards to old nodes during a rolling update.
+The `LeastShardAllocationStrategy` detects that a rolling update is in progress when there are members with
+different configured `app-version`.
+
+To make use of this feature you need to define the `app-version` and increase it for each rolling update.
+
+```
+akka.cluster.app-version = 1.2.3
+```
+
+To determine which nodes are old and which are new, the version numbers are compared using normal conventions,
+see @apidoc[akka.util.Version] for more details.
+
+Rebalance is also disabled during rolling updates, since shards from stopped nodes are expected to be
+started on new nodes anyway. Messages to shards that were stopped on the old nodes will allocate the corresponding shards
+on the new nodes, without waiting for rebalance actions.
+
+You should also enable the @ref:[health check for Cluster Sharding](../typed/cluster-sharding.md#health-check) if
+you use Akka Management. The readiness check will delay incoming traffic to the node until Sharding has been
+initialized and can accept messages.
+
+The `ShardCoordinator` is itself a cluster singleton.
 To minimize downtime of the shard coordinator, see the strategies about @ref[ClusterSingleton](#cluster-singleton) rolling updates below.
 
 A few specific changes to sharding configuration require @ref:[a full cluster restart](#cluster-sharding-configuration-change).

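Since `akka.cluster.app-version` is ordinary configuration, it can also be supplied when the `ActorSystem` is created, for example derived from the build. A minimal sketch, assuming the classic `ActorSystem` and Typesafe Config APIs; the version strings and system name are illustrative:

```scala
import akka.actor.ActorSystem
import akka.util.Version
import com.typesafe.config.ConfigFactory

object AppVersionExample extends App {
  // Illustrative value; in practice the version would come from the build.
  val appVersion = "1.2.3"

  // Put akka.cluster.app-version on top of the regular configuration.
  val config = ConfigFactory
    .parseString(s"akka.cluster.app-version = $appVersion")
    .withFallback(ConfigFactory.load())

  val system = ActorSystem("ClusterSystem", config)

  // Versions are compared with the normal conventions of akka.util.Version,
  // so 1.2.3 is considered older than 1.3.0 during a rolling update.
  assert(Version("1.2.3").compareTo(Version("1.3.0")) < 0)
}
```
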
@@ -54,6 +78,9 @@ it is recommended to upgrade the oldest node last. This way cluster singletons a
 Otherwise, in the worst case cluster singletons may be migrated from node to node which requires coordination and initialization
 overhead several times.
 
+[Kubernetes Deployments](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/) with `RollingUpdate`
+strategy will roll out updates in this preferred order, from newest to oldest.
+
 ## Cluster Shutdown
 
 ### Graceful shutdown

@@ -82,20 +82,10 @@ persistent (durable), e.g. with @ref:[Persistence](persistence.md) (or see @ref:
 location.
 
 The logic that decides which shards to rebalance is defined in a pluggable shard
-allocation strategy. The default implementation `ShardCoordinator.LeastShardAllocationStrategy`
-picks shards for handoff from the `ShardRegion` with the most previously allocated shards.
-They will then be allocated to the `ShardRegion` with the least number of previously allocated shards,
-i.e. new members in the cluster.
+allocation strategy. The default implementation `LeastShardAllocationStrategy` allocates new shards
+to the `ShardRegion` (node) with the least number of previously allocated shards.
 
-For the `LeastShardAllocationStrategy` there is a configurable threshold (`rebalance-threshold`) of
-how large the difference must be to begin the rebalancing. The difference between the number of shards in
-the region with most shards and the region with least shards must be greater than the `rebalance-threshold`
-for the rebalance to occur.
-
-A `rebalance-threshold` of 1 gives the best distribution and is therefore typically the best choice.
-A higher threshold means that more shards can be rebalanced at the same time instead of one-by-one.
-That has the advantage that the rebalance process can be quicker, but has the drawback that the
-number of shards (and therefore load) between different nodes may be significantly different.
+See also @ref:[Shard allocation](cluster-sharding.md#shard-allocation).
 
 ### ShardCoordinator state
 

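To make the allocation rule in the new text concrete (a sketch of the idea only, not Akka's actual `LeastShardAllocationStrategy` implementation), allocating a new shard amounts to picking the region that currently hosts the fewest shards:

```scala
object LeastShardsSketch {
  // Sketch of the rule: a new shard goes to the region with the fewest
  // previously allocated shards.
  def regionWithLeastShards[Region](allocations: Map[Region, IndexedSeq[String]]): Region =
    allocations.minBy { case (_, shards) => shards.size }._1

  // Example: "regionB" holds fewer shards, so it receives the next shard.
  val next = regionWithLeastShards(Map("regionA" -> IndexedSeq("s1", "s2"), "regionB" -> IndexedSeq("s3")))
}
```
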
@@ -486,6 +486,8 @@ Monitoring of each shard region is off by default. Add them by defining the enti
 akka.cluster.sharding.healthcheck.names = ["counter-1", "HelloWorld"]
 ```
 
+See also additional information about how to make @ref:[smooth rolling updates](../additional/rolling-updates.md#cluster-sharding).
+
 ## Inspecting cluster sharding state
 
 Two requests to inspect the cluster state are available:

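For context, the names in `akka.cluster.sharding.healthcheck.names` refer to the entity type names that sharding is initialized with. A minimal sketch with the typed Cluster Sharding API, assuming a placeholder behavior; only the name `counter-1` comes from the configuration above:

```scala
import akka.actor.typed.ActorSystem
import akka.actor.typed.scaladsl.Behaviors
import akka.cluster.sharding.typed.scaladsl.{ ClusterSharding, Entity, EntityTypeKey }

object ShardingHealthCheckExample {
  sealed trait Command

  // "counter-1" matches one of the names listed for the health check.
  val CounterTypeKey: EntityTypeKey[Command] = EntityTypeKey[Command]("counter-1")

  def initSharding(system: ActorSystem[_]): Unit = {
    // Placeholder behavior for the sketch; a real entity would handle its messages.
    ClusterSharding(system).init(Entity(CounterTypeKey)(_ => Behaviors.empty[Command]))
  }
}
```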