There exists a race where a cluster node that is being downed sees
itself as the oldest node (as the other nodes have been removed from its
view) and takes over the singleton manager, sending the real oldest node
into the End state, meaning that cluster singletons never work again.
This fix simply prevents Member events from being given to the Cluster
Singleton Manager FSM during a shutdown, instead relying on SelfExiting
(see the sketch below).
This also hardens the test by not downing the node that the current
sharding coordinator is running on, as well as fixing a bug in the
probes.
* The real issue that should be fixed is that there seems to be a race
between the CS and the ClusterSingleton observing OldestChanged
and terminating the coordinator singleton before the graceful sharding stop is done
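A minimal sketch of the guard described in the fix above, with simplified, assumed names (a SelfExiting stand-in and a plain Actor rather than the real Cluster Singleton Manager FSM):

    import akka.actor.Actor
    import akka.cluster.ClusterEvent.{ MemberEvent, MemberRemoved }

    // Hypothetical stand-in for the SelfExiting notification used by the real code.
    case object SelfExiting

    // Once the node knows it is exiting itself, member-removed events must no
    // longer feed the oldest-node bookkeeping, otherwise the leaving node sees
    // everyone else removed, believes it is the oldest and takes over.
    class OldestTracker extends Actor {
      private var selfExited = false

      def receive: Receive = {
        case SelfExiting =>
          selfExited = true
        case _: MemberRemoved if selfExited =>
          () // ignore: our own shrinking view must not promote us to oldest
        case _: MemberEvent =>
          () // normal bookkeeping of who the current oldest member is goes here
      }
    }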
* Revert "fix entityPropsFactory id param, #21809"
This reverts commit cd7eae28f6.
* Revert "Merge pull request #24058 from talpr/talpr-24053-add-entity-id-to-sharding-props"
This reverts commit 8417e70460, reversing
changes made to 22e85f869d.
AFAICT there was nothing ensuring the order of messages sent to the
shard and the region, so first check that the passivation has happened
before sending another add in the test (see the sketch below).
Refs #24013
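A sketch of the test-side idea, with assumed message and probe wiring (not the actual spec code): observe that the entity has really stopped before sending the next message, instead of relying on any shard/region ordering.

    import akka.actor.ActorRef
    import akka.testkit.TestProbe

    // Hypothetical helper: `entity` was obtained earlier in the test (e.g. via
    // lastSender) and `nextMessage` is whatever "add" the test sends next.
    def sendAfterPassivation(probe: TestProbe, entity: ActorRef, region: ActorRef, nextMessage: Any): Unit = {
      probe.watch(entity)
      probe.expectTerminated(entity) // the passivation has actually completed
      region ! nextMessage           // only now send the next add
    }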
* looks like the ActorSystem is shut down when leaving
* Included in MultiNodeSpec, i.e. all multi-node tests:
akka.coordinated-shutdown.terminate-actor-system = off
  akka.coordinated-shutdown.run-by-jvm-shutdown-hook = off
* Having maxSimultaneousRebalance > rebalanceThreshold in LeastShardAllocationStrategy caused shard "flapping" (deallocation of excess shards followed by their immediate re-allocation on the same node)
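For reference, the related settings; keeping max-simultaneous-rebalance at or below rebalance-threshold avoids the flapping (values below are only illustrative):

    import com.typesafe.config.ConfigFactory

    // Illustrative values: shards given up in one rebalance round should not
    // immediately be re-allocated back to the same node.
    val shardingConfig = ConfigFactory.parseString("""
      akka.cluster.sharding.least-shard-allocation-strategy {
        rebalance-threshold = 3
        max-simultaneous-rebalance = 3
      }
    """)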
* since the ordering can change based on the member's status, it's not
  possible to use the ordinary `-` operator for removal
* similar issue in a few places where ageOrdering was used
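Sketch of the pitfall and the workaround (simplified; the real code deals with Member sets ordered by ageOrdering): removal via `-` relies on the ordering locating the element, which can fail if a compared field such as the status has changed, so removal is done by the stable unique address instead.

    import scala.collection.immutable.SortedSet
    import akka.cluster.Member

    // If the stored copy of a member and the copy we try to remove compare
    // differently (e.g. the status changed in between), `members - m` may not
    // find the element. Filtering by the unique address always works.
    def removeMember(members: SortedSet[Member], m: Member): SortedSet[Member] =
      members.filterNot(_.uniqueAddress == m.uniqueAddress)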
* Sharding only within own team (coordinator is singleton)
* the ddata Replicator used by Sharding must also be only within own team
* added support for a Set of roles in the ddata Replicator so that it can
  be used by sharding to specify role + team (see the sketch below)
* Sharding proxy can route to sharding in another team
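A minimal sketch of the role-set idea (the helper name is an assumption, not the actual Replicator internals): a node only takes part in the sharding Replicator if it carries all of the configured roles, e.g. both the sharding role and the team.

    // A member qualifies for replication only if it has every required role,
    // which is what restricts the sharding Replicator to the own team.
    def participates(memberRoles: Set[String], requiredRoles: Set[String]): Boolean =
      requiredRoles.subsetOf(memberRoles)

    // participates(Set("sharding", "team-a"), Set("sharding", "team-a")) == true
    // participates(Set("sharding", "team-b"), Set("sharding", "team-a")) == false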
* Test case covering changing the shard id extractor with remember-entities (see the sketch below)
* This should do the trick
* Feedback addressed
* Docs and migration guide mention
* Correct logic to persist that an entity has moved off of the shard
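Sketch of the kind of extractor the test above changes (the envelope type is a placeholder): with remember-entities the extractor must also cover ShardRegion.StartEntity, and changing the shard id function changes which shard a remembered entity is re-created on.

    import akka.cluster.sharding.ShardRegion

    // Placeholder message type; the real test uses its own messages.
    final case class EntityEnvelope(id: String, payload: Any)

    // StartEntity is what the shards send when re-creating remembered entities,
    // so it has to be routed with the same shard id function.
    def extractShardId(numberOfShards: Int): ShardRegion.ExtractShardId = {
      case EntityEnvelope(id, _)       => (math.abs(id.hashCode) % numberOfShards).toString
      case ShardRegion.StartEntity(id) => (math.abs(id.hashCode) % numberOfShards).toString
    }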
* when using remember entities with ddata mode the set of
  shards was not saved in durable storage and therefore the
  remembered entities were not loaded until the first message
  was sent to the shard
* the coordinator stores the set of shards in a durable GSet
* loaded when the coordinator is started and added to the State,
  the rest is already taken care of via the unallocatedShards Set in
  the State
* when new shards are allocated the durable GSet is updated if it
doesn't already contain the shard identifier
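A sketch of how such a durable GSet of shard ids can be maintained with Distributed Data (key name and timeouts are illustrative, not the coordinator's actual internals):

    import scala.concurrent.duration._
    import akka.actor.ActorRef
    import akka.cluster.ddata.{ GSet, GSetKey }
    import akka.cluster.ddata.Replicator.{ Get, ReadMajority, Update, WriteMajority }

    object AllShardsStore {
      // To survive a full cluster restart the key must be listed under
      // akka.cluster.distributed-data.durable.keys.
      val AllShardsKey: GSetKey[String] = GSetKey[String]("all-shards")

      // Ask the replicator for the remembered shard ids when the coordinator
      // starts; the GetSuccess/NotFound reply goes back to `requester`.
      def load(replicator: ActorRef)(implicit requester: ActorRef): Unit =
        replicator ! Get(AllShardsKey, ReadMajority(5.seconds))

      // Record a newly allocated shard id; a GSet is grow-only, so adding an
      // already-known id changes nothing.
      def add(replicator: ActorRef, shardId: String)(implicit requester: ActorRef): Unit =
        replicator ! Update(AllShardsKey, GSet.empty[String], WriteMajority(5.seconds))(_ + shardId)
    }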
* Lazy init of LmdbDurableStore, #22759
* to avoid creating files (and initializing db) when not needed,
e.g. cluster sharding that is not using remember entities
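The lazy-init idea in a standalone sketch (not the actual LmdbDurableStore code; the lmdbjava setup is simplified): nothing is created on disk until the first request touches the environment.

    import java.io.File
    import org.lmdbjava.Env

    class LazyDurableStore(dir: File) {
      // The LMDB environment, and with it the data files on disk, only comes
      // into existence when `env` is first accessed, e.g. on the first store.
      private lazy val env: Env[java.nio.ByteBuffer] = {
        dir.mkdirs()
        Env.create().setMapSize(100L * 1024 * 1024).setMaxDbs(1).open(dir)
      }

      def store(key: String, payload: Array[Byte]): Unit = {
        val e = env // first access triggers the actual LMDB setup
        // ... the real store would open a Dbi and write inside a transaction ...
      }
    }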
* enable MiMa against 2.5.0
* use OptionVal instead
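For context, akka.util.OptionVal is an internal Akka alternative to Option; a simplified illustration of the idea (not the real implementation): a value class over a possibly-null reference, so the common "present" case avoids allocating a Some.

    // Simplified OptionVal-like value class: no wrapper allocation on the happy path.
    final class OptVal[A <: AnyRef](val x: A) extends AnyVal {
      def isEmpty: Boolean = x eq null
      def isDefined: Boolean = x ne null
      def getOrElse(default: A): A = if (x eq null) default else x
    }

    object OptVal {
      def some[A <: AnyRef](value: A): OptVal[A] = new OptVal(value)
      def none[A <: AnyRef]: OptVal[A] = new OptVal[A](null.asInstanceOf[A])
    }

    // e.g. a hot-path field of type Option[ActorRef] can become OptVal[ActorRef]
    // to save one allocation per update while keeping an Option-like API.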
Changed `akka.cluster.sharding.distributed-data.max-data-elements` to `akka.cluster.sharding.distributed-data.max-delta-elements`.