* Testing of singleton leaving
* gossip optimization, exiting change to two oldest per role
* hardening ClusterSingletonManagerIsStuck restart, increase ClusterSingletonManagerIsStuck
* Introduce 'MemberDowned' member event
Compatiblity note: MemberEvent is a sealed trait, so it is debatable whether
it is acceptable to introduce a new member.
* Be more conservative (more like leaving), add test
The test failed timing out waiting for a cluster to remove a member. It
was nearly done. This change uses the common configuration used for
cluster tests to speed up membership changes rather than increasing the
timeout.
Each build is now over 40mb logs.
A lot of DEBUG logging was left on for test failures that have been
fixed. Added an issue # for ones that are still valid or if if it on
as the test verifies debug
* [#24294] Fix DistributedPubSubMediator not unsubscribing actors from topics when they terminate.
* Removed rogue "with DeadLetterProbe"
* Update DistributedPubSubMediator.scala
* Always add sender of GetContacts to client interactions
* Handover clients when receptionist leaves the cluster
* Revision based on code review
* Cluster receptionist only tracks connected clients
There exists a race where a cluter node that is being downed seens its
self as the oldest node (as it has had the other nodes removed) and it
takes over the singleton manager sending the real oldest node to go into
the End state meaning that cluster singletons never work again.
This fix simply prevents Member events being given to the Cluster
Manager FSM during a shut down, instread relying on SelfExiting.
This also hardens the test by not downing the node that the current
sharding coordinator is running on as well as fixing a bug in the
probes.
* looks like the ActorSystem is shutdown when leaving
* Included in MultiNodeSpec, i.e. all multi-node tests:
akka.coordinated-shutdown.terminate-actor-system = off
akka.oordinated-shutdown.run-by-jvm-shutdown-hook = off
* MemberRemoved must be published before MemberUp, e.g. when restarted
in other DC
* remove from failureDetector when receiving gossip with new member,
not only new joining member
* increase timeout in MultiDcSingletonManagerSpec
* since the ordering can change based on the member's status
it's not possible to use ordinary - for removal
* similar issue at a few places where ageOrdering was used
* Sharding only within own team (coordinator is singleton)
* the ddata Replicator used by Sharding must also be only within own team
* added support for Set of roles in ddata Replicator so that can be used
by sharding to specify role + team
* Sharding proxy can route to sharding in another team
The initial contact paths should be assumed as published as that's what client subscribers will receive upon subscription. Any changes to contact points are a delta to that.