Each build now produces over 40 MB of logs.
A lot of DEBUG logging was left on for test failures that have since
been fixed. Added an issue # for the ones that are still valid, or left
the logging on where the test itself verifies debug output.
* fail fast if Typed Cluster.sharding.spawn is called several times with different parameters (see the sketch below)
* fix a bug in ClusterShardingImpl.spawnWithMessageExtractor - actually use allocationStrategy param
* previous solution didn't work because the untyped StartEntity
message is sent by untyped sharding itself without the typed envelope,
and the null was a bit of a hack
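A minimal sketch of the fail-fast check, using a hypothetical registry class (this is not the actual ClusterShardingImpl code): remember the parameters used the first time a type key is spawned and throw if a later spawn uses the same key with different parameters.

    import java.util.concurrent.ConcurrentHashMap

    // Hypothetical registry illustrating the fail-fast idea, not Akka's internals.
    final class SpawnRegistry {
      private val registered = new ConcurrentHashMap[String, Any]()

      // Remember the parameters for a type key; fail fast if the key was
      // already registered with different parameters.
      def register(typeKeyName: String, params: Any): Unit = {
        val previous = registered.putIfAbsent(typeKeyName, params)
        if (previous != null && previous != params)
          throw new IllegalArgumentException(
            s"Sharding for [$typeKeyName] already started with different parameters")
      }
    }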
* =tkt port WithLogCapturing from akka-http (see the sketch below)
* =str use WithLogCapturing for very noisy TLSSpec
* =sha use WithLogCapturing to silence noisy CoordinatedShutdownShardingSpec
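A conceptual sketch of what the log capturing gives us, using a hypothetical trait (not the actual WithLogCapturing implementation): buffer log output while a test runs and only flush it when the test fails, so tests that pass but log a lot don't grow the build log.

    import scala.collection.mutable
    import scala.util.{ Failure, Success, Try }

    // Hypothetical helper illustrating the buffer-and-flush-on-failure idea.
    trait LogCapturing {
      private val buffered = mutable.ArrayBuffer.empty[String]

      protected def capture(line: String): Unit = buffered += line

      // Run the test body; discard captured logs on success, print them on failure.
      def withCapturedLogs[T](testBody: => T): T =
        Try(testBody) match {
          case Success(result) =>
            buffered.clear()
            result
          case Failure(cause) =>
            buffered.foreach(println)
            throw cause
        }
    }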
* There might be one case where the singleton coordinator
hand-over starts before the graceful stop of the
region is completed on the other node.
* I think this is rare enough to just accept that a message
might be sent to the wrong location (we don't guarantee anything
more than best effort anyway).
* Safe rolling upgrade should keep the coordinator (oldest)
until last to avoid such races
* Some GetShardHome requests were ignored (by design) during
rebalance and they would be retried later.
* This optimization keeps track of such requests and replies
to them immediately after the rebalance has completed, so the
buffered messages in the region don't have to wait for the
next retry tick (see the sketch after this list).
* use regionTerminationInProgress also during the update, since
not all GetShardHome requests are stashed
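A conceptual sketch of the bookkeeping, using a hypothetical class rather than the coordinator's actual state: GetShardHome requests that arrive while a shard is being rebalanced are remembered, and once the rebalance of that shard completes the remembered requesters can be answered right away instead of waiting for their next retry tick.

    import akka.actor.ActorRef

    // Hypothetical bookkeeping for GetShardHome requests deferred during rebalance.
    final class DeferredGetShardHome {
      private var deferred = Map.empty[String, Set[ActorRef]] // shardId -> waiting regions

      // Remember a request that cannot be answered while the shard is being rebalanced.
      def defer(shardId: String, requester: ActorRef): Unit =
        deferred = deferred.updated(shardId, deferred.getOrElse(shardId, Set.empty) + requester)

      // When the rebalance of the shard is done, return the requesters that can
      // now be told the new shard home.
      def rebalanceCompleted(shardId: String): Set[ActorRef] = {
        val waiting = deferred.getOrElse(shardId, Set.empty)
        deferred -= shardId
        waiting
      }
    }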
There exists a race where a cluster node that is being downed sees
itself as the oldest node (as it has had the other nodes removed) and
takes over the singleton manager, sending the real oldest node into
the End state, meaning that cluster singletons never work again.
This fix simply prevents Member events from being given to the
ClusterSingletonManager FSM during a shutdown, instead relying on SelfExiting.
This also hardens the test by not downing the node that the current
sharding coordinator is running on, as well as fixing a bug in the
probes.
* The real issue that should be fixed is that there seems to be a race
between the CS and the ClusterSingleton observing OldestChanged
and terminating the coordinator singleton before the graceful sharding stop is done
* Revert "fix entityPropsFactory id param, #21809"
This reverts commit cd7eae28f6.
* Revert "Merge pull request #24058 from talpr/talpr-24053-add-entity-id-to-sharding-props"
This reverts commit 8417e70460, reversing
changes made to 22e85f869d.
AFAICT there was nothing ensuring the order of messages when sent to the
shard and the region, so first check that the passivation has happened
before sending another add in the test
Refs #24013
* looks like the ActorSystem is shut down when leaving
* Included in MultiNodeSpec, i.e. all multi-node tests:
akka.coordinated-shutdown.terminate-actor-system = off
akka.coordinated-shutdown.run-by-jvm-shutdown-hook = off
* Having maxSimultaneousRebalance > rebalanceThreshold in LeastShardAllocationStrategy caused shard "flapping" (deallocation of excessive shards followed by their immediate allocation on the same node); see the configuration sketch below
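A hedged configuration sketch (the setting names are taken from the cluster-sharding reference configuration, the values are illustrative only): keeping max-simultaneous-rebalance no larger than rebalance-threshold avoids deallocating shards that are immediately allocated back to the same node.

    import com.typesafe.config.ConfigFactory

    // Illustrative tuning only; adjust the numbers to the actual cluster.
    object ShardAllocationTuning {
      val config = ConfigFactory.parseString("""
        akka.cluster.sharding.least-shard-allocation-strategy {
          rebalance-threshold = 3
          max-simultaneous-rebalance = 3
        }
        """)
    }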
* since the ordering can change based on the member's status
it's not possible to use an ordinary - for removal (see the sketch below)
* similar issue at a few places where ageOrdering was used
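A conceptual sketch of the removal problem, with a hypothetical simplified Member (not the real class): when the ordering of a SortedSet depends on something that can change, such as the member's status, removing with - may not find the updated element, while removing by a stable key such as the unique address always works.

    import scala.collection.immutable.SortedSet

    object AgeOrderingRemoval extends App {
      // Hypothetical simplified member, ordered by an ordering that depends on status.
      final case class Member(uniqueAddress: String, upNumber: Int, status: String)
      val ordering: Ordering[Member] = Ordering.by((m: Member) => (m.status, m.upNumber))

      val a = Member("node-a", 1, "Up")
      val b = Member("node-b", 2, "Up")
      val members = SortedSet(a, b)(ordering)

      // The same node seen again with a different status compares differently,
      // so `members - changed` may not remove the stored element.
      val changed = a.copy(status = "Leaving")

      // Removing by the stable unique address works regardless of the ordering.
      val removed = members.filterNot(_.uniqueAddress == changed.uniqueAddress)
      println(removed) // only node-b remains
    }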
* Sharding only within its own team (the coordinator is a singleton)
* the ddata Replicator used by Sharding must also be only within its own team
* added support for a Set of roles in the ddata Replicator so that it can
be used by sharding to specify role + team (see the sketch below)
* Sharding proxy can route to sharding in another team
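A minimal sketch of the Set-of-roles idea, with hypothetical role names (this is not the ReplicatorSettings API): restricting participation to nodes that carry both the sharding role and the team role amounts to requiring a set of roles to be a subset of a member's roles.

    object RoleFiltering {
      // Hypothetical roles: the sharding role plus the team identifier.
      val requiredRoles = Set("sharding", "team-a")

      // A node participates only if it has all of the required roles.
      def participates(memberRoles: Set[String]): Boolean =
        requiredRoles.subsetOf(memberRoles)
    }

For example, participates(Set("sharding", "team-a", "frontend")) is true, while participates(Set("sharding", "team-b")) is false.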