* some cluster logging improvements
* most logger names are actually good, when using ActorLogging since
config can be setup on the package (prefix)
* override logSource when StageLogging is used
* replace system.log with more specific logger
* see scenario in added test, extracted from debug logs "New gossip published"
from MultiDcSplitBrainSpec
* it publised UnreachableDataCenter even though it had already published that,
when a new cross link is added as unreachable in the reachability table
* don't know why the implementation of diffUnreachableDataCenter/isReachable was
so complicated, but no tests are failing for the changed implementation
So now we can compile akka-distributed-data with
-Xfatal-warnings - though I'm not yet sure about
enabling the (other) undisciplineScalacOptions
* Fix multi-node silencing
* Fix scaladoc warnings
* Introduce annotation to declare ccompat use
* Add explicit toString
* Fix deprecation on 2.13
* Move 'immutable' ccompat helpers to shared ccompat package
* Add MiMa for internal scala 2.13 compatibility class
* Internal API markers
* Fix scaladoc generation
Got bitten by https://github.com/scala/bug/issues/11021
* ⇒, →, ←
* because we don't want to show them in documentation snippets and
then it's complicated to avoid that when snippets are
located in src/test/scala in individual modules
* dont replace object `→` in FSM.scala and PersistentFSM.scala
fix akka-actor-tests compile errors
some tests still fail though
Fix test failures in akka-actor-test
Manually work arround missing implicit Factory[Nothing, Seq[Nothing]]
see https://github.com/scala/scala-collection-compat/issues/137
akka-remote scalafix changes
Fix shutdownAll compile error
test:akka-remote scalafix changes
akka-multi-node-testkit scalafix
Fix akka-remote-tests multi-jvm compile errors
akka-stream-tests/test:scalafix
Fix test:akka-stream-tests
Crude implementation of ByteString.map
scalafix akka-actor-typed, akka-actor-typed-tests
akka-actor-typed-tests compile and succeed
scalafix akka-camel
scalafix akka-cluster
akka-cluster compile & test
scalafix akka-cluster-metrics
Fix akka-cluster-metrics
scalafix akka-cluster-tools
akka-cluster-tools compile and test
scalafix akka-distributed-data
akka-distributed-data fixes
scalafix akka-persistence
scalafix akka-cluster-sharding
fix akka-cluster-sharding
scalafix akka-contrib
Fix akka-cluster-sharding-typed test
scalafix akka-docs
Use scala-stm 0.9 (released for M5)
akka-docs
Remove dependency on collections-compat
Cherry-pick the relevant constructs to our own
private utils
Shorten 'scala.collections.immutable' by importing it
Duplicate 'immutable' imports
Use 'foreach' on futures
Replace MapLike with regular Map
Internal API markers
Simplify ccompat by moving PackageShared into object
Since we don't currently need to differentiate between 2.11 and
Avoid relying on 'union' (and ++) being left-biased
Fix akka-actor/doc by removing -Ywarn-unused
Make more things more private
Copyright headers
Use 'unsorted' to go from SortedSet to Set
Duplicate import
Use onComplete rather than failed.foreach
Clarify why we partly duplicate scala-collection-compat
* Add CopyrightHeader support for sbt-boilerplate plugin.
* Add CopyrightHeader support for `*.proto` files.
* Add regex match for both `–` and `-` for CopyrightHeader.
* Add CopyrightHeader support for sbt build files.
* Update copyright from 2018 to 2019.
* Introduce 'MemberDowned' member event
Compatiblity note: MemberEvent is a sealed trait, so it is debatable whether
it is acceptable to introduce a new member.
* Be more conservative (more like leaving), add test
Test would fail picking up the reachable from the previous unsplit
as it is a new probe.
Also change barrierCounter to split/unsplit so easier to see
where the failure is on a barrier fail
* MemberRemoved must be published before MemberUp, e.g. when restarted
in other DC
* remove from failureDetector when receiving gossip with new member,
not only new joining member
* increase timeout in MultiDcSingletonManagerSpec
* Cluster management (join, leave, etc)
* Cluster membership subscriptions (MemberUp, MemberRemoved, etc)
* New SelfUp and SelfRemoved events
* change signature of awaitAssert to return the value (not binary compatible)
* Cluster singleton api
* the crossDcFailureDetector was not connected to the reachability table
* additional test by listen for {Reachable/Unreachable}DataCenter events in split spec
* missing Java API for getUnreachableDataCenters in CurrentClusterState
* move methods that depends on selfUniqueAddress and selfDc
to a separate MembershipState class, which also holds the
latest gossip
* this removes the need to pass in the parameters from everywhere and
makes it easier to cache some results
* makes it clear that those parameters are always selfUniqueAddress
and selfDc, instead of some arbitary node/dc
* Guarantee no sneaky type puts more teams in the role list
* Leader per team and initial tests
* MiMa filters
* Second iteration (not working though)
* Verbose gossip logging etc.
* Gossip to team-nodes even if there is inter-team unreachability
* More work ...
* Marking removed nodes with tombstones in Gossip
* More test coverage for Gossip.remove
* Bug failing other multi-node tests squashed
* Multi-node test for team-split
* Review fixes - only prune tombstones on leader ticks
* Clean code is happy code.
* All I want is for MiMa to be my friend
* These constants are internal
* Making the formatting gods happy
* I used the wrong reachability for ignoring gossip :/
* Still hadn't quite gotten how reachability was supposed to work
* Review feedback applied
* Cross-team downing should still work
* Actually prune tombstones in the prune tombstones method ...
* Another round against reachability. Reachability leading with 15 - 2 so far.
* properly shutdown ArteryTransport using CoordinatedShutdown, #22671
* The shutdownHook changed hasBeenShutdown flag to true, and then when
the transport.shutdown was invoked the shutdown sequence was ignored
until it was too late, ActorSystem already terminated.
* Also improved the cluster shutdown tasks when the cluster node had not
joined
* CoordinatedShutdownLeave explicit events
* CoordinatedShutdown that can run tasks for configured phases in order (DAG)
* coordinate handover/shutdown of singleton with cluster exiting/shutdown
* phase config obj with depends-on list
* integrate graceful leaving of sharding in coordinated shutdown
* add timeout and recover
* add some missing artery ports to tests
* leave via CoordinatedShutdown.run
* optionally exit-jvm in last phase
* run via jvm shutdown hook
* send ExitingConfirmed to leader before shutdown of Exiting
to not have to wait for failure detector to mark it as
unreachable before removing
* the unreachable signal is still kept as a safe guard if
message is lost or leader dies
* PhaseClusterExiting vs MemberExited in ClusterSingletonManager
* terminate ActorSystem when cluster shutdown (via Down)
* add more predefined and custom phases
* reference documentation
* migration guide
* problem when the leader order was sys2, sys1, sys3,
then sys3 could not perform it's duties and move Leving sys1 to
Exiting because it was observing sys1 as unreachable
* exclude Leaving with exitingConfirmed from convergence condidtion
* the reported issue is fixed by the immediate leaderActions
(moving to Up) when joining the first node to itself
* the other changes are precautions just in case
Two issues:
1) ShardRegion actor must stop itself when the node is shutting down,
ie. when receiving MemberRemoved(selfAddress)
2) ShardCoordinator must not persist anything when the node is shutting
down. MemberRemoved of other shard regions will trigger Terminated,
which must not be persisted, because then the next coordinator will
replay those events and end up in wrong state. This is a problem
announced itself when using leaving as illustrated in the new test.
To solve the second issue I have added a new ClusterShuttingDown event
that is published before the MemberRemoved events. Note that Terminated
is triggered by MemberRemoved.
(cherry picked from commit 1b272c72597beece9d93f0054f4b58e3d25f9ae2)
* The leader is selected by picking the first reachable member, but in
#13875 we had to let the self member be unreachable in the Reachability
table and that was not considered in the logic of the leader selection.
* That means changed behavior that is unwanted, especially when there
is only one node left the leader could be evaluated to None instead
of Some(selfUniqueAddress).
* Note that #13875 has not been released yet.
* Skip observations from downed node (quarantined is marked down immediately)
in convergence check
* Skip observations from downed node when picking "reachable" targets for gossip.
* This also means that we must accept gossip with own node marked as unreachable,
but that should not be spread to the external membership events.