Commit graph

59 commits

Author SHA1 Message Date
Patrik Nordwall
89165badbb
Structured log events, #28207 (#28209)
* Expanding the LogMarker to optionally include Map of additional
  properties
* The name of the marker shows up as `tags` in Kibana
* The properties of the LogMarker are included as MDC entries, which
  Logstash encoder automatically understands
* Implemented with classic eventStream logging so that it can be used
  from all places without dependencies to Slf4j
  * also means that it's possible to subscribe to LogEventWithMarker
    to consume these events via the eventStream
* move marker definitions to RemoteLogMarker and ClusterLogMarker
* marker for dead letters
* marker for leader detained/allowed
* marker for member status changed
* markers for shard allocated and started
* test LogMarker with properties in Slf4jLoggerSpec
* doc note of the LogMarker definitions
2019-12-05 11:36:21 +01:00
Patrik Nordwall
a217d5566e
Remove auto-downing, #27788 (#27855)
* moved to cluster tests, in new package akka.cluster.testkit
* changed config in tests
* migration guide
* documentation clarificiations for Downing and Leaving
* update warnings in Singleton and Sharding
2019-10-03 14:08:43 +02:00
Patrik Nordwall
ddb085255d Fix singleton issue when leaving several nodes, #27487 (#27488)
* Fix singleton issue when leaving several nodes, #27487

* When leaving several nodes at about the same time the new singleton
  could be started before previous had been completely stopped.
* Found two possible ways this could happen.
  * Acting on MemberRemoved that is emitted when the self
    cluster node is shutting down.
  * The HandOverDone confirmation when in Younger state,
    but that node is also Leaving so could be seen as Exiting
    from a third node that is the next singleton.

* keep track of all previous oldest, not only the latest

* Option => List
* Otherwise in BecomingOldest it could transition to Oldest
  when the previous oldest was removed even though the previous-previous wasn't removed yet

* fix failure in ClusterSingletonRestart2Spec

* OldestChanged was not emitted when Exiting member was removed
* The initial membersByAge must also contain Leaving, Exiting members

(cherry picked from commit ee188565b9f3cf2257ebda218cec6af5a4777439)
2019-08-19 15:11:49 +02:00
Christopher Batey
89e269d5d8 Remove catchall silents from prod code (#27432)
* WIP

* Remove catch all silent annocations from prod code
2019-07-30 11:12:23 +02:00
Patrik Nordwall
10d32fceb9 scheduleWithFixedDelay vs scheduleAtFixedRate, #26910
* previous `schedule` method is trying to maintain a fixed average frequency
  over time, but that can result in undesired bursts of scheduled tasks after a long
  GC or if the JVM process has been suspended, same with all other periodic
  scheduled message sending via various Timer APIs
* most of the time "fixed delay" is more desirable
* we can't just change because it's too big behavioral change and some might
  depend on previous behavior
* deprecate the old `schedule` and introduce new `scheduleWithFixedDelay`
  and `scheduleAtFixedRate`, when fixing the deprecation warning users should
  make a concious decision of which behavior to use (scheduleWithFixedDelay in
  most cases)

* Streams
* SchedulerSpec
  * test both fixed delay and fixed rate
* TimerSpec
* FSM and PersistentFSM
* mima
* runnable as second parameter list, also in typed.Scheduler
* IllegalStateException vs SchedulerException
* deprecated annotations
* api and reference docs, all places
* migration guide
2019-06-05 11:38:04 +02:00
Patrik Nordwall
a4a61649f6 add InternalStableApi annotation (#26949) 2019-05-21 16:29:11 +01:00
Johan Andrén
81b1e2ef9b Internal dispatcher to protect against starvation (#26816)
* Allow for dispatcher aliases and define a internal dispatcher
* Test checking dispatcher name
* MiMa for Dispatchers
* Migration guide entry
* No need to have custom dispatcher lookup logic in streams anymore
* Default dispatcher size and migration note about that
* Test checking exact config values...
* Typed receptionist on internal dispatcher
* All internal usages of system.dispatcher gone through
2019-05-02 22:35:25 +02:00
Nicolas Vollmar
e866e3fc70 Discards HandOverToMe in state End to avoid unhandled message warning #26793 2019-04-23 14:43:11 +02:00
Patrik Nordwall
a5e9741d35 replace unicode arrows again (#26732) 2019-04-15 15:40:26 +00:00
Johan Andrén
d699332b53 akka-cluster-tools compiler warnings as fatal errors (#26647) 2019-04-04 15:35:18 +02:00
Arnout Engelen
a39ac61265 Move Lease usage settings inside akka-coordination (#26637) 2019-03-29 14:27:08 +01:00
Patrik Nordwall
1c93e31bf7 revert rename gotoOldest 2019-03-29 13:18:28 +01:00
Christopher Batey
65ccada280 Lease API + use in cluster singleton and sharding, #26480 (#26629)
* lease api
* Cluster singleton manager with lease
* Refactor OldestData to use option for actor reference
* Sharding with lease
* Docs for singleton and sharding lease + config for sharding lease
* Have ddata shard wait until lease is acquired before getting state
2019-03-28 13:31:56 +01:00
Auto Format
75579bed17 format source with scalafmt, #26511 2019-03-15 10:23:46 +01:00
Auto Format
ce404e4f53 format source with scalafmt 2019-03-11 16:58:55 +01:00
Patrik Nordwall
5c96a5f556 replace unicode arrows
* ⇒, →, ←
* because we don't want to show them in documentation snippets and
  then it's complicated to avoid that when snippets are
  located in src/test/scala in individual modules
* dont replace object `→` in FSM.scala and PersistentFSM.scala
2019-03-11 16:58:51 +01:00
Patrik Nordwall
ddada9a8e1
Stop singleton and shards when self MemberDowned, #26336 (#26339)
* Stop singleton when self MemberDowned, #26336
  * It's safer to stop singleton instance early in case of downing.
  * Instead of waiting for MemberRemoved and trying to hand over.
* Stop ShardRegion when self MemberDowned, #26336
* Upper bound when waiting for seen in shutdownSelfWhenDown, #26336
2019-02-12 15:05:33 +01:00
Arnaud Burlet
66a2074a8b Fix log singleton message #26329 2019-02-06 09:05:48 +01:00
kerr
bdc90052aa Update headers from 2018 to 2019 once for all. (#26165)
* Add CopyrightHeader support for sbt-boilerplate plugin.
* Add CopyrightHeader support for `*.proto` files.
* Add regex match for both `–` and `-` for CopyrightHeader.
* Add CopyrightHeader support for sbt build files.
* Update copyright from 2018 to 2019.
2019-01-02 11:55:26 +01:00
Patrik Nordwall
90bc4cfa3e
Improvements of singleton leaving scenario, #25639 (#25710)
* Testing of singleton leaving
* gossip optimization, exiting change to two oldest per role
* hardening ClusterSingletonManagerIsStuck restart, increase ClusterSingletonManagerIsStuck
2018-11-09 09:42:48 +01:00
kerr
fafc59b19d update headers to regular comment (#25807) 2018-10-29 05:19:37 -04:00
Johan Andrén
8ed4f5abab Undeprecate the config, add a note in cluster singleton 2018-09-04 14:21:25 +02:00
Kazuhiro Sera
482eaea122 Fix several minor typos detected by github.com/client9/misspell (#25448)
* Fix several minor typos detected by github.com/client9/misspell

* Revert s/erminater/erminator/ in /ActorSystemSpec
2018-08-21 11:02:37 +09:00
Nicolas Vollmar
28746a4cfe Ignore possible state change while waiting for removal #25274 2018-06-27 09:06:32 +02:00
Christopher Batey
0380cc517a Cluster singleton manager: don't send member events to FSM during shutdown (#24236)
There exists a race where a cluter node that is being downed seens its
self as the oldest node (as it has had the other nodes removed) and it
takes over the singleton manager sending the real oldest node to go into
the End state meaning that cluster singletons never work again.

This fix simply prevents Member events being given to the Cluster
Manager FSM during a shut down, instread relying on SelfExiting.

This also hardens the test by not downing the node that the current
sharding coordinator is running on as well as fixing a bug in the
probes.
2018-01-05 09:47:43 +01:00
Christopher Batey
009214ae07
Update copyright to 2018 (#24241) 2018-01-04 17:26:29 +00:00
Patrik Nordwall
fa3da328be Run all CoordinatedShutdown phases also when downing, #24048 2017-12-04 11:05:22 +01:00
Patrik Nordwall
cb08535e7d use right youngest when moving to Up, #23582
* also confirm TakeOverFromMe when singleton already in oldest state
2017-09-04 16:02:23 +02:00
Patrik Nordwall
6ed3295acd Merge branch 'master' into wip-multi-dc-merge-master-patriknw 2017-08-31 10:51:12 +02:00
Johan Andrén
6cd3b17854 Annotated extendable cluster tool actors with DoNotInherit (#23335) 2017-07-26 10:42:13 +02:00
Johan Andrén
9c7e8d027a Renamed/moved the self data center setting #23312 (#23344) 2017-07-12 11:47:32 +01:00
Patrik Nordwall
bb9549263e Rename team to data center, #23275 2017-07-04 17:11:21 +02:00
Patrik Nordwall
2044c1712b Support cross team in ClusterSingletonProxy, #23230 2017-07-04 10:47:45 +02:00
Arnout Engelen
ccea5a0eac Make cluster singleton DC aware, #23230 2017-07-04 10:21:27 +02:00
Patrik Nordwall
84ade6fdc3 add CoordinatedShutdown, #21537
* CoordinatedShutdown that can run tasks for configured phases in order (DAG)
* coordinate handover/shutdown of singleton with cluster exiting/shutdown
* phase config obj with depends-on list
* integrate graceful leaving of sharding in coordinated shutdown
* add timeout and recover
* add some missing artery ports to tests
* leave via CoordinatedShutdown.run
* optionally exit-jvm in last phase
* run via jvm shutdown hook
* send ExitingConfirmed to leader before shutdown of Exiting
  to not have to wait for failure detector to mark it as
  unreachable before removing
* the unreachable signal is still kept as a safe guard if
  message is lost or leader dies
* PhaseClusterExiting vs MemberExited in ClusterSingletonManager
* terminate ActorSystem when cluster shutdown (via Down)
* add more predefined and custom phases
* reference documentation
* migration guide
* problem when the leader order was sys2, sys1, sys3,
  then sys3 could not perform it's duties and move Leving sys1 to
  Exiting because it was observing sys1 as unreachable
* exclude Leaving with exitingConfirmed from convergence condidtion
2017-01-16 09:01:57 +01:00
Philippus Baalman
6c7085252a extended copyright into 2017 2017-01-04 17:37:15 +01:00
Patrik Nordwall
a7eed00948 Merge pull request #21813 from akka/wip-21518-patriknw
minor cleanup of cluster.subscribe usage
2016-11-16 21:33:02 +01:00
Patrik Nordwall
88a6bdb059 Cluster singleton testing (#21803)
* add another singleton restart test

* remove from singleton sorted set with unique address

* because the upNumber used by the Ordering can change
2016-11-11 15:10:30 -08:00
Patrik Nordwall
d9e873644e minor cleanup of cluster.subscribe usage 2016-11-08 15:14:52 +01:00
Patrik Nordwall
9be7df1527 send terminationMessage to singleton when leaving last, #21592 (#21654)
* send terminationMessage to singleton when leaving last, #21592

* When leaving last node, i.e. no newOldestOption, the manager was
  just stopped. The change is to send the terminationMessage also
  in this case and wait until the singleton actor is terminated
  before stopping the manager.
* Also changed so that the singleton is stopped immediately when
  cluster has been terminated when last node is leaving, i.e.
  no newOldestOption. Previously it retried until maxTakeOverRetries
  before stopping.
* More comprehensive test of this scenario in ClusterSingletonManagerLeaveSpec

* increase test timeout
2016-10-28 14:23:18 +02:00
Patrik Nordwall
e6068a0f5a Fix regression in Cluster Singleton, #21236
* When the test fails the node is removed from the membership
  twice, which triggers two OldestChanged cycles, but in
  the 2.4.9 change https://github.com/akka/akka/pull/21152/files#diff-f0ae95c926a050aecf45dba3e08d1c77L669
  the singleton manager always goes to End (stop) when it has been Oldest
* This fix restores the previous behavior for this scenario
2016-08-23 19:08:49 +02:00
Patrik Nordwall
0c4d4c37ba cluster singleton improvements, #20942
* track nodes by UniqueAddress in Cluster Singleton, #20942
* reply with HandOverDone from new incarnation, #20942
* confirm as terminated immediately when new incarnation joins, #20942 instead of waiting for failure detector to mark it as unreachable this will speed-up removal when restarting cluster node with same hostname:port
2016-08-19 11:56:55 +02:00
Björn Antonsson
c66ce62d63 Update to a working version of Scalariform 2016-06-02 22:12:36 +02:00
Johan Andrén
5671927cf1 clu #20309 API for pluggable cluster downing 2016-04-18 15:06:05 +02:00
Johannes Rudolph
b6cbc7f13a =all remove unused imports 2016-02-23 20:29:22 +01:00
Johan Andrén
62e30b3c08 Update copyrights and links to the new company name #19851 2016-02-23 12:58:39 +01:00
Prayag Verma
b7783968a0 =pro #19068 All copyrights ranges and single years updated to a range ending in 2016 2016-01-25 10:20:30 +01:00
Patrik Nordwall
c7c187f6b7 =clu replace Set -- with diff and ++ with union
* better performance according to
  https://docs.google.com/presentation/d/1Qjryxoe-fYEM8ZPhM-98LKfbhnRcn5eAEMNlVVnixsA/pub
2015-11-06 14:48:17 +01:00
Patrik Nordwall
9380983d3c =clu #18554 Make oldest assignment deterministic when joining
* the reported issue is fixed by the immediate leaderActions
  (moving to Up)  when joining the first node to itself
* the other changes are precautions just in case
2015-10-21 07:53:14 +02:00
Patrik Nordwall
6d036ca00c =clt #16948 Use min retries for singleton leaving scenario"
* In 2.4 we derive the number of hand-over/take-over retries from
  the removal margin, but we decided to set that to 0 by default, since
  it is intended for network partition scenarios. maxTakeOverRetries
  became 1. So there must be also be a  min number of retries property.
* The test failed for the leaving scenario because the singleton
  instance was stopped hard without sending the terminationMessage when
  the maxTakeOverRetries was exceeded.
2015-09-17 10:38:52 +02:00