11 commits

Author SHA1 Message Date
Patrik Nordwall
84ade6fdc3 add CoordinatedShutdown, #21537
* CoordinatedShutdown that can run tasks for configured phases in order (DAG)
* coordinate handover/shutdown of singleton with cluster exiting/shutdown
* phase config obj with depends-on list
* integrate graceful leaving of sharding in coordinated shutdown
* add timeout and recover
* add some missing artery ports to tests
* leave via CoordinatedShutdown.run
* optionally exit-jvm in last phase
* run via jvm shutdown hook
* send ExitingConfirmed to leader before shutdown of Exiting
  to not have to wait for failure detector to mark it as
  unreachable before removing
* the unreachable signal is still kept as a safeguard if
  the message is lost or the leader dies
* PhaseClusterExiting vs MemberExited in ClusterSingletonManager
* terminate ActorSystem when cluster shutdown (via Down)
* add more predefined and custom phases
* reference documentation
* migration guide
* problem when the leader order was sys2, sys1, sys3,
  then sys3 could not perform its duties and move Leaving sys1 to
  Exiting because it was observing sys1 as unreachable
* exclude Leaving with exitingConfirmed from convergence condition
2017-01-16 09:01:57 +01:00
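
The phase ordering described in 84ade6fdc3 (phases with a depends-on list, run in DAG order, with recovery from failing tasks) can be illustrated with a minimal, language-neutral sketch. This is not Akka's implementation or API: the phase names and dependencies in `PHASES`, and the functions `phase_order` and `run`, are assumptions made up for illustration, and timeouts are left out.

```python
from typing import Callable, Dict, List

# Hypothetical phase table: phase name -> phases it depends on.
# Names and dependencies are illustrative, not taken from the commit.
PHASES: Dict[str, List[str]] = {
    "before-service-unbind": [],
    "service-unbind": ["before-service-unbind"],
    "cluster-leave": ["service-unbind"],
    "cluster-exiting": ["cluster-leave"],
    "actor-system-terminate": ["cluster-exiting"],
}


def phase_order(phases: Dict[str, List[str]]) -> List[str]:
    """Return phase names so that every phase comes after its dependencies."""
    ordered: List[str] = []
    state: Dict[str, str] = {}  # name -> "visiting" | "done"

    def visit(name: str) -> None:
        if state.get(name) == "done":
            return
        if state.get(name) == "visiting":
            raise ValueError(f"cycle detected at phase {name!r}")
        state[name] = "visiting"
        for dep in phases.get(name, []):
            visit(dep)
        state[name] = "done"
        ordered.append(name)

    for name in sorted(phases):
        visit(name)
    return ordered


def run(phases: Dict[str, List[str]],
        tasks: Dict[str, List[Callable[[], str]]]) -> List[str]:
    """Run each phase's tasks in dependency order.

    A failing task is recorded and the run continues with the next task,
    which is one possible reading of "recover" in the commit message.
    """
    log: List[str] = []
    for phase in phase_order(phases):
        for task in tasks.get(phase, []):
            try:
                log.append(f"{phase}: {task()}")
            except Exception as exc:
                log.append(f"{phase}: recovered from {exc}")
    return log
```

A depth-first topological sort is used here because it also detects cycles in the depends-on lists; whether the actual code orders phases this way is not stated in the commit message.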
Philippus Baalman
6c7085252a extended copyright into 2017 2017-01-04 17:37:15 +01:00
Johan Andrén
8ae0c9a888 Use long uid in artery remoting and cluster #20644 2016-09-26 15:34:59 +02:00
Björn Antonsson
c66ce62d63 Update to a working version of Scalariform 2016-06-02 22:12:36 +02:00
Patrik Nordwall
96b68f6437 rem #19780: Skip acks during connection handoff
* The problem: an ACK that was targeted to an old incarnation
  was sent to the new, restarted system with the same host:port.
  When restarting the actor system this showed up as
  "Error encountered while processing system message acknowledgement buffer: [-1 {}] ack: ACK[0, {}]"

* The reason:

  1. The endpoint reader was about to send OutgoingAck to parent reader,
     targeted to the old system.
  2. At the same time there is an incoming connection from new system
     that triggered TakeOver in the endpoint writer, i.e. replacing
     the handle to the connection of the new system.
  3. The OutgoingAck is received by the writer, which happily sends it
     to the new handle, the new system.

* The solution: Ignore OutgoingAck during the handoff (TakeOver) process.
2016-03-21 08:55:19 +01:00
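
The race and fix described in 96b68f6437 can be sketched with a toy model. `EndpointWriter` below is a hypothetical stand-in, not the Akka class: a Python list plays the role of the connection handle, and `handoff_done` is an assumed hook marking the end of the TakeOver process, which the commit message does not spell out.

```python
from typing import List


class EndpointWriter:
    """Toy model of the writer in the commit message; not Akka's actual class."""

    def __init__(self, handle: List[str]) -> None:
        self.handle = handle       # collects what is sent on the current connection
        self.handing_off = False   # True while a TakeOver is in progress

    def take_over(self, new_handle: List[str]) -> None:
        # A new incarnation connected: swap the handle and enter handoff.
        self.handle = new_handle
        self.handing_off = True

    def handoff_done(self) -> None:
        # Hypothetical hook: the handoff has completed.
        self.handing_off = False

    def outgoing_ack(self, ack: str) -> bool:
        # The fix: an ack arriving during handoff may have been meant for the
        # old incarnation, so it is ignored instead of being written to the
        # new handle.
        if self.handing_off:
            return False
        self.handle.append(ack)
        return True
```

Without the `handing_off` check, an ack produced for the old system in step 1 would be appended to the new handle in step 3, which is the misdelivery the commit describes.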
Johan Andrén
62e30b3c08 Update copyrights and links to the new company name #19851 2016-02-23 12:58:39 +01:00
Prayag Verma
b7783968a0 =pro #19068 All copyrights ranges and single years updated to a range ending in 2016 2016-01-25 10:20:30 +01:00
Patrik Nordwall
4d64901228 =clu #19274 failure detection of joining/down member status
* Failure detection heartbeating was not performed to joining
  nodes, since it was expected that they will become Up first.
* If a joining node was downed before it was changed to Up, failure
  detection was not performed for that node. As a result, the
  downed node was never removed from membership, since the
  unreachability signal is used as confirmation that the node is
  actually stopped before removing it.
2015-12-26 11:30:18 +01:00
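
The reasoning in 4d64901228 can be made concrete with a small model. Everything here is an assumption for illustration: the `Member` type, the status strings, and the idea of passing in the set of statuses that receive heartbeats. It only shows why a Down member that is never monitored can never be removed.

```python
from dataclasses import dataclass
from typing import Iterable, Set


@dataclass(frozen=True)
class Member:
    name: str
    status: str    # e.g. "Joining", "Up", "Down"
    running: bool  # whether the node process is still alive


def unreachable(members: Iterable[Member], monitored: Set[str]) -> Set[str]:
    """Names the failure detector can report: only monitored members are seen."""
    return {m.name for m in members if m.status in monitored and not m.running}


def removable(members: Iterable[Member], monitored: Set[str]) -> Set[str]:
    """Down members are removed only once they are also observed as unreachable."""
    members = list(members)
    seen = unreachable(members, monitored)
    return {m.name for m in members if m.status == "Down" and m.name in seen}
```

With `members = [Member("a", "Up", True), Member("b", "Down", False)]`, monitoring only `{"Up"}` leaves nothing removable, while monitoring `{"Joining", "Up", "Down"}` lets `b` be removed.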
Julian Tescher
00f6a58e7c Changes all occurrences of Typesafe copyright to extend to 2015 2015-03-10 14:12:19 -07:00
Patrik Nordwall
30df518421 =tes Use ConversionCheckedTripleEquals 2015-03-10 08:17:03 +01:00
Patrik Nordwall
a2b762ad53 =clu #16224 Add test for cluster node restart 2014-11-10 15:12:14 +01:00