* CoordinatedShutdown that can run tasks for configured phases in order (DAG)
* coordinate handover/shutdown of singleton with cluster exiting/shutdown
* phase config obj with depends-on list
* integrate graceful leaving of sharding in coordinated shutdown
* add timeout and recover
* add some missing artery ports to tests
* leave via CoordinatedShutdown.run
* optionally exit-jvm in last phase
* run via jvm shutdown hook
* send ExitingConfirmed to the leader before an Exiting node shuts down,
so the leader does not have to wait for the failure detector to mark
the node as unreachable before removing it
* the unreachable signal is still kept as a safeguard in case the
ExitingConfirmed message is lost or the leader dies
* PhaseClusterExiting vs MemberExited in ClusterSingletonManager
* terminate ActorSystem when the cluster is shut down (via Down)
* add more predefined and custom phases
* reference documentation
* migration guide
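
The phase ordering described above (phases with a depends-on list, run in
DAG order) can be sketched as a topological sort. This is a minimal
standalone model, not the actual CoordinatedShutdown implementation: the
names `Phase`, `PhaseOrder`, `topologicalSort` and `example` are
hypothetical, and the phase names in `example` are only illustrative.

```scala
// Hypothetical model of a phase config object with a depends-on list.
final case class Phase(dependsOn: Set[String] = Set.empty)

object PhaseOrder {
  // Depth-first topological sort: every phase appears after all phases
  // it depends on. Throws IllegalArgumentException if the graph has a cycle.
  def topologicalSort(phases: Map[String, Phase]): List[String] = {
    var result = List.empty[String]     // built in reverse order
    var done = Set.empty[String]        // fully processed phases
    var inProgress = Set.empty[String]  // phases on the current DFS path

    def visit(name: String): Unit = {
      if (!done(name)) {
        if (inProgress(name))
          throw new IllegalArgumentException(s"Cycle detected in phase graph at [$name]")
        inProgress += name
        // A phase that is only mentioned in a depends-on list is treated
        // as a phase without dependencies of its own.
        phases.getOrElse(name, Phase()).dependsOn.foreach(visit)
        inProgress -= name
        done += name
        result = name :: result
      }
    }

    // Sort the keys so the result is deterministic for a given config.
    phases.keys.toList.sorted.foreach(visit)
    result.reverse
  }

  // Illustrative subset of phases; the real set is defined in configuration.
  val example: Map[String, Phase] = Map(
    "cluster-leave" -> Phase(),
    "cluster-exiting" -> Phase(Set("cluster-leave")),
    "cluster-shutdown" -> Phase(Set("cluster-exiting")),
    "actor-system-terminate" -> Phase(Set("cluster-shutdown"))
  )
}
```

With this model, `PhaseOrder.topologicalSort(PhaseOrder.example)` lists
`cluster-leave` first and `actor-system-terminate` last, so tasks registered
for a phase only run after the phases it depends on.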
* problem when the leader order was sys2, sys1, sys3,
then sys3 could not perform its duties and move Leaving sys1 to
Exiting because it was observing sys1 as unreachable
* exclude Leaving with exitingConfirmed from convergence condition
* The problem: an ACK that was targeted to an old incarnation
was sent to the new, restarted system with the same host:port,
which showed up as the error
"Error encountered while processing system message acknowledgement buffer: [-1 {}] ack: ACK[0, {}]"
when restarting an actor system
* The reason:
1. The endpoint reader was about to send OutgoingAck to parent reader,
targeted to the old system.
2. At the same time there is an incoming connection from the new system
that triggers TakeOver in the endpoint writer, i.e. replaces
the handle with the one for the connection from the new system.
3. The OutgoingAck is received by the writer, which happily sends it
to the new handle, the new system.
* The solution: Ignore OutgoingAck during the handoff (TakeOver) process.
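
The race and the fix above can be illustrated with a small state model. This
is a simplified sketch, not the actual endpoint writer code: `WriterEvent`,
`OldReaderStopped`, `WriterModel` and its fields are hypothetical names, and
it assumes the handoff ends once the old reader has stopped.

```scala
// Hypothetical events seen by the writer in this model.
sealed trait WriterEvent
final case class OutgoingAck(seqNo: Long) extends WriterEvent
final case class TakeOver(newHandle: String) extends WriterEvent
case object OldReaderStopped extends WriterEvent // assumed end of the handoff

// `sent` records which handle each ACK was written to.
final case class WriterModel(
    handle: String,
    inHandoff: Boolean = false,
    sent: Vector[(String, Long)] = Vector.empty) {

  def receive(event: WriterEvent): WriterModel = event match {
    // Step 2: the new system connects and its handle replaces the old one.
    case TakeOver(newHandle) => copy(handle = newHandle, inHandoff = true)
    // The old reader is gone, so later ACKs belong to the new connection.
    case OldReaderStopped => copy(inHandoff = false)
    // The fix: an ACK arriving during the handoff was meant for the old
    // incarnation and must not be written to the new handle.
    case OutgoingAck(_) if inHandoff => this
    // Normal case: write the ACK to the current handle.
    case OutgoingAck(seqNo) => copy(sent = sent :+ (handle -> seqNo))
  }
}
```

Feeding the model `OutgoingAck(1)`, `TakeOver("new")`, `OutgoingAck(2)`,
`OldReaderStopped`, `OutgoingAck(0)` leaves `sent` with the first ACK on the
old handle and the last on the new one. The ACK that arrived during the
handoff is dropped.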
* Failure detection heartbeats were not sent to joining
nodes, since they were expected to become Up first.
* If a joining node was downed before it was changed to Up, failure
detection was not performed for that node. As a result, the
downed node was not removed from membership, since the
unreachability signal is used as confirmation that the node is
actually stopped before removing it.
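
The reasoning above can be sketched with a small model of the removal rule.
It is an illustration under simplifying assumptions, not the cluster
implementation: `Status`, `Node`, `MembershipModel` and the `includeJoining`
flag are hypothetical, and the model assumes a stopped node is observed as
unreachable only if it was being heartbeated.

```scala
sealed trait Status
case object Joining extends Status
case object Up extends Status
case object Down extends Status

final case class Node(address: String, status: Status)

object MembershipModel {
  // Whether heartbeats are sent to a node with this status.
  // includeJoining = false models the old behaviour, true models the fix.
  def isMonitored(node: Node, includeJoining: Boolean): Boolean =
    node.status match {
      case Up      => true
      case Joining => includeJoining
      case Down    => false
    }

  // A downed node is removed only after it has been observed as unreachable,
  // which confirms that it has actually stopped.
  def canRemove(node: Node, unreachable: Set[String]): Boolean =
    node.status == Down && unreachable(node.address)

  // Scenario: a node joins, is downed before it becomes Up, and then stops.
  def downedJoiningNodeIsRemoved(includeJoining: Boolean): Boolean = {
    val joining = Node("host-b:2552", Joining)
    val wasMonitored = isMonitored(joining, includeJoining)
    val downed = joining.copy(status = Down)
    // Only a node that was heartbeated can be detected as unreachable.
    val unreachable = if (wasMonitored) Set(downed.address) else Set.empty[String]
    canRemove(downed, unreachable)
  }
}
```

In this model `downedJoiningNodeIsRemoved(includeJoining = false)` is false,
which corresponds to the downed node staying in the membership, while
`includeJoining = true` lets it be removed.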