Merge pull request #20809 from 2m/wip-#20808-restart-node-2m
#20808 clarify docs on the quarantined node restart
commit e39255cef0
2 changed files with 18 additions and 18 deletions
The first file touched is the Java version of the cluster documentation:

@@ -147,7 +147,7 @@ status to ``down`` automatically after the configured time of unreachability.

This is a naïve approach to remove unreachable nodes from the cluster membership. It
works great for crashes and short transient network partitions, but not for long network
partitions. Both sides of the network partition will see the other side as unreachable
and after a while remove it from its cluster membership. Since this happens on both
sides the result is that two separate disconnected clusters have been created. This
can also happen because of long GC pauses or system overload.
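
For context, the automatic downing described in this hunk is controlled by the
``akka.cluster.auto-down-unreachable-after`` setting. Below is a minimal sketch of a node
that turns it on, assuming Akka 2.4-era classic remoting; the object name, system name,
addresses and the 120s timeout are illustrative only::

  import akka.actor.ActorSystem
  import akka.cluster.Cluster
  import com.typesafe.config.ConfigFactory

  object AutoDownExample extends App {
    // Illustrative node configuration; merge with your project's settings as needed.
    val config = ConfigFactory.parseString("""
      akka.actor.provider = "akka.cluster.ClusterActorRefProvider"
      akka.remote.netty.tcp.hostname = "127.0.0.1"
      akka.remote.netty.tcp.port = 2551
      akka.cluster.seed-nodes = ["akka.tcp://ClusterSystem@127.0.0.1:2551"]
      # Members that stay unreachable longer than this are automatically marked down.
      akka.cluster.auto-down-unreachable-after = 120s
    """)

    val system = ActorSystem("ClusterSystem", config)
    println(s"Cluster node started at ${Cluster(system).selfAddress}")
  }

With such a setting, both sides of a long network partition eventually down each other and
end up as two separate clusters, which is exactly why the warning in the next hunk advises
against auto-down in production.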

@@ -155,14 +155,14 @@ can also happen because of long GC pauses or system overload.

.. warning::

We recommend against using the auto-down feature of Akka Cluster in production.
This is crucial for correct behavior if you use :ref:`cluster-singleton-java` or
:ref:`cluster_sharding_java`, especially together with Akka :ref:`persistence-java`.

A pre-packaged solution for the downing problem is provided by
`Split Brain Resolver <http://doc.akka.io/docs/akka/rp-16s01p03/java/split-brain-resolver.html>`_,
which is part of the Lightbend Reactive Platform. If you don’t use RP, you should anyway carefully
read the `documentation <http://doc.akka.io/docs/akka/rp-16s01p03/java/split-brain-resolver.html>`_
of the Split Brain Resolver and make sure that the solution you are using handles the concerns
described there.

.. note:: If you have *auto-down* enabled and the failure detector triggers, you
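
By contrast, the deliberate alternative favored by the warning above is a downing decision
taken by an operator or by tooling such as the Split Brain Resolver. Here is a hedged sketch
of a manual down issued through the ``Cluster`` extension; the object name and address values
are made up for illustration::

  import akka.actor.{ ActorSystem, Address }
  import akka.cluster.Cluster

  object ManualDowning {
    def downCrashedNode(system: ActorSystem): Unit = {
      val cluster = Cluster(system)

      // Address of a member that an operator has confirmed to be permanently gone.
      val crashed = Address("akka.tcp", "ClusterSystem", "10.0.0.7", 2552)

      // Mark the unreachable member as down so the rest of the cluster can move on;
      // the same operation can also be issued via the cluster's JMX interface.
      cluster.down(crashed)
    }
  }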

@@ -427,8 +427,8 @@ If system messages cannot be delivered to a node it will be quarantined and then
cannot come back from ``unreachable``. This can happen if there are too many
unacknowledged system messages (e.g. watch, Terminated, remote actor deployment,
failures of actors supervised by remote parent). Then the node needs to be moved
- to the ``down`` or ``removed`` states and the actor system must be restarted before
- it can join the cluster again.
+ to the ``down`` or ``removed`` states and the actor system of the quarantined node
+ must be restarted before it can join the cluster again.

The nodes in the cluster monitor each other by sending heartbeats to detect if a node is
unreachable from the rest of the cluster. The heartbeat arrival times are interpreted
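
The removed/added pair in the hunk above is the substance of this pull request: after a
quarantine it is the actor system of the quarantined node that must be restarted before the
node can join the cluster again. One way this is commonly wired up is a removal hook that
terminates the old system so that external supervision or bootstrap code can start a fresh
one; the sketch below assumes that policy, it is not prescribed by the diff::

  import akka.actor.ActorSystem
  import akka.cluster.Cluster

  object RestartOnRemoval {
    def installRestartHook(system: ActorSystem): Unit = {
      // Runs once this member has been removed from the cluster, which is what
      // eventually happens to a quarantined node after it has been downed.
      Cluster(system).registerOnMemberRemoved {
        // The old ActorSystem can never rejoin; terminate it and let a process
        // supervisor (or your own bootstrap code) start a fresh system that joins again.
        system.terminate()
      }
    }
  }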

The second file applies the same changes to the Scala version of the documentation:

@@ -142,7 +142,7 @@ status to ``down`` automatically after the configured time of unreachability.

This is a naïve approach to remove unreachable nodes from the cluster membership. It
works great for crashes and short transient network partitions, but not for long network
partitions. Both sides of the network partition will see the other side as unreachable
and after a while remove it from its cluster membership. Since this happens on both
sides the result is that two separate disconnected clusters have been created. This
can also happen because of long GC pauses or system overload.

@@ -150,14 +150,14 @@ can also happen because of long GC pauses or system overload.

.. warning::

We recommend against using the auto-down feature of Akka Cluster in production.
This is crucial for correct behavior if you use :ref:`cluster-singleton-scala` or
:ref:`cluster_sharding_scala`, especially together with Akka :ref:`persistence-scala`.

A pre-packaged solution for the downing problem is provided by
`Split Brain Resolver <http://doc.akka.io/docs/akka/rp-16s01p03/scala/split-brain-resolver.html>`_,
which is part of the Lightbend Reactive Platform. If you don’t use RP, you should anyway carefully
read the `documentation <http://doc.akka.io/docs/akka/rp-16s01p03/scala/split-brain-resolver.html>`_
of the Split Brain Resolver and make sure that the solution you are using handles the concerns
described there.

.. note:: If you have *auto-down* enabled and the failure detector triggers, you

@@ -422,8 +422,8 @@ If system messages cannot be delivered to a node it will be quarantined and then
cannot come back from ``unreachable``. This can happen if there are too many
unacknowledged system messages (e.g. watch, Terminated, remote actor deployment,
failures of actors supervised by remote parent). Then the node needs to be moved
- to the ``down`` or ``removed`` states and the actor system must be restarted before
- it can join the cluster again.
+ to the ``down`` or ``removed`` states and the actor system of the quarantined node
+ must be restarted before it can join the cluster again.

The nodes in the cluster monitor each other by sending heartbeats to detect if a node is
unreachable from the rest of the cluster. The heartbeat arrival times are interpreted
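
The trailing context lines touch on the heartbeat-based failure detection that produces
``unreachable`` in the first place. If unreachability fires too eagerly, for example because
of the GC pauses mentioned earlier, the phi accrual failure detector can be tuned instead of
downing more aggressively; the values below are examples only, not recommendations::

  import com.typesafe.config.ConfigFactory

  object FailureDetectorTuning {
    // Typical defaults in this Akka generation are threshold = 8.0,
    // heartbeat-interval = 1 s and acceptable-heartbeat-pause = 3 s.
    val tuned = ConfigFactory.parseString("""
      akka.cluster.failure-detector {
        # A higher threshold tolerates more variation in heartbeat arrival times.
        threshold = 12.0
        # How long heartbeats may pause before the member is suspected as unreachable.
        acceptable-heartbeat-pause = 5 s
        heartbeat-interval = 1 s
      }
    """)
    // Merge `tuned` with the node's regular configuration when creating the ActorSystem.
  }

Tuning the detector only addresses spurious unreachability; it does not remove the need for
a sound downing strategy, which is the point of the warning hunks above.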