WIP cluster usage - Swap content from Classic pages to Typed #24717 (#27708)

Helena Edelson 2019-09-18 11:01:59 -07:00 committed by GitHub
parent b2b948ff0e
commit edad69b38c
25 changed files with 1095 additions and 994 deletions


@@ -33,3 +33,6 @@ RedirectMatch 301 ^(.*)/howto\.html$ https://doc.akka.io/docs/akka/2.5/howto.html
RedirectMatch 301 ^(.*)/common/duration\.html$ https://doc.akka.io/docs/akka/2.5/common/duration.html
RedirectMatch 301 ^(.*)/futures\.html$ https://doc.akka.io/docs/akka/2.5/futures.html
RedirectMatch 301 ^(.*)/java8-compat\.html$ https://doc.akka.io/docs/akka/2.5/java8-compat.html
RedirectMatch 301 ^(.*)/common/cluster\.html$ $1/typed/cluster-specification.html
RedirectMatch 301 ^(.*)/additional/observability\.html$ $1/additional/operations.html


@@ -5,11 +5,8 @@
@@@ index
* [Packaging](packaging.md)
* [Operating, Managing, Observability](operations.md)
* [Deploying](deploying.md)
* [Rolling Updates](rolling-updates.md)
* [Observability](observability.md)
@@@
-Information on running and managing Akka applications can be found in
+@@@
the [Akka Management](https://doc.akka.io/docs/akka-management/current/) project documentation.


@@ -1,3 +0,0 @@
# Monitoring
Aside from log monitoring and the monitoring provided by your APM or platform provider, [Lightbend Telemetry](https://developer.lightbend.com/docs/telemetry/current/instrumentations/akka/akka.html), available through a [Lightbend Platform Subscription](https://www.lightbend.com/lightbend-platform-subscription), can provide additional insights in the run-time characteristics of your application, including metrics, events, and distributed tracing for Akka Actors, Cluster, HTTP, and more.


@@ -0,0 +1,62 @@
# Operating a Cluster
This documentation discusses how to operate an Akka cluster. For related, more specific guides,
see [Packaging](packaging.md), [Deploying](deploying.md), and [Rolling Updates](rolling-updates.md).
## Starting
### Cluster Bootstrap
When starting clusters on cloud systems such as Kubernetes, AWS, Google Cloud, Azure, Mesos, or others,
you may want to automate the discovery of nodes for the cluster joining process, using your cloud provider,
cluster orchestrator, or another form of service discovery (such as managed DNS).
The open source Akka Management library includes the [Cluster Bootstrap](https://doc.akka.io/docs/akka-management/current/bootstrap/index.html)
module which handles just that. Please refer to its documentation for more details.
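A rough, non-authoritative sketch of what such an automated start-up can look like in code, assuming the `akka-management` and `akka-management-cluster-bootstrap` modules are on the classpath and a contact-point discovery method is configured in `application.conf`:

```scala
import akka.actor.ActorSystem
import akka.management.scaladsl.AkkaManagement
import akka.management.cluster.bootstrap.ClusterBootstrap

// Sketch only: package names follow Akka Management 1.x and should be
// verified against the Cluster Bootstrap documentation linked above.
object BootstrappedNode {
  def main(args: Array[String]): Unit = {
    val system = ActorSystem("ClusterSystem")
    AkkaManagement(system).start()   // expose the management HTTP endpoint
    ClusterBootstrap(system).start() // discover contact points and join the cluster
  }
}
```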
@@@ note
If you are running Akka in a Docker container, or the nodes for some other reason have separate internal and
external IP addresses, you must configure remoting according to @ref:[Akka behind NAT or in a Docker container](../remoting-artery.md#remote-configuration-nat-artery).
@@@
## Stopping
See @ref:[Rolling Updates, Cluster Shutdown and Coordinated Shutdown](../additional/rolling-updates.md#cluster-shutdown).
## Cluster Management
There are several management tools for the cluster.
Complete information on running and managing Akka applications can be found in
the [Akka Management](https://doc.akka.io/docs/akka-management/current/) project documentation.
<a id="cluster-http"></a>
### HTTP
Information about and management of the cluster are available through an HTTP API.
See the documentation of [Akka Management](http://developer.lightbend.com/docs/akka-management/current/).
<a id="cluster-jmx"></a>
### JMX
Information about and management of the cluster are available as JMX MBeans with the root name `akka.Cluster`.
The JMX information can be displayed with an ordinary JMX console such as JConsole or JVisualVM.
From JMX you can:
* see which members are part of the cluster
* see the status of this node
* see the roles of each member
* join this node to another node in the cluster
* mark any node in the cluster as down
* tell any node in the cluster to leave
Member nodes are identified by their address, in the format *`akka://actor-system-name@hostname:port`*.
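The same information can also be read programmatically through the platform MBean server. The `akka:type=Cluster` object name and the `ClusterStatus` attribute in the sketch below are assumptions to verify against your Akka version, for example in JConsole:

```scala
import java.lang.management.ManagementFactory
import javax.management.ObjectName

// Sketch only: reads the cluster MBean from inside the same JVM; the object and
// attribute names are assumptions, check them in a JMX console first.
object ClusterJmxPeek {
  def main(args: Array[String]): Unit = {
    val server = ManagementFactory.getPlatformMBeanServer
    val name   = new ObjectName("akka:type=Cluster")
    println(server.getAttribute(name, "ClusterStatus")) // members, status, roles
  }
}
```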
## Monitoring and Observability
Aside from log monitoring and the monitoring provided by your APM or platform provider, [Lightbend Telemetry](https://developer.lightbend.com/docs/telemetry/current/instrumentations/akka/akka.html),
available through a [Lightbend Platform Subscription](https://www.lightbend.com/lightbend-platform-subscription),
can provide additional insights into the run-time characteristics of your application, including metrics, events,
and distributed tracing for Akka Actors, Cluster, HTTP, and more.


@@ -86,7 +86,7 @@ in an application composed of multiple JARs to reside under a single package name
might scan all classes from `com.example.plugins` for specific service implementations with that package existing in
several contributed JARs.
While it is possible to support overlapping packages with complex manifest headers, it's much better to use non-overlapping
-package spaces and facilities such as @ref:[Akka Cluster](../common/cluster.md)
+package spaces and facilities such as @ref:[Akka Cluster](../typed/cluster-specification.md)
for service discovery. Stylistically, many organizations opt to use the root package path as the name of the bundle
distribution file.


@@ -63,7 +63,7 @@ Environments such as Kubernetes send a SIGTERM, however if the JVM is wrapped wi
### Ungraceful shutdown
In case of network failures it may still be necessary to set the node's status to Down in order to complete the removal.
-@ref:[Cluster Downing](../cluster-usage.md#downing) details downing nodes and downing providers.
+@ref:[Cluster Downing](../typed/cluster.md#downing) details downing nodes and downing providers.
[Split Brain Resolver](https://doc.akka.io/docs/akka-enhancements/current/split-brain-resolver.html) can be used to ensure
the cluster continues to function during network partitions and node failures. For example
if there is an unreachability problem Split Brain Resolver would make a decision based on the configured downing strategy.
@@ -79,7 +79,7 @@ and ensure all nodes are in this state
* Deploy again and with the new nodes set to `akka.cluster.configuration-compatibility-check.enforce-on-join = on`.
Full documentation about enforcing these checks on joining nodes and optionally adding custom checks can be found in
-@ref:[Akka Cluster configuration compatibility checks](../cluster-usage.md#configuration-compatibility-check).
+@ref:[Akka Cluster configuration compatibility checks](../typed/cluster.md#configuration-compatibility-check).
## Rolling Updates and Migrating Akka


@@ -76,8 +76,8 @@ up a large cluster into smaller groups of nodes for better scalability.
## Membership
-Some @ref[membership transitions](common/cluster.md#membership-lifecycle) are managed by
+Some @ref[membership transitions](typed/cluster-membership.md#membership-lifecycle) are managed by
-one node called the @ref[leader](common/cluster.md#leader). There is one leader per data center
+one node called the @ref[leader](typed/cluster-specification.md#leader). There is one leader per data center
and it is responsible for these transitions for the members within the same data center. Members of
other data centers are managed independently by the leader of the respective data center. These actions
cannot be performed while there are any unreachability observations among the nodes in the data center,
@@ -91,7 +91,7 @@ User actions like joining, leaving, and downing can be sent to any node in the c
not only to the nodes in the data center of the node. Seed nodes are also global.
The data center membership is implemented by adding the data center name prefixed with `"dc-"` to the
-@ref[roles](cluster-usage.md#node-roles) of the member and thereby this information is known
+@ref[roles](typed/cluster.md#node-roles) of the member and thereby this information is known
by all other members in the cluster. This is an implementation detail, but it can be good to know
if you see this in log messages.
@@ -105,7 +105,7 @@ Java
## Failure Detection
-@ref[Failure detection](cluster-usage.md#failure-detector) is performed by sending heartbeat messages
+@ref[Failure detection](typed/cluster-specification.md#failure-detector) is performed by sending heartbeat messages
to detect if a node is unreachable. This is done more frequently and with more certainty among
the nodes in the same data center than across data centers. The failure detection across different data centers should
be interpreted as an indication of problem with the network link between the data centers.
@@ -132,6 +132,8 @@ This influences how rolling upgrades should be performed. Don't stop all of the
at the same time. Stop one or a few at a time so that new nodes can take over the responsibility.
It's best to leave the oldest nodes until last.
See the @ref:[failure detector](typed/cluster.md#failure-detector) for more details.
## Cluster Singleton
The @ref[Cluster Singleton](cluster-singleton.md) is a singleton per data center. If you start the


@@ -27,7 +27,7 @@ Cluster metrics information is primarily used for load-balancing routers,
and can also be used to implement advanced metrics-based node life cycles,
such as "Node Let-it-crash" when CPU steal time becomes excessive.
-Cluster members with status @ref:[WeaklyUp](cluster-usage.md#weakly-up), if that feature is enabled,
+Cluster members with status @ref:[WeaklyUp](typed/cluster-membership.md#weaklyup-members), if that feature is enabled,
will participate in Cluster Metrics collection and dissemination.
## Metrics Collector


@@ -41,7 +41,7 @@ the sender to know the location of the destination actor. This is achieved by se
the messages via a `ShardRegion` actor provided by this extension, which knows how
to route the message with the entity id to the final destination.
-Cluster sharding will not be active on members with status @ref:[WeaklyUp](cluster-usage.md#weakly-up)
+Cluster sharding will not be active on members with status @ref:[WeaklyUp](typed/cluster-membership.md#weaklyup-members)
if that feature is enabled.
@@@ warning
@@ -49,7 +49,7 @@ if that feature is enabled.
**Don't use Cluster Sharding together with Automatic Downing**,
since it allows the cluster to split up into two separate clusters, which in turn will result
in *multiple shards and entities* being started, one in each separate cluster!
-See @ref:[Downing](cluster-usage.md#automatic-vs-manual-downing).
+See @ref:[Downing](typed/cluster.md#automatic-vs-manual-downing).
@@@
@@ -492,7 +492,7 @@ and there was a network partition.
**Don't use Cluster Sharding together with Automatic Downing**,
since it allows the cluster to split up into two separate clusters, which in turn will result
in *multiple shards and entities* being started, one in each separate cluster!
-See @ref:[Downing](cluster-usage.md#automatic-vs-manual-downing).
+See @ref:[Downing](typed/cluster.md#automatic-vs-manual-downing).
@@@


@@ -44,7 +44,7 @@ The oldest member is determined by `akka.cluster.Member#isOlderThan`.
This can change when removing that member from the cluster. Be aware that there is a short time
period when there is no active singleton during the hand-over process.
-The cluster failure detector will notice when oldest node becomes unreachable due to
+The cluster @ref:[failure detector](typed/cluster.md#failure-detector) will notice when oldest node becomes unreachable due to
things like JVM crash, hard shut down, or network failure. Then a new oldest node will
take over and a new singleton actor is created. For these failure scenarios there will
not be a graceful hand-over, but more than one active singletons is prevented by all
@@ -65,7 +65,7 @@ It's worth noting that messages can always be lost because of the distributed na
As always, additional logic should be implemented in the singleton (acknowledgement) and in the
client (retry) actors to ensure at-least-once message delivery.
-The singleton instance will not run on members with status @ref:[WeaklyUp](cluster-usage.md#weakly-up).
+The singleton instance will not run on members with status @ref:[WeaklyUp](typed/cluster-membership.md#weaklyup-members).
## Potential problems to be aware of
@@ -75,7 +75,7 @@ This pattern may seem to be very tempting to use at first, but it has several dr
* you can not rely on the cluster singleton to be *non-stop* available — e.g. when the node on which the singleton has
been running dies, it will take a few seconds for this to be noticed and the singleton be migrated to another node,
* in the case of a *network partition* appearing in a Cluster that is using Automatic Downing (see docs for
-@ref:[Auto Downing](cluster-usage.md#automatic-vs-manual-downing)),
+@ref:[Auto Downing](typed/cluster.md#automatic-vs-manual-downing)),
it may happen that the isolated clusters each decide to spin up their own singleton, meaning that there might be multiple
singletons running in the system, yet the Clusters have no way of finding out about them (because of the partition).


@@ -1,18 +1,22 @@
-# Cluster Usage
+# Classic Cluster Usage
This document describes how to use Akka Cluster and the Cluster APIs using code samples.
For specific documentation topics see:
-For introduction to the Akka Cluster concepts please see @ref:[Cluster Specification](common/cluster.md).
+* @ref:[Cluster Specification](typed/cluster-specification.md)
* @ref:[Cluster Membership Service](typed/cluster-membership.md)
-The core of Akka Cluster is the cluster membership, to keep track of what nodes are part of the cluster and
+* @ref:[When and where to use Akka Cluster](typed/choosing-cluster.md)
-their health. There are several @ref:[Higher level Cluster tools](cluster-usage.md#higher-level-cluster-tools) that are built
+* @ref:[Higher level Cluster tools](#higher-level-cluster-tools)
-on top of the cluster membership.
+* @ref:[Rolling Updates](additional/rolling-updates.md)
* @ref:[Operating, Managing, Observability](additional/operations.md)
You need to enable @ref:[serialization](serialization.md) for your actor messages.
-@ref:[Serialization with Jackson](serialization-jackson.md) is a good choice in many cases and our
+Enable @ref:[serialization](serialization.md) to send events between ActorSystems and systems
-recommendation if you don't have other preference.
+external to the Cluster. @ref:[Serialization with Jackson](serialization-jackson.md) is a good choice in many cases, and our
recommendation if you don't have other preferences or constraints.
## Dependency
-To use Akka Cluster, you must add the following dependency in your project:
+To use Akka Cluster add the following dependency in your project:
@@dependency[sbt,Maven,Gradle] {
group="com.typesafe.akka"
@@ -20,135 +24,28 @@ To use Akka Cluster, you must add the following dependency in your project:
version="$akka.version$"
}
-## Sample project
+## Cluster samples
-You can look at the
+To see what using Akka Cluster looks like in practice, see the
@java[@extref[Cluster example project](samples:akka-samples-cluster-java)]
-@scala[@extref[Cluster example project](samples:akka-samples-cluster-scala)]
+@scala[@extref[Cluster example project](samples:akka-samples-cluster-scala)].
-to see what this looks like in practice.
+This project contains samples illustrating different features, such as
subscribing to cluster membership events, sending messages to actors running on nodes in the cluster,
and using Cluster aware routers.
The easiest way to run this example yourself is to try the
@scala[@extref[Akka Cluster Sample with Scala](samples:akka-samples-cluster-scala)]@java[@extref[Akka Cluster Sample with Java](samples:akka-samples-cluster-java)].
It contains instructions on how to run the `SimpleClusterApp`.
## When and where to use Akka Cluster
See [Choosing Akka Cluster](typed/choosing-cluster.md#when-and-where-to-use-akka-cluster).
-An architectural choice you have to make is if you are going to use a microservices architecture or
+## Cluster API Extension
a traditional distributed application. This choice will influence how you should use Akka Cluster.
### Microservices
Microservices has many attractive properties, such as the independent nature of microservices allows for
multiple smaller and more focused teams that can deliver new functionality more frequently and can
respond quicker to business opportunities. Reactive Microservices should be isolated, autonomous, and have
a single responsibility as identified by Jonas Bonér in the book
[Reactive Microsystems: The Evolution of Microservices at Scale](https://info.lightbend.com/ebook-reactive-microservices-the-evolution-of-microservices-at-scale-register.html).
In a microservices architecture, you should consider communication within a service and between services.
In general we recommend against using Akka Cluster and actor messaging between _different_ services because that
would result in a too tight code coupling between the services and difficulties deploying these independent of
each other, which is one of the main reasons for using a microservices architecture.
See the discussion on
@scala[[Internal and External Communication](https://www.lagomframework.com/documentation/current/scala/InternalAndExternalCommunication.html)]
@java[[Internal and External Communication](https://www.lagomframework.com/documentation/current/java/InternalAndExternalCommunication.html)]
in the docs of the [Lagom Framework](https://www.lagomframework.com) (where each microservice is an Akka Cluster)
for some background on this.
Nodes of a single service (collectively called a cluster) require less decoupling. They share the same code and
are deployed together, as a set, by a single team or individual. There might be two versions running concurrently
during a rolling deployment, but deployment of the entire set has a single point of control. For this reason,
intra-service communication can take advantage of Akka Cluster, failure management and actor messaging, which
is convenient to use and has great performance.
Between different services [Akka HTTP](https://doc.akka.io/docs/akka-http/current) or
[Akka gRPC](https://doc.akka.io/docs/akka-grpc/current/) can be used for synchronous (yet non-blocking)
communication and [Akka Streams Kafka](https://doc.akka.io/docs/akka-stream-kafka/current/home.html) or other
[Alpakka](https://doc.akka.io/docs/alpakka/current/) connectors for integration asynchronous communication.
All those communication mechanisms work well with streaming of messages with end-to-end back-pressure, and the
synchronous communication tools can also be used for single request response interactions. It is also important
to note that when using these tools both sides of the communication do not have to be implemented with Akka,
nor does the programming language matter.
### Traditional distributed application
We acknowledge that microservices also introduce many new challenges and it's not the only way to
build applications. A traditional distributed application may have less complexity and work well in many cases.
For example for a small startup, with a single team, building an application where time to market is everything.
Akka Cluster can efficiently be used for building such distributed application.
In this case, you have a single deployment unit, built from a single code base (or using traditional binary
dependency management to modularize) but deployed across many nodes using a single cluster.
Tighter coupling is OK, because there is a central point of deployment and control. In some cases, nodes may
have specialized runtime roles which means that the cluster is not totally homogenous (e.g., "front-end" and
"back-end" nodes, or dedicated master/worker nodes) but if these are run from the same built artifacts this
is just a runtime behavior and doesn't cause the same kind of problems you might get from tight coupling of
totally separate artifacts.
A tightly coupled distributed application has served the industry and many Akka users well for years and is
still a valid choice.
### Distributed monolith
There is also an anti-pattern that is sometimes called "distributed monolith". You have multiple services
that are built and deployed independently from each other, but they have a tight coupling that makes this
very risky, such as a shared cluster, shared code and dependencies for service API calls, or a shared
database schema. There is a false sense of autonomy because of the physical separation of the code and
deployment units, but you are likely to encounter problems because of changes in the implementation of
one service leaking into the behavior of others. See Ben Christensen's
[Dont Build a Distributed Monolith](https://www.microservices.com/talks/dont-build-a-distributed-monolith/).
Organizations that find themselves in this situation often react by trying to centrally coordinate deployment
of multiple services, at which point you have lost the principal benefit of microservices while taking on
the costs. You are in a halfway state with things that aren't really separable being built and deployed
in a separate way. Some people do this, and some manage to make it work, but it's not something we would
recommend and it needs to be carefully managed.
## A Simple Cluster Example
The following configuration enables the `Cluster` extension to be used.
It joins the cluster and an actor subscribes to cluster membership events and logs them.
The `application.conf` configuration looks like this:
```
akka {
actor {
provider = "cluster"
}
remote.artery {
canonical {
hostname = "127.0.0.1"
port = 2551
}
}
cluster {
seed-nodes = [
"akka://ClusterSystem@127.0.0.1:2551",
"akka://ClusterSystem@127.0.0.1:2552"]
# auto downing is NOT safe for production deployments.
# you may want to use it during development, read more about it in the docs.
#
# auto-down-unreachable-after = 10s
}
}
```
To enable cluster capabilities in your Akka project you should, at a minimum, add the @ref:[Remoting](remoting-artery.md)
settings, but with `cluster`.
The `akka.cluster.seed-nodes` should normally also be added to your `application.conf` file.
@@@ note
If you are running Akka in a Docker container or the nodes for some other reason have separate internal and
external ip addresses you must configure remoting according to @ref:[Akka behind NAT or in a Docker container](remoting-artery.md#remote-configuration-nat-artery)
@@@
The seed nodes are configured contact points for initial, automatic, join of the cluster.
Note that if you are going to start the nodes on different machines you need to specify the
ip-addresses or host names of the machines in `application.conf` instead of `127.0.0.1`
An actor that uses the cluster extension may look like this:
Scala
@@ -157,80 +54,44 @@ Scala
Java
: @@snip [SimpleClusterListener.java](/akka-docs/src/test/java/jdocs/cluster/SimpleClusterListener.java) { type=java }
And the minimum configuration required is to set a host/port for remoting and the `akka.actor.provider = "cluster"`.
@@snip [BasicClusterExampleSpec.scala](/akka-cluster-typed/src/test/scala/docs/akka/cluster/typed/BasicClusterExampleSpec.scala) { #config-seeds }
The actor registers itself as subscriber of certain cluster events. It receives events corresponding to the current state
of the cluster when the subscription starts and then it receives events for changes that happen in the cluster.
-The easiest way to run this example yourself is to try the
+## Cluster Membership API
@scala[@extref[Akka Cluster Sample with Scala](samples:akka-samples-cluster-scala)]@java[@extref[Akka Cluster Sample with Java](samples:akka-samples-cluster-java)].
It contains instructions on how to run the `SimpleClusterApp`.
-## Joining to Seed Nodes
+This section shows the basic usage of the membership API. For in-depth descriptions of joining, joining to seed nodes, downing, and leaving of any node in the cluster, please see the full
@ref:[Cluster Membership API](typed/cluster.md#cluster-membership-api) documentation.
-@@@ note
+If not using configuration to specify [seed nodes to join](#joining-to-seed-nodes), joining the cluster can be done programmatically:
When starting clusters on cloud systems such as Kubernetes, AWS, Google Cloud, Azure, Mesos or others which maintain
DNS or other ways of discovering nodes, you may want to use the automatic joining process implemented by the open source
[Akka Cluster Bootstrap](https://doc.akka.io/docs/akka-management/current/bootstrap/index.html) module.
@@@
-### Joining configured seed nodes
+Scala
: @@snip [SimpleClusterListener2.scala](/akka-docs/src/test/scala/docs/cluster/SimpleClusterListener2.scala) { #join }
-You may decide if joining to the cluster should be done manually or automatically
+Java
-to configured initial contact points, so-called seed nodes. After the joining process
+: @@snip [SimpleClusterListener2.java](/akka-docs/src/test/java/jdocs/cluster/SimpleClusterListener2.java) { #join }
the seed nodes are not special and they participate in the cluster in exactly the same
way as other nodes.
When a new node is started it sends a message to all seed nodes and then sends join command to the one that
answers first. If no one of the seed nodes replied (might not be started yet)
it retries this procedure until successful or shutdown.
-You define the seed nodes in the [configuration](#cluster-configuration) file (application.conf):
+Leaving the cluster and downing a node are similar, for example:
Scala
: @@snip [ClusterDocSpec.scala](/akka-docs/src/test/scala/docs/cluster/ClusterDocSpec.scala) { #leave }
-```
+Java
-akka.cluster.seed-nodes = [
+: @@snip [ClusterDocTest.java](/akka-docs/src/test/java/jdocs/cluster/ClusterDocTest.java) { #leave }
"akka://ClusterSystem@host1:2552",
"akka://ClusterSystem@host2:2552"]
-```
+### Joining to Seed Nodes
-This can also be defined as Java system properties when starting the JVM using the following syntax:
+Joining to initial cluster contact points `akka.cluster.seed-nodes` can be done manually, with @ref:[configuration](typed/cluster.md#joining-configured-seed-nodes),
[programatically](#programatically-joining-to-seed-nodes-with-joinseednodes), or @ref:[automatically with Cluster Bootstrap](typed/cluster.md#joining-automatically-to-seed-nodes-with-cluster-bootstrap).
#### Programatically joining to seed nodes with `joinSeedNodes`
-```
+@@include[cluster.md](includes/cluster.md) { #join-seeds-programmatic }
-Dakka.cluster.seed-nodes.0=akka://ClusterSystem@host1:2552
-Dakka.cluster.seed-nodes.1=akka://ClusterSystem@host2:2552
```
The seed nodes can be started in any order and it is not necessary to have all
seed nodes running, but the node configured as the **first element** in the `seed-nodes`
configuration list must be started when initially starting a cluster, otherwise the
other seed-nodes will not become initialized and no other node can join the cluster.
The reason for the special first seed node is to avoid forming separated islands when
starting from an empty cluster.
It is quickest to start all configured seed nodes at the same time (order doesn't matter),
otherwise it can take up to the configured `seed-node-timeout` until the nodes
can join.
Once more than two seed nodes have been started it is no problem to shut down the first
seed node. If the first seed node is restarted, it will first try to join the other
seed nodes in the existing cluster. Note that if you stop all seed nodes at the same time
and restart them with the same `seed-nodes` configuration they will join themselves and
form a new cluster instead of joining remaining nodes of the existing cluster. That is
likely not desired and should be avoided by listing several nodes as seed nodes for redundancy
and don't stop all of them at the same time.
### Automatically joining to seed nodes with Cluster Bootstrap
Instead of manually configuring seed nodes, which is useful in development or statically assigned node IPs, you may want
to automate the discovery of seed nodes using your cloud providers or cluster orchestrator, or some other form of service
discovery (such as managed DNS). The open source Akka Management library includes the
[Cluster Bootstrap](https://doc.akka.io/docs/akka-management/current/bootstrap/index.html) module which handles
just that. Please refer to its documentation for more details.
### Programatically joining to seed nodes with `joinSeedNodes`
You may also use @scala[`Cluster(system).joinSeedNodes`]@java[`Cluster.get(system).joinSeedNodes`] to join programmatically,
which is attractive when dynamically discovering other nodes at startup by using some external tool or API.
When using `joinSeedNodes` you should not include the node itself except for the node that is
supposed to be the first seed node, and that should be placed first in the parameter to
`joinSeedNodes`.
Scala
: @@snip [ClusterDocSpec.scala](/akka-docs/src/test/scala/docs/cluster/ClusterDocSpec.scala) { #join-seed-nodes }
@@ -238,167 +99,7 @@ Scala
Java
: @@snip [ClusterDocTest.java](/akka-docs/src/test/java/jdocs/cluster/ClusterDocTest.java) { #join-seed-nodes-imports #join-seed-nodes }
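As a hedged illustration of what the programmatic join referenced by the snippets above can look like (the host names and port are placeholders):

```scala
import akka.actor.{ ActorSystem, Address }
import akka.cluster.Cluster

// Sketch only: the seed addresses would normally come from an external tool or API.
object ProgrammaticJoin {
  def main(args: Array[String]): Unit = {
    val system = ActorSystem("ClusterSystem")
    val seedAddresses = List(
      Address("akka", "ClusterSystem", "host1", 2552),
      Address("akka", "ClusterSystem", "host2", 2552))
    Cluster(system).joinSeedNodes(seedAddresses)
  }
}
```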
-Unsuccessful attempts to contact seed nodes are automatically retried after the time period defined in
+For more information see @ref[tuning joins](typed/cluster.md#tuning-joins)
configuration property `seed-node-timeout`. Unsuccessful attempt to join a specific seed node is
automatically retried after the configured `retry-unsuccessful-join-after`. Retrying means that it
tries to contact all seed nodes and then joins the node that answers first. The first node in the list
of seed nodes will join itself if it cannot contact any of the other seed nodes within the
configured `seed-node-timeout`.
The joining of given seed nodes will by default be retried indefinitely until
a successful join. That process can be aborted if unsuccessful by configuring a
timeout. When aborted it will run @ref:[Coordinated Shutdown](actors.md#coordinated-shutdown),
which by default will terminate the ActorSystem. CoordinatedShutdown can also be configured to exit
the JVM. It is useful to define this timeout if the `seed-nodes` are assembled
dynamically and a restart with new seed-nodes should be tried after unsuccessful
attempts.
```
akka.cluster.shutdown-after-unsuccessful-join-seed-nodes = 20s
akka.coordinated-shutdown.terminate-actor-system = on
```
If you don't configure seed nodes or use `joinSeedNodes` you need to join the cluster manually, which can be performed by using [JMX](#cluster-jmx) or [HTTP](#cluster-http).
You can join to any node in the cluster. It does not have to be configured as a seed node.
Note that you can only join to an existing cluster member, which means that for bootstrapping some
node must join itself,and then the following nodes could join them to make up a cluster.
An actor system can only join a cluster once. Additional attempts will be ignored.
When it has successfully joined it must be restarted to be able to join another
cluster or to join the same cluster again. It can use the same host name and port
after the restart, when it come up as new incarnation of existing member in the cluster,
trying to join in, then the existing one will be removed from the cluster and then it will
be allowed to join.
@@@ note
The name of the `ActorSystem` must be the same for all members of a cluster. The name is given when you start the `ActorSystem`.
@@@
<a id="automatic-vs-manual-downing"></a>
## Downing
When a member is considered by the failure detector to be unreachable the
leader is not allowed to perform its duties, such as changing status of
new joining members to 'Up'. The node must first become reachable again, or the
status of the unreachable member must be changed to 'Down'. Changing status to 'Down'
can be performed automatically or manually. By default it must be done manually, using
[JMX](#cluster-jmx) or [HTTP](#cluster-http).
It can also be performed programmatically with @scala[`Cluster(system).down(address)`]@java[`Cluster.get(system).down(address)`].
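A minimal sketch of that programmatic call (the address of the unreachable member is a placeholder):

```scala
import akka.actor.{ ActorSystem, Address }
import akka.cluster.Cluster

// Sketch only: marks the given member as Down from any node still in the cluster.
object ManualDowning {
  def downUnreachable(system: ActorSystem): Unit = {
    val unreachable = Address("akka", "ClusterSystem", "host3", 2552)
    Cluster(system).down(unreachable)
  }
}
```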
If a node is still running and sees its self as Down it will shutdown. @ref:[Coordinated Shutdown](actors.md#coordinated-shutdown) will automatically
run if `run-coordinated-shutdown-when-down` is set to `on` (the default) however the node will not try
and leave the cluster gracefully so sharding and singleton migration will not occur.
A pre-packaged solution for the downing problem is provided by
[Split Brain Resolver](http://developer.lightbend.com/docs/akka-commercial-addons/current/split-brain-resolver.html),
which is part of the [Lightbend Reactive Platform](http://www.lightbend.com/platform).
If you dont use RP, you should anyway carefully read the [documentation](http://developer.lightbend.com/docs/akka-commercial-addons/current/split-brain-resolver.html)
of the Split Brain Resolver and make sure that the solution you are using handles the concerns
described there.
### Auto-downing (DO NOT USE)
There is an automatic downing feature that you should not use in production. For testing purpose you can enable it with configuration:
```
akka.cluster.auto-down-unreachable-after = 120s
```
This means that the cluster leader member will change the `unreachable` node
status to `down` automatically after the configured time of unreachability.
This is a naïve approach to remove unreachable nodes from the cluster membership.
It can be useful during development but in a production environment it will eventually breakdown the cluster. When a network partition occurs, both sides of the
partition will see the other side as unreachable and remove it from the cluster.
This results in the formation of two separate, disconnected, clusters
(known as *Split Brain*).
This behaviour is not limited to network partitions. It can also occur if a node
in the cluster is overloaded, or experiences a long GC pause.
@@@ warning
We recommend against using the auto-down feature of Akka Cluster in production. It
has multiple undesirable consequences for production systems.
If you are using @ref:[Cluster Singleton](cluster-singleton.md) or
@ref:[Cluster Sharding](cluster-sharding.md) it can break the contract provided by
those features. Both provide a guarantee that an actor will be unique in a cluster.
With the auto-down feature enabled, it is possible for multiple independent clusters
to form (*Split Brain*). When this happens the guaranteed uniqueness will no
longer be true resulting in undesirable behaviour in the system.
This is even more severe when @ref:[Akka Persistence](persistence.md) is used in
conjunction with Cluster Sharding. In this case, the lack of unique actors can
cause multiple actors to write to the same journal. Akka Persistence operates on a
single writer principle. Having multiple writers will corrupt the journal
and make it unusable.
Finally, even if you don't use features such as Persistence, Sharding, or Singletons,
auto-downing can lead the system to form multiple small clusters. These small
clusters will be independent from each other. They will be unable to communicate
and as a result you may experience performance degradation. Once this condition
occurs, it will require manual intervention in order to reform the cluster.
Because of these issues, auto-downing should **never** be used in a production environment.
@@@
## Leaving
There are two ways to remove a member from the cluster.
You can stop the actor system (or the JVM process). It will be detected
as unreachable and removed after the automatic or manual downing as described
above.
A more graceful exit can be performed if you tell the cluster that a node shall leave.
This can be performed using [JMX](#cluster-jmx) or [HTTP](#cluster-http).
It can also be performed programmatically with:
Scala
: @@snip [ClusterDocSpec.scala](/akka-docs/src/test/scala/docs/cluster/ClusterDocSpec.scala) { #leave }
Java
: @@snip [ClusterDocTest.java](/akka-docs/src/test/java/jdocs/cluster/ClusterDocTest.java) { #leave }
Note that this command can be issued to any member in the cluster, not necessarily the
one that is leaving.
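For reference, a minimal sketch of a node asking to leave gracefully using its own address (any member's address can be passed instead):

```scala
import akka.actor.ActorSystem
import akka.cluster.Cluster

// Sketch only: triggers the graceful leaving process described above.
object GracefulLeave {
  def leaveSelf(system: ActorSystem): Unit = {
    val cluster = Cluster(system)
    cluster.leave(cluster.selfAddress)
  }
}
```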
The @ref:[Coordinated Shutdown](actors.md#coordinated-shutdown) will automatically run when the cluster node sees itself as
`Exiting`, i.e. leaving from another node will trigger the shutdown process on the leaving node.
Tasks for graceful leaving of cluster including graceful shutdown of Cluster Singletons and
Cluster Sharding are added automatically when Akka Cluster is used, i.e. running the shutdown
process will also trigger the graceful leaving if it's not already in progress.
Normally this is handled automatically, but in case of network failures during this process it might still
be necessary to set the nodes status to `Down` in order to complete the removal.
<a id="weakly-up"></a>
## WeaklyUp Members
If a node is `unreachable` then gossip convergence is not possible and therefore any
`leader` actions are also not possible. However, we still might want new nodes to join
the cluster in this scenario.
`Joining` members will be promoted to `WeaklyUp` and become part of the cluster if
convergence can't be reached. Once gossip convergence is reached, the leader will move `WeaklyUp`
members to `Up`.
This feature is enabled by default, but it can be disabled with configuration option:
```
akka.cluster.allow-weakly-up-members = off
```
You can subscribe to the `WeaklyUp` membership event to make use of the members that are
in this state, but you should be aware of that members on the other side of a network partition
have no knowledge about the existence of the new members. You should for example not count
`WeaklyUp` members in quorum decisions.
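An illustrative sketch of subscribing to that event from a classic actor (the actor itself and its log messages are hypothetical):

```scala
import akka.actor.{ Actor, ActorLogging }
import akka.cluster.Cluster
import akka.cluster.ClusterEvent.{ CurrentClusterState, MemberWeaklyUp }

// Sketch only: reacts to members being promoted to WeaklyUp, e.g. to hand them
// read-only work while keeping them out of quorum decisions.
class WeaklyUpListener extends Actor with ActorLogging {
  private val cluster = Cluster(context.system)

  override def preStart(): Unit = cluster.subscribe(self, classOf[MemberWeaklyUp])
  override def postStop(): Unit = cluster.unsubscribe(self)

  def receive: Receive = {
    case state: CurrentClusterState =>
      log.info("Initial snapshot with {} members", state.members.size)
    case MemberWeaklyUp(member) =>
      log.info("Member is WeaklyUp: {}", member.address)
  }
}
```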
<a id="cluster-subscriber"></a>
## Subscribe to Cluster Events
@@ -451,26 +152,6 @@ Scala
Java
: @@snip [SimpleClusterListener.java](/akka-docs/src/test/java/jdocs/cluster/SimpleClusterListener.java) { #subscribe }
The events to track the life-cycle of members are:
* `ClusterEvent.MemberJoined` - A new member has joined the cluster and its status has been changed to `Joining`
* `ClusterEvent.MemberUp` - A new member has joined the cluster and its status has been changed to `Up`
* `ClusterEvent.MemberExited` - A member is leaving the cluster and its status has been changed to `Exiting`
Note that the node might already have been shutdown when this event is published on another node.
* `ClusterEvent.MemberRemoved` - Member completely removed from the cluster.
* `ClusterEvent.UnreachableMember` - A member is considered as unreachable, detected by the failure detector
of at least one other node.
* `ClusterEvent.ReachableMember` - A member is considered as reachable again, after having been unreachable.
All nodes that previously detected it as unreachable has detected it as reachable again.
There are more types of change events, consult the API documentation
of classes that extends `akka.cluster.ClusterEvent.ClusterDomainEvent`
for details about the events.
Instead of subscribing to cluster events it can sometimes be convenient to only get the full membership state with
@scala[`Cluster(system).state`]@java[`Cluster.get(system).state()`]. Note that this state is not necessarily in sync with the events published to a
cluster subscription.
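A small sketch of reading that snapshot (filtering on `Up` members is just an example):

```scala
import akka.actor.ActorSystem
import akka.cluster.{ Cluster, MemberStatus }

// Sketch only: one-off inspection of the current membership state.
object MembershipSnapshot {
  def printUpMembers(system: ActorSystem): Unit = {
    val state = Cluster(system).state
    state.members.filter(_.status == MemberStatus.Up).foreach { m =>
      println(s"${m.address} roles=${m.roles}")
    }
    println(s"unreachable: ${state.unreachable.map(_.address)}")
  }
}
```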
### Worker Dial-in Example
Let's take a look at an example that illustrates how workers, here named *backend*,
@@ -520,17 +201,10 @@ unreachable cluster node has been downed and removed.
The easiest way to run **Worker Dial-in Example** example yourself is to try the
@scala[@extref[Akka Cluster Sample with Scala](samples:akka-samples-cluster-scala)]@java[@extref[Akka Cluster Sample with Java](samples:akka-samples-cluster-java)].
It contains instructions on how to run the **Worker Dial-in Example** sample.
## Node Roles
-Not all nodes of a cluster need to perform the same function: there might be one sub-set which runs the web front-end,
+See @ref:[Cluster Node Roles](typed/cluster.md#node-roles)
one which runs the data access layer and one for the number-crunching. Deployment of actors—for example by cluster-aware
routers—can take node roles into account to achieve this distribution of responsibilities.
The roles of a node is defined in the configuration property named `akka.cluster.roles`
and it is typically defined in the start script as a system property or environment variable.
The roles of the nodes is part of the membership information in `MemberEvent` that you can subscribe to.
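A hedged sketch of assigning a role through configuration and branching on it at startup (the `backend` role name is a placeholder):

```scala
import akka.actor.ActorSystem
import akka.cluster.Cluster
import com.typesafe.config.ConfigFactory

// Sketch only: in production the role would typically be set as a system
// property or environment variable rather than hard-coded.
object RoleAwareStartup {
  def main(args: Array[String]): Unit = {
    val config = ConfigFactory
      .parseString("""akka.cluster.roles = ["backend"]""")
      .withFallback(ConfigFactory.load())
    val system = ActorSystem("ClusterSystem", config)
    if (Cluster(system).selfRoles.contains("backend"))
      println("starting backend-specific actors") // placeholder for real wiring
  }
}
```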
<a id="min-members"></a>
## How To Startup when Cluster Size Reached
@@ -555,9 +229,11 @@ akka.cluster.role {
}
```
-You can start the actors in a `registerOnMemberUp` callback, which will
+## How To Startup when Member is Up
be invoked when the current member status is changed to 'Up', i.e. the cluster
-has at least the defined number of members.
+You can start actors or trigger any functions using the `registerOnMemberUp` callback, which will
be invoked when the current member status is changed to 'Up'. This can additionally be used with
`akka.cluster.min-nr-of-members` optional configuration to defer an action until the cluster has reached a certain size.
Scala
: @@snip [FactorialFrontend.scala](/akka-docs/src/test/scala/docs/cluster/FactorialFrontend.scala) { #registerOnUp }
@@ -565,8 +241,6 @@ Scala
Java
: @@snip [FactorialFrontendMain.java](/akka-docs/src/test/java/jdocs/cluster/FactorialFrontendMain.java) { #registerOnUp }
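A minimal sketch of the callback registration shown in the snippets above (the printed message stands in for real start-up work):

```scala
import akka.actor.ActorSystem
import akka.cluster.Cluster

// Sketch only: the block runs once this member's status has changed to Up.
object StartWhenUp {
  def init(system: ActorSystem): Unit =
    Cluster(system).registerOnMemberUp {
      println("Member is Up - starting application logic")
    }
}
```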
This callback can be used for other things than starting actors.
## How To Cleanup when Member is Removed
You can do some clean up in a `registerOnMemberRemoved` callback, which will
@@ -585,139 +259,55 @@ down when you installing, and depending on the race is not healthy.
## Higher level Cluster tools
-### Cluster Singleton
+@@include[cluster.md](includes/cluster.md) { #cluster-singleton }
See @ref:[Cluster Singleton](cluster-singleton.md).
For some use cases it is convenient and sometimes also mandatory to ensure that
you have exactly one actor of a certain type running somewhere in the cluster.
This can be implemented by subscribing to member events, but there are several corner
cases to consider. Therefore, this specific use case is covered by the
@ref:[Cluster Singleton](cluster-singleton.md).
### Cluster Sharding
Distributes actors across several nodes in the cluster and supports interaction
with the actors using their logical identifier, but without having to care about
their physical location in the cluster.
@@include[cluster.md](includes/cluster.md) { #cluster-sharding }
See @ref:[Cluster Sharding](cluster-sharding.md).
@@include[cluster.md](includes/cluster.md) { #cluster-ddata }
See @ref:[Distributed Data](distributed-data.md).
-### Distributed Publish Subscribe
+@@include[cluster.md](includes/cluster.md) { #cluster-pubsub }
See @ref:[Cluster Distributed Publish Subscribe](distributed-pub-sub.md).
-Publish-subscribe messaging between actors in the cluster, and point-to-point messaging
+### Cluster Aware Routers
using the logical path of the actors, i.e. the sender does not have to know on which
node the destination actor is running.
See @ref:[Distributed Publish Subscribe in Cluster](distributed-pub-sub.md).
All routers can be made aware of member nodes in the cluster, i.e.
deploying new routees or looking up routees on nodes in the cluster.
When a node becomes unreachable or leaves the cluster the routees of that node are
automatically unregistered from the router. When new nodes join the cluster, additional
routees are added to the router, according to the configuration.
See @ref:[Cluster Aware Routers](cluster-routing.md) and @ref:[Routers](routing.md).
@@include[cluster.md](includes/cluster.md) { #cluster-multidc }
See @ref:[Cluster Multi-DC](cluster-dc.md).
### Cluster Client
Communication from an actor system that is not part of the cluster to actors running
somewhere in the cluster. The client does not have to know on which node the destination
actor is running.
See @ref:[Cluster Client](cluster-client.md).
### Distributed Data
*Akka Distributed Data* is useful when you need to share data between nodes in an
Akka Cluster. The data is accessed with an actor providing a key-value store like API.
See @ref:[Distributed Data](distributed-data.md).
### Cluster Aware Routers
All @ref:[routers](routing.md) can be made aware of member nodes in the cluster, i.e.
deploying new routees or looking up routees on nodes in the cluster.
When a node becomes unreachable or leaves the cluster the routees of that node are
automatically unregistered from the router. When new nodes join the cluster, additional
routees are added to the router, according to the configuration.
See @ref:[Cluster Aware Routers](cluster-routing.md).
### Cluster Metrics
The member nodes of the cluster can collect system health metrics and publish that to other cluster nodes
and to the registered subscribers on the system event bus.
See @ref:[Cluster Metrics](cluster-metrics.md).
## Failure Detector
In a cluster each node is monitored by a few (default maximum 5) other nodes, and when
any of these detects the node as `unreachable` that information will spread to
the rest of the cluster through the gossip. In other words, only one node needs to
mark a node `unreachable` to have the rest of the cluster mark that node `unreachable`.
The failure detector will also detect if the node becomes `reachable` again. When
all nodes that monitored the `unreachable` node detects it as `reachable` again
the cluster, after gossip dissemination, will consider it as `reachable`.
If system messages cannot be delivered to a node it will be quarantined and then it
cannot come back from `unreachable`. This can happen if the there are too many
unacknowledged system messages (e.g. watch, Terminated, remote actor deployment,
failures of actors supervised by remote parent). Then the node needs to be moved
to the `down` or `removed` states and the actor system of the quarantined node
must be restarted before it can join the cluster again.
The nodes in the cluster monitor each other by sending heartbeats to detect if a node is
-unreachable from the rest of the cluster. The heartbeat arrival times is interpreted
+unreachable from the rest of the cluster. Please see:
by an implementation of
[The Phi Accrual Failure Detector](http://www.jaist.ac.jp/~defago/files/pdf/IS_RR_2004_010.pdf).
The suspicion level of failure is given by a value called *phi*.
The basic idea of the phi failure detector is to express the value of *phi* on a scale that
is dynamically adjusted to reflect current network conditions.
The value of *phi* is calculated as:
```
phi = -log10(1 - F(timeSinceLastHeartbeat))
```
where F is the cumulative distribution function of a normal distribution with mean
and standard deviation estimated from historical heartbeat inter-arrival times.
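For intuition only, a small calculation of the formula above using a standard normal CDF approximation; this is not Akka's internal `PhiAccrualFailureDetector` implementation, and the heartbeat statistics are made up:

```scala
import scala.math._

// Sketch only: phi = -log10(1 - F(t)) with F approximated by the
// Abramowitz-Stegun formula for the standard normal CDF.
object PhiIllustration {
  private def stdNormalCdf(x: Double): Double = {
    val t = 1.0 / (1.0 + 0.2316419 * abs(x))
    val d = 0.3989422804014327 * exp(-x * x / 2.0)
    val p = d * t * (0.319381530 + t * (-0.356563782 + t * (1.781477937 +
      t * (-1.821255978 + t * 1.330274429))))
    if (x >= 0) 1.0 - p else p
  }

  def phi(timeSinceLastHeartbeatMs: Double, meanMs: Double, stdDevMs: Double): Double = {
    val f = stdNormalCdf((timeSinceLastHeartbeatMs - meanMs) / stdDevMs)
    -log10(1.0 - f)
  }

  def main(args: Array[String]): Unit = {
    // Heartbeats arriving every ~1000 ms with 100 ms standard deviation:
    println(phi(1050, 1000, 100)) // ~0.5, likely just jitter
    println(phi(1500, 1000, 100)) // ~6.5, approaching the default threshold of 8
  }
}
```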
In the [configuration](#cluster-configuration) you can adjust the `akka.cluster.failure-detector.threshold`
to define when a *phi* value is considered to be a failure.
A low `threshold` is prone to generate many false positives but ensures
a quick detection in the event of a real crash. Conversely, a high `threshold`
generates fewer mistakes but needs more time to detect actual crashes. The
default `threshold` is 8 and is appropriate for most situations. However in
cloud environments, such as Amazon EC2, the value could be increased to 12 in
order to account for network issues that sometimes occur on such platforms.
The following chart illustrates how *phi* increase with increasing time since the
previous heartbeat.
![phi1.png](./images/phi1.png)
Phi is calculated from the mean and standard deviation of historical
inter arrival times. The previous chart is an example for standard deviation
of 200 ms. If the heartbeats arrive with less deviation the curve becomes steeper,
i.e. it is possible to determine failure more quickly. The curve looks like this for
a standard deviation of 100 ms.
![phi2.png](./images/phi2.png)
To be able to survive sudden abnormalities, such as garbage collection pauses and
transient network failures the failure detector is configured with a margin,
`akka.cluster.failure-detector.acceptable-heartbeat-pause`. You may want to
adjust the [configuration](#cluster-configuration) of this depending on your environment.
This is how the curve looks like for `acceptable-heartbeat-pause` configured to
3 seconds.
![phi3.png](./images/phi3.png)
Death watch uses the cluster failure detector for nodes in the cluster, i.e. it detects
network failures and JVM crashes, in addition to graceful termination of watched
actor. Death watch generates the `Terminated` message to the watching actor when the
unreachable cluster node has been downed and removed.
If you encounter suspicious false positives when the system is under load you should
define a separate dispatcher for the cluster actors as described in [Cluster Dispatcher](#cluster-dispatcher).
* @ref:[Failure Detector specification](typed/cluster-specification.md#failure-detector)
* @ref:[Phi Accrual Failure Detector](typed/failure-detector.md) implementation
* [Using the Failure Detector](typed/cluster.md#using-the-failure-detector)
## How to Test ## How to Test
@@@ div { .group-scala } @@@ div { .group-scala }
@ -728,7 +318,7 @@ Set up your project according to the instructions in @ref:[Multi Node Testing](m
add the `sbt-multi-jvm` plugin and the dependency to `akka-multi-node-testkit`.
First, as described in @ref:[Multi Node Testing](multi-node-testing.md), we need some scaffolding to configure the `MultiNodeSpec`.
Define the participating @ref:[roles](typed/cluster.md#node-roles) and their [configuration](#configuration) in an object extending `MultiNodeConfig`:
@@snip [StatsSampleSpec.scala](/akka-cluster-metrics/src/multi-jvm/scala/akka/cluster/metrics/sample/StatsSampleSpec.scala) { #MultiNodeConfig }
@@ -783,29 +373,9 @@ Go to the corresponding Scala version of this page for details.
## Management
<a id="cluster-http"></a> There are several management tools for the cluster. Please refer to the
### HTTP @ref:[Cluster Management](additional/operations.md#cluster-management) for more information.
Information and management of the cluster is available with an HTTP API.
See documentation of [Akka Management](http://developer.lightbend.com/docs/akka-management/current/).
<a id="cluster-jmx"></a>
### JMX
Information and management of the cluster is available as JMX MBeans with the root name `akka.Cluster`.
The JMX information can be displayed with an ordinary JMX console such as JConsole or JVisualVM.
From JMX you can:
* see which members are part of the cluster
* see the status of this node
* see the roles of each member
* join this node to another node in the cluster
* mark any node in the cluster as down
* tell any node in the cluster to leave
Member nodes are identified by their address, in the format *akka.**protocol**://**actor-system-name**@**hostname**:**port***.
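For illustration, the same information can also be read programmatically through the standard JMX API. The object name `akka:type=Cluster` and the attribute and operation names below are assumptions based on the root name mentioned above, so verify them in JConsole first:
```scala
import java.lang.management.ManagementFactory
import javax.management.ObjectName

object ClusterJmxSketch {
  def main(args: Array[String]): Unit = {
    // Reads MBeans registered in the local JVM; use a remote JMX connector for another node.
    val server = ManagementFactory.getPlatformMBeanServer
    // Assumed object name; check in JConsole under the `akka` domain.
    val clusterMBean = new ObjectName("akka:type=Cluster")
    // Assumed attributes exposed by the Cluster MBean.
    println("Members: " + server.getAttribute(clusterMBean, "Members"))
    println("Unreachable: " + server.getAttribute(clusterMBean, "Unreachable"))
    // Operations such as joining or downing take the member address as a string, e.g.:
    // server.invoke(clusterMBean, "down",
    //   Array[AnyRef]("akka://actor-system-name@hostname:port"), Array("java.lang.String"))
  }
}
```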
<a id="cluster-command-line"></a> <a id="cluster-command-line"></a>
### Command Line ### Command Line
@@ -850,52 +420,8 @@ as described in [Monitoring and Management Using JMX Technology](http://docs.ora
Make sure you understand the security implications of enabling remote monitoring and management.
<a id="cluster-configuration"></a>
## Configuration
There are several @ref:[configuration](typed/cluster.md#configuration) properties for the cluster; see
the full @ref:[reference configuration](general/configuration.md#config-akka-cluster) for complete information.
### Cluster Info Logging
You can silence the logging of cluster events at info level with configuration property:
```
akka.cluster.log-info = off
```
You can enable verbose logging of cluster events at info level, e.g. for temporary troubleshooting, with configuration property:
```
akka.cluster.log-info-verbose = on
```
### Cluster Dispatcher
Under the hood the cluster extension is implemented with actors. To protect them against
disturbance from user actors they are by default run on the internal dispatcher configured
under `akka.actor.internal-dispatcher`. The cluster actors can potentially be isolated even
further onto their own dispatcher using the setting `akka.cluster.use-dispatcher`.
### Configuration Compatibility Check
Creating a cluster is about deploying two or more nodes and making them behave as if they were one single application. Therefore it's extremely important that all nodes in a cluster are configured with compatible settings.
The Configuration Compatibility Check feature ensures that all nodes in a cluster have a compatible configuration. Whenever a new node is joining an existing cluster, a subset of its configuration settings (only those that are required to be checked) is sent to the nodes in the cluster for verification. Once the configuration is checked on the cluster side, the cluster sends back its own set of required configuration settings. The joining node will then verify if it's compliant with the cluster configuration. The joining node will only proceed if all checks pass, on both sides.
New custom checkers can be added by extending `akka.cluster.JoinConfigCompatChecker` and including them in the configuration. Each checker must be associated with a unique key:
```
akka.cluster.configuration-compatibility-check.checkers {
my-custom-config = "com.company.MyCustomJoinConfigCompatChecker"
}
```
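As an illustration, a custom checker could look like the following sketch. It assumes the `JoinConfigCompatChecker` contract is a list of required keys plus a `check` method returning a `Valid`/`Invalid` result; the class name matches the hypothetical configuration entry above, and the checked setting `my-app.data-model-version` is made up:
```scala
package com.company

import scala.collection.immutable
import com.typesafe.config.Config
import akka.cluster.{ ConfigValidation, Invalid, JoinConfigCompatChecker, Valid }

// Sketch of a custom checker: the joining node must use the same (made-up)
// `my-app.data-model-version` setting as the rest of the cluster.
class MyCustomJoinConfigCompatChecker extends JoinConfigCompatChecker {

  // The settings of the joining node that are sent to the cluster for verification.
  override def requiredKeys: immutable.Seq[String] =
    immutable.Seq("my-app.data-model-version")

  // Compare the joining node's subset of settings with the cluster's settings.
  override def check(toCheck: Config, clusterConfig: Config): ConfigValidation =
    if (toCheck.getInt("my-app.data-model-version") ==
        clusterConfig.getInt("my-app.data-model-version"))
      Valid
    else
      Invalid(immutable.Seq("my-app.data-model-version is not compatible with the cluster"))
}
```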
@@@ note
Configuration Compatibility Check is enabled by default, but can be disabled by setting `akka.cluster.configuration-compatibility-check.enforce-on-join = off`. This is especially useful when performing rolling updates. Obviously this should only be done if a complete cluster shutdown isn't an option. A cluster whose nodes have different configuration settings may lead to data loss or data corruption.
This setting should only be disabled on the joining nodes. The checks are always performed on both sides, and warnings are logged. In case of incompatibilities, it is the responsibility of the joining node to decide if the process should be interrupted or not.
If you are performing a rolling update on a cluster using Akka 2.5.9 or earlier (thus, not supporting this feature), the checks will not be performed, because the running cluster has no means to verify the configuration sent by the joining node, nor to send back its own configuration.
@@@
View file
@@ -1,312 +0,0 @@
# Cluster Specification
@@@ note
This document describes the design concepts of Akka Cluster.
@@@
## Intro
Akka Cluster provides a fault-tolerant decentralized peer-to-peer based cluster
[membership](#membership) service with no single point of failure or single point of bottleneck.
It does this using [gossip](#gossip) protocols and an automatic [failure detector](#failure-detector).
Akka cluster allows for building distributed applications, where one application or service spans multiple nodes
(in practice multiple `ActorSystem`s). See also the discussion in
@ref:[When and where to use Akka Cluster](../cluster-usage.md#when-and-where-to-use-akka-cluster).
## Terms
**node**
: A logical member of a cluster. There could be multiple nodes on a physical
machine. Defined by a *hostname:port:uid* tuple.
**cluster**
: A set of nodes joined together through the [membership](#membership) service.
**leader**
: A single node in the cluster that acts as the leader. Managing cluster convergence
and membership state transitions.
## Membership
A cluster is made up of a set of member nodes. The identifier for each node is a
`hostname:port:uid` tuple. An Akka application can be distributed over a cluster with
each node hosting some part of the application. Cluster membership and the actors running
on that node of the application are decoupled. A node could be a member of a
cluster without hosting any actors. Joining a cluster is initiated
by issuing a `Join` command to one of the nodes in the cluster to join.
The node identifier internally also contains a UID that uniquely identifies this
actor system instance at that `hostname:port`. Akka uses the UID to be able to
reliably trigger remote death watch. This means that the same actor system can never
join a cluster again once it's been removed from that cluster. To re-join an actor
system with the same `hostname:port` to a cluster you have to stop the actor system
and start a new one with the same `hostname:port` which will then receive a different
UID.
The cluster membership state is a specialized [CRDT](http://hal.upmc.fr/docs/00/55/55/88/PDF/techreport.pdf), which means that it has a monotonic
merge function. When concurrent changes occur on different nodes the updates can always be
merged and converge to the same end result.
### Gossip
The cluster membership used in Akka is based on Amazon's [Dynamo](http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf) system and
particularly the approach taken in Basho's [Riak](http://basho.com/technology/architecture/) distributed database.
Cluster membership is communicated using a [Gossip Protocol](http://en.wikipedia.org/wiki/Gossip_protocol), where the current
state of the cluster is gossiped randomly through the cluster, with preference to
members that have not seen the latest version.
#### Vector Clocks
[Vector clocks](http://en.wikipedia.org/wiki/Vector_clock) are a type of data structure and algorithm for generating a partial
ordering of events in a distributed system and detecting causality violations.
We use vector clocks to reconcile and merge differences in cluster state
during gossiping. A vector clock is a set of (node, counter) pairs. Each update
to the cluster state has an accompanying update to the vector clock.
#### Gossip Convergence
Information about the cluster converges locally at a node at certain points in time.
This is when a node can prove that the cluster state he is observing has been observed
by all other nodes in the cluster. Convergence is implemented by passing a set of nodes
that have seen current state version during gossip. This information is referred to as the
seen set in the gossip overview. When all nodes are included in the seen set there is
convergence.
Gossip convergence cannot occur while any nodes are `unreachable`. The nodes need
to become `reachable` again, or moved to the `down` and `removed` states
(see the [Membership Lifecycle](#membership-lifecycle) section below). This only blocks the leader
from performing its cluster membership management and does not influence the application
running on top of the cluster. For example this means that during a network partition
it is not possible to add more nodes to the cluster. The nodes can join, but they
will not be moved to the `up` state until the partition has healed or the unreachable
nodes have been downed.
#### Failure Detector
The failure detector is responsible for trying to detect if a node is
`unreachable` from the rest of the cluster. For this we are using an
implementation of [The Phi Accrual Failure Detector](https://pdfs.semanticscholar.org/11ae/4c0c0d0c36dc177c1fff5eb84fa49aa3e1a8.pdf) by Hayashibara et al.
An accrual failure detector decouples monitoring and interpretation. That makes
them applicable to a wider area of scenarios and better suited for building generic
failure detection services. The idea is that it keeps a history of failure
statistics, calculated from heartbeats received from other nodes, and makes
an educated guess by taking multiple factors, and how they
accumulate over time, into account in order to better estimate whether a
specific node is up or down. Rather than only answering "yes" or "no" to the
question "is the node down?" it returns a `phi` value representing the
likelihood that the node is down.
The `threshold` that is the basis for the calculation is configurable by the
user. A low `threshold` is prone to generate many wrong suspicions but ensures
a quick detection in the event of a real crash. Conversely, a high `threshold`
generates fewer mistakes but needs more time to detect actual crashes. The
default `threshold` is 8 and is appropriate for most situations. However in
cloud environments, such as Amazon EC2, the value could be increased to 12 in
order to account for network issues that sometimes occur on such platforms.
In a cluster each node is monitored by a few (default maximum 5) other nodes, and when
any of these detects the node as `unreachable` that information will spread to
the rest of the cluster through the gossip. In other words, only one node needs to
mark a node `unreachable` to have the rest of the cluster mark that node `unreachable`.
The nodes to monitor are picked out of neighbors in a hashed ordered node ring.
This is to increase the likelihood to monitor across racks and data centers, but the order
is the same on all nodes, which ensures full coverage.
Heartbeats are sent out every second and every heartbeat is performed in a request/reply
handshake with the replies used as input to the failure detector.
The failure detector will also detect if the node becomes `reachable` again. When
all nodes that monitored the `unreachable` node detect it as `reachable` again
the cluster, after gossip dissemination, will consider it as `reachable`.
If system messages cannot be delivered to a node it will be quarantined and then it
cannot come back from `unreachable`. This can happen if there are too many
unacknowledged system messages (e.g. watch, Terminated, remote actor deployment,
failures of actors supervised by remote parent). Then the node needs to be moved
to the `down` or `removed` states (see the [Membership Lifecycle](#membership-lifecycle) section below)
and the actor system must be restarted before it can join the cluster again.
#### Leader
After gossip convergence a `leader` for the cluster can be determined. There is no
`leader` election process, the `leader` can always be recognised deterministically
by any node whenever there is gossip convergence. The leader is only a role, any node
can be the leader and it can change between convergence rounds.
The `leader` is the first node in sorted order that is able to take the leadership role,
where the preferred member states for a `leader` are `up` and `leaving`
(see the [Membership Lifecycle](#membership-lifecycle) section below for more information about member states).
The role of the `leader` is to shift members in and out of the cluster, changing
`joining` members to the `up` state or `exiting` members to the `removed`
state. Currently `leader` actions are only triggered by receiving a new cluster
state with gossip convergence.
The `leader` also has the power, if configured so, to "auto-down" a node that
according to the [Failure Detector](#failure-detector) is considered `unreachable`. This means setting
the `unreachable` node status to `down` automatically after a configured time
of unreachability.
#### Seed Nodes
The seed nodes are configured contact points for new nodes joining the cluster.
When a new node is started it sends a message to all seed nodes and then sends
a join command to the seed node that answers first.
The seed nodes configuration value does not have any influence on the running
cluster itself, it is only relevant for new nodes joining the cluster as it
helps them to find contact points to send the join command to; a new member
can send this command to any current member of the cluster, not only to the seed nodes.
#### Gossip Protocol
A variation of *push-pull gossip* is used to reduce the amount of gossip
information sent around the cluster. In push-pull gossip a digest is sent
representing current versions but not actual values; the recipient of the gossip
can then send back any values for which it has newer versions and also request
values for which it has outdated versions. Akka uses a single shared state with
a vector clock for versioning, so the variant of push-pull gossip used in Akka
makes use of this version to only push the actual state as needed.
Periodically, the default is every 1 second, each node chooses another random
node to initiate a round of gossip with. If less than ½ of the nodes resides in the
seen set (have seen the new state) then the cluster gossips 3 times instead of once
every second. This adjusted gossip interval is a way to speed up the convergence process
in the early dissemination phase after a state change.
The choice of node to gossip with is random but biased towards nodes that might not have seen
the current state version. During each round of gossip exchange, when convergence is not yet reached, a node
uses a very high probability (which is configurable) to gossip with another node which is not part of the seen set, i.e.
which is likely to have an older version of the state. Otherwise it gossips with any random live node.
This biased selection is a way to speed up the convergence process in the late dissemination
phase after a state change.
For clusters larger than 400 nodes (configurable, and suggested by empirical evidence)
the 0.8 probability is gradually reduced to avoid overwhelming single stragglers with
too many concurrent gossip requests. The gossip receiver also has a mechanism to
protect itself from too many simultaneous gossip messages by dropping messages that
have been enqueued in the mailbox for too long of a time.
While the cluster is in a converged state the gossiper only sends a small gossip status message containing the gossip
version to the chosen node. As soon as there is a change to the cluster (meaning non-convergence)
then it goes back to biased gossip again.
The recipient of the gossip state or the gossip status can use the gossip version
(vector clock) to determine whether:
1. it has a newer version of the gossip state, in which case it sends that back
to the gossiper
2. it has an outdated version of the state, in which case the recipient requests
the current state from the gossiper by sending back its version of the gossip state
3. it has conflicting gossip versions, in which case the different versions are merged
and sent back
If the recipient and the gossip have the same version then the gossip state is
not sent or requested.
The periodic nature of the gossip has a nice batching effect of state changes,
e.g. joining several nodes quickly after each other to one node will result in only
one state change to be spread to other members in the cluster.
The gossip messages are serialized with [protobuf](https://code.google.com/p/protobuf/) and also gzipped to reduce payload
size.
### Membership Lifecycle
A node begins in the `joining` state. Once all nodes have seen that the new
node is joining (through gossip convergence) the `leader` will set the member
state to `up`.
If a node is leaving the cluster in a safe, expected manner then it switches to
the `leaving` state. Once the leader sees the convergence on the node in the
`leaving` state, the leader will then move it to `exiting`. Once all nodes
have seen the exiting state (convergence) the `leader` will remove the node
from the cluster, marking it as `removed`.
If a node is `unreachable` then gossip convergence is not possible and therefore
any `leader` actions are also not possible (for instance, allowing a node to
become a part of the cluster). To be able to move forward the state of the
`unreachable` nodes must be changed. It must become `reachable` again or marked
as `down`. If the node is to join the cluster again the actor system must be
restarted and go through the joining process again. The cluster can, through the
leader, also *auto-down* a node after a configured time of unreachability. If a new
incarnation of an unreachable node tries to rejoin the cluster, the old incarnation will be
marked as `down` and the new incarnation can rejoin the cluster without manual intervention.
@@@ note
If you have *auto-down* enabled and the failure detector triggers, you
can over time end up with a lot of single node clusters if you don't put
measures in place to shut down nodes that have become `unreachable`. This
follows from the fact that the `unreachable` node will likely see the rest of
the cluster as `unreachable`, become its own leader and form its own cluster.
@@@
As mentioned before, if a node is `unreachable` then gossip convergence is not
possible and therefore any `leader` actions are also not possible. By enabling
`akka.cluster.allow-weakly-up-members` (enabled by default) it is possible to
let new joining nodes be promoted while convergence is not yet reached. These
`Joining` nodes will be promoted as `WeaklyUp`. Once gossip convergence is
reached, the leader will move `WeaklyUp` members to `Up`.
Note that members on the other side of a network partition have no knowledge about
the existence of the new members. You should for example not count `WeaklyUp`
members in quorum decisions.
#### State Diagram for the Member States (`akka.cluster.allow-weakly-up-members=off`)
![member-states.png](../images/member-states.png)
#### State Diagram for the Member States (`akka.cluster.allow-weakly-up-members=on`)
![member-states-weakly-up.png](../images/member-states-weakly-up.png)
#### Member States
* **joining** - transient state when joining a cluster
* **weakly up** - transient state while network split (only if `akka.cluster.allow-weakly-up-members=on`)
* **up** - normal operating state
* **leaving** / **exiting** - states during graceful removal
* **down** - marked as down (no longer part of cluster decisions)
* **removed** - tombstone state (no longer a member)
#### User Actions
* **join** - join a single node to a cluster - can be explicit or automatic on
startup if a node to join has been specified in the configuration
* **leave** - tell a node to leave the cluster gracefully
* **down** - mark a node as down
#### Leader Actions
The `leader` has the following duties:
* shifting members in and out of the cluster
* joining -> up
* weakly up -> up *(no convergence is required for this leader action to be performed)*
* exiting -> removed
#### Failure Detection and Unreachability
* **fd*** - the failure detector of one of the monitoring nodes has triggered
causing the monitored node to be marked as unreachable
* **unreachable*** - unreachable is not a real member state but more of a flag in addition to the state, signaling that the cluster is unable to talk to this node. After being unreachable the failure detector may detect it as reachable again and thereby remove the flag
View file
@@ -51,12 +51,12 @@ with a specific role. It communicates with other `Replicator` instances with the
actor using the `Replicator.props`. If it is started as an ordinary actor it is important
that it is given the same name, started on the same path, on all nodes.
Cluster members with status @ref:[WeaklyUp](typed/cluster-membership.md#weakly-up),
will participate in Distributed Data. This means that the data will be replicated to the
@ref:[WeaklyUp](typed/cluster-membership.md#weakly-up) nodes with the background gossip protocol. Note that it
will not participate in any actions where the consistency mode is to read/write from all
nodes or the majority of nodes. The @ref:[WeaklyUp](typed/cluster-membership.md#weakly-up) node is not counted
as part of the cluster. So 3 nodes + 5 @ref:[WeaklyUp](typed/cluster-membership.md#weakly-up) is essentially a
3 node cluster as far as consistent actions are concerned.
Below is an example of an actor that schedules tick messages to itself and for each tick
View file
@@ -34,7 +34,7 @@ a few seconds. Changes are only performed in the own part of the registry and th
changes are versioned. Deltas are disseminated in a scalable way to other nodes with
a gossip protocol.
Cluster members with status @ref:[WeaklyUp](typed/cluster-membership.md#weakly-up),
will participate in Distributed Publish Subscribe, i.e. subscribers on nodes with
`WeaklyUp` status will receive published messages if the publisher and subscriber are on
the same side of a network partition.
View file
@@ -0,0 +1,50 @@
<!--- #cluster-singleton --->
### Cluster Singleton
For some use cases it is convenient or necessary to ensure only one
actor of a certain type is running somewhere in the cluster.
This can be implemented by subscribing to member events, but there are several corner
cases to consider. Therefore, this specific use case is covered by the Cluster Singleton.
<!--- #cluster-singleton --->
<!--- #cluster-sharding --->
### Cluster Sharding
Cluster Sharding distributes actors across several nodes in the cluster and supports interaction
with the actors using their logical identifier, but without having to care about
their physical location in the cluster.
<!--- #cluster-sharding --->
<!--- #cluster-ddata --->
### Distributed Data
*Akka Distributed Data* is useful when you need to share data between nodes in an
Akka Cluster. The data is accessed with an actor providing a key-value store like API.
<!--- #cluster-ddata --->
<!--- #cluster-pubsub --->
### Distributed Publish Subscribe
Publish-subscribe messaging between actors in the cluster, and point-to-point messaging
using the logical path of the actors, i.e. the sender does not have to know on which
node the destination actor is running.
<!--- #cluster-pubsub --->
<!--- #cluster-multidc --->
### Cluster across multiple data centers
Akka Cluster can be used across multiple data centers, availability zones or regions,
so that one Cluster can span multiple data centers and still be tolerant to network partitions.
<!--- #cluster-multidc --->
<!--- #join-seeds-programmatic --->
You may also join programmatically, which is attractive when dynamically discovering other nodes
at startup by using some external tool or API. When joining to seed nodes you should not include
the node itself except for the node that is supposed to be the first seed node, which should be
placed first in the parameter to the programmatic join.
<!--- #join-seeds-programmatic --->
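A minimal sketch of such a programmatic join with the classic `Cluster` extension is shown below; the addresses are placeholders, and the `akka` protocol assumes Artery remoting (classic remoting would use `akka.tcp`). The typed API offers a corresponding seed-node join command on the `Cluster` manager.
```scala
import akka.actor.{ ActorSystem, Address }
import akka.cluster.Cluster

object ManualJoin {
  def main(args: Array[String]): Unit = {
    val system = ActorSystem("ClusterSystem")
    val cluster = Cluster(system)

    // Placeholder addresses, e.g. discovered at startup via an external tool or API.
    // On the node that is meant to be the first seed node, list its own address first.
    val seedNodes = List(
      Address("akka", "ClusterSystem", "10.0.0.1", 2552),
      Address("akka", "ClusterSystem", "10.0.0.2", 2552))

    cluster.joinSeedNodes(seedNodes)
  }
}
```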
View file
@@ -7,7 +7,7 @@ For the new API see @ref[Cluster](typed/index-cluster.md).
@@@ index
* [cluster-usage](cluster-usage.md)
* [cluster-routing](cluster-routing.md)
* [cluster-singleton](cluster-singleton.md)
* [distributed-pub-sub](distributed-pub-sub.md)
View file
@@ -312,58 +312,26 @@ Watching a remote actor is not different than watching a local actor, as describ
### Failure Detector
Under the hood remote death watch uses heartbeat messages and a failure detector to generate a `Terminated`
message from network failures and JVM crashes, in addition to graceful termination of the watched
actor.
The heartbeat arrival times are interpreted by an implementation of
[The Phi Accrual Failure Detector](http://www.jaist.ac.jp/~defago/files/pdf/IS_RR_2004_010.pdf).
The suspicion level of failure is given by a value called *phi*.
The basic idea of the phi failure detector is to express the value of *phi* on a scale that
is dynamically adjusted to reflect current network conditions.
The value of *phi* is calculated as:
```
phi = -log10(1 - F(timeSinceLastHeartbeat))
```
where F is the cumulative distribution function of a normal distribution with mean
and standard deviation estimated from historical heartbeat inter-arrival times.
In the [Remote Configuration](#remote-configuration) you can adjust the `akka.remote.watch-failure-detector.threshold`
to define when a *phi* value is considered to be a failure.
A low `threshold` is prone to generate many false positives but ensures
a quick detection in the event of a real crash. Conversely, a high `threshold`
generates fewer mistakes but needs more time to detect actual crashes. The
default `threshold` is 10 and is appropriate for most situations. However, in
cloud environments, such as Amazon EC2, the value could be increased to 12 to
account for network issues that sometimes occur on such platforms.
The following chart illustrates how *phi* increases with increasing time since the
previous heartbeat.
![phi1.png](./images/phi1.png)
Phi is calculated from the mean and standard deviation of historical
inter-arrival times. The previous chart is an example for a standard deviation
of 200 ms. If the heartbeats arrive with less deviation the curve becomes steeper,
i.e. it is possible to determine failure more quickly. The curve looks like this for
a standard deviation of 100 ms.
![phi2.png](./images/phi2.png)
To be able to survive sudden abnormalities, such as garbage collection pauses and
transient network failures, the failure detector is configured with a margin,
`akka.remote.watch-failure-detector.acceptable-heartbeat-pause`. You may want to
adjust the [Remote Configuration](#remote-configuration) of this depending on your environment.
This is how the curve looks for `acceptable-heartbeat-pause` configured to
3 seconds.
![phi3.png](./images/phi3.png)
See also:
* @ref:[Phi Accrual Failure Detector](typed/failure-detector.md) implementation for details
* [Using the Failure Detector](#using-the-failure-detector) below for usage
### Using the Failure Detector
Remoting uses the `akka.remote.PhiAccrualFailureDetector` failure detector by default, or you can provide your own by
implementing `akka.remote.FailureDetector` and configuring it:
```
akka.remote.watch-failure-detector.implementation-class = "com.example.CustomFailureDetector"
```
In the [Remote Configuration](#remote-configuration) you may want to adjust these
depending on your environment:
* `akka.remote.watch-failure-detector.threshold` - when a *phi* value is considered to be a failure
* `akka.remote.watch-failure-detector.acceptable-heartbeat-pause` - margin of error for sudden abnormalities
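If you do plug in your own implementation through `implementation-class`, a naive sketch could look like the following. It assumes the `akka.remote.FailureDetector` trait exposes `heartbeat()`, `isAvailable` and `isMonitoring`; the constructor contract expected when loading the class from configuration is not shown and may impose additional requirements:
```scala
package com.example

import akka.remote.FailureDetector

// Naive sketch: consider the peer available as long as the last heartbeat
// arrived within a fixed timeout. Not a real accrual detector.
class CustomFailureDetector extends FailureDetector {
  private val acceptableSilenceMs = 10000L
  @volatile private var lastHeartbeatMs: Long = -1L

  // Called when a heartbeat from the monitored peer arrives.
  override def heartbeat(): Unit =
    lastHeartbeatMs = System.currentTimeMillis()

  // True once we have seen at least one heartbeat.
  override def isMonitoring: Boolean = lastHeartbeatMs != -1L

  // Available if not yet monitoring, or if the silence is within the timeout.
  override def isAvailable: Boolean =
    !isMonitoring || (System.currentTimeMillis() - lastHeartbeatMs) <= acceptableSilenceMs
}
```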
## Serialization
You need to enable @ref:[serialization](serialization.md) for your actor messages.
View file
@@ -41,7 +41,7 @@ implement manually.
@@@
@@@ warning { title=IMPORTANT }
Use stream refs with Akka Cluster. The [failure detector can cause quarantining](../typed/cluster-specification.md#quarantined) if plain Akka remoting is used.
@@@
## Stream References
View file
@@ -0,0 +1,73 @@
<a id="when-and-where-to-use-akka-cluster"></a>
# Choosing Akka Cluster
An architectural choice you have to make is whether you are going to use a microservices architecture or
a traditional distributed application. This choice will influence how you should use Akka Cluster.
## Microservices
Microservices have many attractive properties, such as the independent nature of microservices, which allows for
multiple smaller and more focused teams that can deliver new functionality more frequently and can
respond more quickly to business opportunities. Reactive Microservices should be isolated, autonomous, and have
a single responsibility as identified by Jonas Bonér in the book
[Reactive Microsystems: The Evolution of Microservices at Scale](https://info.lightbend.com/ebook-reactive-microservices-the-evolution-of-microservices-at-scale-register.html).
In a microservices architecture, you should consider communication within a service and between services.
In general we recommend against using Akka Cluster and actor messaging between _different_ services because that
would result in too tight a code coupling between the services and difficulties deploying them independently of
each other, which is one of the main reasons for using a microservices architecture.
See the discussion on
@scala[[Internal and External Communication](https://www.lagomframework.com/documentation/current/scala/InternalAndExternalCommunication.html)]
@java[[Internal and External Communication](https://www.lagomframework.com/documentation/current/java/InternalAndExternalCommunication.html)]
in the docs of the [Lagom Framework](https://www.lagomframework.com) (where each microservice is an Akka Cluster)
for some background on this.
Nodes of a single service (collectively called a cluster) require less decoupling. They share the same code and
are deployed together, as a set, by a single team or individual. There might be two versions running concurrently
during a rolling deployment, but deployment of the entire set has a single point of control. For this reason,
intra-service communication can take advantage of Akka Cluster, failure management and actor messaging, which
is convenient to use and has great performance.
Between different services [Akka HTTP](https://doc.akka.io/docs/akka-http/current) or
[Akka gRPC](https://doc.akka.io/docs/akka-grpc/current/) can be used for synchronous (yet non-blocking)
communication and [Akka Streams Kafka](https://doc.akka.io/docs/akka-stream-kafka/current/home.html) or other
[Alpakka](https://doc.akka.io/docs/alpakka/current/) connectors for asynchronous integration.
All those communication mechanisms work well with streaming of messages with end-to-end back-pressure, and the
synchronous communication tools can also be used for single request response interactions. It is also important
to note that when using these tools both sides of the communication do not have to be implemented with Akka,
nor does the programming language matter.
## Traditional distributed application
We acknowledge that microservices also introduce many new challenges and it's not the only way to
build applications. A traditional distributed application may have less complexity and work well in many cases.
For example for a small startup, with a single team, building an application where time to market is everything.
Akka Cluster can efficiently be used for building such a distributed application.
In this case, you have a single deployment unit, built from a single code base (or using traditional binary
dependency management to modularize) but deployed across many nodes using a single cluster.
Tighter coupling is OK, because there is a central point of deployment and control. In some cases, nodes may
have specialized runtime roles which means that the cluster is not totally homogeneous (e.g., "front-end" and
"back-end" nodes, or dedicated master/worker nodes) but if these are run from the same built artifacts this
is just a runtime behavior and doesn't cause the same kind of problems you might get from tight coupling of
totally separate artifacts.
A tightly coupled distributed application has served the industry and many Akka users well for years and is
still a valid choice.
## Distributed monolith
There is also an anti-pattern that is sometimes called "distributed monolith". You have multiple services
that are built and deployed independently from each other, but they have a tight coupling that makes this
very risky, such as a shared cluster, shared code and dependencies for service API calls, or a shared
database schema. There is a false sense of autonomy because of the physical separation of the code and
deployment units, but you are likely to encounter problems because of changes in the implementation of
one service leaking into the behavior of others. See Ben Christensen's
[Don't Build a Distributed Monolith](https://www.microservices.com/talks/dont-build-a-distributed-monolith/).
Organizations that find themselves in this situation often react by trying to centrally coordinate deployment
of multiple services, at which point you have lost the principal benefit of microservices while taking on
the costs. You are in a halfway state with things that aren't really separable being built and deployed
in a separate way. Some people do this, and some manage to make it work, but it's not something we would
recommend and it needs to be carefully managed.
View file
@@ -0,0 +1,158 @@
# Cluster Membership Service
The core of Akka Cluster is the cluster membership service, which keeps track of which nodes are part of the cluster and
their health. Cluster membership is communicated using @ref:[gossip](cluster-specification.md#gossip) and
@ref:[failure detection](cluster-specification.md#failure-detector).
There are several @ref:[Higher level Cluster tools](../typed/cluster.md#higher-level-cluster-tools) that are built
on top of the cluster membership service.
## Introduction
A cluster is made up of a set of member nodes. The identifier for each node is a
`hostname:port:uid` tuple. An Akka application can be distributed over a cluster with
each node hosting some part of the application. Cluster membership and the actors running
on that node of the application are decoupled. A node could be a member of a
cluster without hosting any actors. Joining a cluster is initiated
by issuing a `Join` command to one of the nodes in the cluster to join.
The node identifier internally also contains a UID that uniquely identifies this
actor system instance at that `hostname:port`. Akka uses the UID to be able to
reliably trigger remote death watch. This means that the same actor system can never
join a cluster again once it's been removed from that cluster. To re-join an actor
system with the same `hostname:port` to a cluster you have to stop the actor system
and start a new one with the same `hostname:port` which will then receive a different
UID.
## Member States
The cluster membership state is a specialized [CRDT](http://hal.upmc.fr/docs/00/55/55/88/PDF/techreport.pdf), which means that it has a monotonic
merge function. When concurrent changes occur on different nodes the updates can always be
merged and converge to the same end result.
* **joining** - transient state when joining a cluster
* **weakly up** - transient state while network split (only if `akka.cluster.allow-weakly-up-members=on`)
* **up** - normal operating state
* **leaving** / **exiting** - states during graceful removal
* **down** - marked as down (no longer part of cluster decisions)
* **removed** - tombstone state (no longer a member)
## Member Events
The events to track the life-cycle of members are:
* `ClusterEvent.MemberJoined` - A new member has joined the cluster and its status has been changed to `Joining`
* `ClusterEvent.MemberUp` - A new member has joined the cluster and its status has been changed to `Up`
* `ClusterEvent.MemberExited` - A member is leaving the cluster and its status has been changed to `Exiting`
Note that the node might already have been shut down when this event is published on another node.
* `ClusterEvent.MemberRemoved` - Member completely removed from the cluster.
* `ClusterEvent.UnreachableMember` - A member is considered as unreachable, detected by the failure detector
of at least one other node.
* `ClusterEvent.ReachableMember` - A member is considered as reachable again, after having been unreachable.
All nodes that previously detected it as unreachable have detected it as reachable again.
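For illustration, subscribing to these events with the typed `Cluster` extension could look like the following minimal sketch (the listener behavior and log messages are made up):
```scala
import akka.actor.typed.Behavior
import akka.actor.typed.scaladsl.Behaviors
import akka.cluster.ClusterEvent.{ MemberEvent, MemberRemoved, MemberUp }
import akka.cluster.typed.{ Cluster, Subscribe }

object MembershipListener {
  def apply(): Behavior[MemberEvent] =
    Behaviors.setup { context =>
      // Subscribe this actor to membership events of the cluster this system is part of.
      Cluster(context.system).subscriptions ! Subscribe(context.self, classOf[MemberEvent])
      Behaviors.receiveMessage {
        case MemberUp(member) =>
          context.log.info("Member up: {}", member.address)
          Behaviors.same
        case MemberRemoved(member, previousStatus) =>
          context.log.info("Member removed: {} (was {})", member.address, previousStatus)
          Behaviors.same
        case _ =>
          Behaviors.same
      }
    }
}
```
Such a listener could, for example, be spawned as the guardian with `ActorSystem(MembershipListener(), "ClusterSystem")`, or as a child actor that forwards the events it cares about.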
## Membership Lifecycle
A node begins in the `joining` state. Once all nodes have seen that the new
node is joining (through gossip convergence) the `leader` will set the member
state to `up`.
If a node is leaving the cluster in a safe, expected manner then it switches to
the `leaving` state. Once the leader sees the convergence on the node in the
`leaving` state, the leader will then move it to `exiting`. Once all nodes
have seen the exiting state (convergence) the `leader` will remove the node
from the cluster, marking it as `removed`.
If a node is `unreachable` then gossip convergence is not possible and therefore
any `leader` actions are also not possible (for instance, allowing a node to
become a part of the cluster). To be able to move forward the state of the
`unreachable` nodes must be changed. It must become `reachable` again or marked
as `down`. If the node is to join the cluster again the actor system must be
restarted and go through the joining process again. The cluster can, through the
leader, also *auto-down* a node after a configured time of unreachability. If a new
incarnation of an unreachable node tries to rejoin the cluster, the old incarnation will be
marked as `down` and the new incarnation can rejoin the cluster without manual intervention.
@@@ note
If you have *auto-down* enabled and the failure detector triggers, you
can over time end up with a lot of single node clusters if you don't put
measures in place to shut down nodes that have become `unreachable`. This
follows from the fact that the `unreachable` node will likely see the rest of
the cluster as `unreachable`, become its own leader and form its own cluster.
@@@
<a id="weakly-up"></a>
## WeaklyUp Members
If a node is `unreachable` then gossip convergence is not possible and therefore any
`leader` actions are also not possible. However, we still might want new nodes to join
the cluster in this scenario.
`Joining` members will be promoted to `WeaklyUp` and become part of the cluster if
convergence can't be reached. Once gossip convergence is reached, the leader will move `WeaklyUp`
members to `Up`.
This feature is enabled by default, but it can be disabled with configuration option:
```
akka.cluster.allow-weakly-up-members = off
```
You can subscribe to the `WeaklyUp` membership event to make use of the members that are
in this state, but you should be aware that members on the other side of a network partition
have no knowledge about the existence of the new members. You should for example not count
`WeaklyUp` members in quorum decisions.
As mentioned before, if a node is `unreachable` then gossip convergence is not
possible and therefore any `leader` actions are also not possible. By enabling
`akka.cluster.allow-weakly-up-members` (enabled by default) it is possible to
let new joining nodes be promoted while convergence is not yet reached. These
`Joining` nodes will be promoted as `WeaklyUp`. Once gossip convergence is
reached, the leader will move `WeaklyUp` members to `Up`.
Note that members on the other side of a network partition have no knowledge about
the existence of the new members. You should for example not count `WeaklyUp`
members in quorum decisions.
## State Diagrams
### State Diagram for the Member States (`akka.cluster.allow-weakly-up-members=off`)
![member-states.png](../images/member-states.png)
### State Diagram for the Member States (`akka.cluster.allow-weakly-up-members=on`)
![member-states-weakly-up.png](../images/member-states-weakly-up.png)
#### User Actions
* **join** - join a single node to a cluster - can be explicit or automatic on
startup if a node to join has been specified in the configuration
* **leave** - tell a node to leave the cluster gracefully
* **down** - mark a node as down
#### Leader Actions
The `leader` has the following duties:
* shifting members in and out of the cluster
* joining -> up
* weakly up -> up *(no convergence is required for this leader action to be performed)*
* exiting -> removed
#### Failure Detection and Unreachability
* **fd*** - the failure detector of one of the monitoring nodes has triggered
causing the monitored node to be marked as unreachable
* **unreachable*** - unreachable is not a real member state but more of a flag in addition to the state, signaling that the cluster is unable to talk to this node. After being unreachable the failure detector may detect it as reachable again and thereby remove the flag
View file
@@ -0,0 +1,184 @@
# Cluster Specification
This document describes the design concepts of Akka Cluster. For guides on using Akka Cluster please see:
* @ref:[Cluster Usage](../typed/cluster.md)
* @ref:[Cluster Usage with classic Akka APIs](../cluster-usage.md)
* @ref:[Cluster Membership Service](cluster-membership.md)
## Introduction
Akka Cluster provides a fault-tolerant decentralized peer-to-peer based
@ref:[Cluster Membership Service](cluster-membership.md#cluster-membership-service) with no single point of failure or
single point of bottleneck. It does this using [gossip](#gossip) protocols and an automatic [failure detector](#failure-detector).
Akka Cluster allows for building distributed applications, where one application or service spans multiple nodes
(in practice multiple `ActorSystem`s).
## Terms
**node**
: A logical member of a cluster. There could be multiple nodes on a physical
machine. Defined by a *hostname:port:uid* tuple.
**cluster**
: A set of nodes joined together through the @ref:[Cluster Membership Service](cluster-membership.md#cluster-membership-service).
**leader**
: A single node in the cluster that acts as the leader. Managing cluster convergence
and membership state transitions.
### Gossip
The cluster membership used in Akka is based on Amazon's [Dynamo](http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf) system and
particularly the approach taken in Basho's [Riak](http://basho.com/technology/architecture/) distributed database.
Cluster membership is communicated using a [Gossip Protocol](http://en.wikipedia.org/wiki/Gossip_protocol), where the current
state of the cluster is gossiped randomly through the cluster, with preference to
members that have not seen the latest version.
#### Vector Clocks
[Vector clocks](http://en.wikipedia.org/wiki/Vector_clock) are a type of data structure and algorithm for generating a partial
ordering of events in a distributed system and detecting causality violations.
We use vector clocks to reconcile and merge differences in cluster state
during gossiping. A vector clock is a set of (node, counter) pairs. Each update
to the cluster state has an accompanying update to the vector clock.
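As an illustration of the idea (not Akka's internal implementation), a vector clock can be modelled as a map from node to counter, with a monotonic merge that takes the per-node maximum and a comparison that detects concurrent updates:
```scala
// Minimal vector clock sketch: node -> counter
final case class VectorClock(versions: Map[String, Long]) {

  // Record an update made by `node`.
  def increment(node: String): VectorClock =
    VectorClock(versions.updated(node, versions.getOrElse(node, 0L) + 1L))

  // Monotonic merge: take the maximum counter seen for every node.
  def merge(other: VectorClock): VectorClock =
    VectorClock((versions.keySet ++ other.versions.keySet).map { node =>
      node -> math.max(versions.getOrElse(node, 0L), other.versions.getOrElse(node, 0L))
    }.toMap)

  // True if this clock has seen everything `other` has seen.
  def isSameOrAfter(other: VectorClock): Boolean =
    other.versions.forall { case (node, counter) => versions.getOrElse(node, 0L) >= counter }

  // Neither clock dominates the other: the updates happened concurrently.
  def isConcurrentWith(other: VectorClock): Boolean =
    !isSameOrAfter(other) && !other.isSameOrAfter(this)
}

object VectorClockExample extends App {
  val a = VectorClock(Map.empty).increment("nodeA")
  val b = VectorClock(Map.empty).increment("nodeB")
  println(a.isConcurrentWith(b)) // true: changes on different nodes with no causal order
  println(a.merge(b))            // VectorClock(Map(nodeA -> 1, nodeB -> 1))
}
```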
#### Gossip Convergence
Information about the cluster converges locally at a node at certain points in time.
This is when a node can prove that the cluster state it is observing has been observed
by all other nodes in the cluster. Convergence is implemented by passing a set of nodes
that have seen current state version during gossip. This information is referred to as the
seen set in the gossip overview. When all nodes are included in the seen set there is
convergence.
Gossip convergence cannot occur while any nodes are `unreachable`. The nodes need
to become `reachable` again, or moved to the `down` and `removed` states
(see the @ref:[Cluster Membership Lifecycle](cluster-membership.md#membership-lifecycle) section). This only blocks the leader
from performing its cluster membership management and does not influence the application
running on top of the cluster. For example this means that during a network partition
it is not possible to add more nodes to the cluster. The nodes can join, but they
will not be moved to the `up` state until the partition has healed or the unreachable
nodes have been downed.
#### Failure Detector
The failure detector in Akka Cluster is responsible for trying to detect if a node is
`unreachable` from the rest of the cluster. For this we are using the
@ref:[Phi Accrual Failure Detector](failure-detector.md) implementation.
To be able to survive sudden abnormalities, such as garbage collection pauses and
transient network failures, the failure detector is easily @ref:[configurable](cluster.md#using-the-failure-detector)
for tuning to your environment and needs.
In a cluster each node is monitored by a few (default maximum 5) other nodes.
The nodes to monitor are selected from neighbors in a hashed ordered node ring.
This is to increase the likelihood to monitor across racks and data centers, but the order
is the same on all nodes, which ensures full coverage.
When any node is detected to be `unreachable` this data is spread to
the rest of the cluster through the @ref:[gossip](#gossip). In other words, only one node needs to
mark a node `unreachable` to have the rest of the cluster mark that node `unreachable`.
The failure detector will also detect if the node becomes `reachable` again. When
all nodes that monitored the `unreachable` node detect it as `reachable` again
the cluster, after gossip dissemination, will consider it as `reachable`.
<a id="quarantined"></a>
If system messages cannot be delivered to a node it will be quarantined and then it
cannot come back from `unreachable`. This can happen if there are too many
unacknowledged system messages (e.g. watch, Terminated, remote actor deployment,
failures of actors supervised by remote parent). Then the node needs to be moved
to the `down` or `removed` states (see @ref:[Cluster Membership Lifecycle](cluster-membership.md#membership-lifecycle))
and the actor system of the quarantined node must be restarted before it can join the cluster again.
See the following for more details:
* @ref:[Phi Accrual Failure Detector](failure-detector.md) implementation
* @ref:[Using the Failure Detector](cluster.md#using-the-failure-detector)
#### Leader
After gossip convergence a `leader` for the cluster can be determined. There is no
`leader` election process, the `leader` can always be recognised deterministically
by any node whenever there is gossip convergence. The leader is only a role, any node
can be the leader and it can change between convergence rounds.
The `leader` is the first node in sorted order that is able to take the leadership role,
where the preferred member states for a `leader` are `up` and `leaving`
(see the @ref:[Cluster Membership Lifecycle](cluster-membership.md#membership-lifecycle) for more information about member states).
The role of the `leader` is to shift members in and out of the cluster, changing
`joining` members to the `up` state or `exiting` members to the `removed`
state. Currently `leader` actions are only triggered by receiving a new cluster
state with gossip convergence.
The `leader` also has the power, if configured so, to "auto-down" a node that
according to the @ref:[Failure Detector](#failure-detector) is considered `unreachable`. This means setting
the `unreachable` node status to `down` automatically after a configured time
of unreachability.
#### Seed Nodes
The seed nodes are contact points for new nodes joining the cluster.
When a new node is started it sends a message to all seed nodes and then sends
a join command to the seed node that answers first.
The seed nodes configuration value does not have any influence on the running
cluster itself; it is only relevant for new nodes joining the cluster as it
helps them to find contact points to send the join command to; a new member
can send this command to any current member of the cluster, not only to the seed nodes.
#### Gossip Protocol
A variation of *push-pull gossip* is used to reduce the amount of gossip
information sent around the cluster. In push-pull gossip a digest is sent
representing current versions but not actual values; the recipient of the gossip
can then send back any values for which it has newer versions and also request
values for which it has outdated versions. Akka uses a single shared state with
a vector clock for versioning, so the variant of push-pull gossip used in Akka
makes use of this version to only push the actual state as needed.
Periodically, the default is every 1 second, each node chooses another random
node to initiate a round of gossip with. If less than ½ of the nodes reside in the
seen set (have seen the new state) then the cluster gossips 3 times instead of once
every second. This adjusted gossip interval is a way to speed up the convergence process
in the early dissemination phase after a state change.
The choice of node to gossip with is random but biased towards nodes that might not have seen
the current state version. During each round of gossip exchange, when convergence is not yet reached, a node
uses a very high probability (which is configurable) to gossip with another node which is not part of the seen set, i.e.
which is likely to have an older version of the state. Otherwise it gossips with any random live node.
This biased selection is a way to speed up the convergence process in the late dissemination
phase after a state change.
For clusters larger than 400 nodes (configurable, and suggested by empirical evidence)
the 0.8 probability is gradually reduced to avoid overwhelming single stragglers with
too many concurrent gossip requests. The gossip receiver also has a mechanism to
protect itself from too many simultaneous gossip messages by dropping messages that
have been enqueued in the mailbox for too long a time.
While the cluster is in a converged state the gossiper only sends a small gossip status message containing the gossip
version to the chosen node. As soon as there is a change to the cluster (meaning non-convergence)
then it goes back to biased gossip again.
The recipient of the gossip state or the gossip status can use the gossip version
(vector clock) to determine whether:
1. it has a newer version of the gossip state, in which case it sends that back
to the gossiper
2. it has an outdated version of the state, in which case the recipient requests
the current state from the gossiper by sending back its version of the gossip state
3. it has conflicting gossip versions, in which case the different versions are merged
and sent back
If the recipient and the gossip have the same version then the gossip state is
not sent or requested.
The periodic nature of the gossip has a nice batching effect of state changes,
e.g. joining several nodes quickly after each other to one node results in only
one state change being spread to other members in the cluster.
The gossip messages are serialized with [protobuf](https://code.google.com/p/protobuf/) and also gzipped to reduce payload
size.
View file
@@ -1,8 +1,18 @@
# Cluster Usage
This document describes how to use Akka Cluster and the Cluster APIs.
For specific documentation topics see:
* @ref:[Cluster Specification](cluster-specification.md)
* @ref:[Cluster Membership Service](cluster-membership.md)
* @ref:[When and where to use Akka Cluster](choosing-cluster.md)
* @ref:[Higher level Cluster tools](#higher-level-cluster-tools)
* @ref:[Rolling Updates](../additional/rolling-updates.md)
* @ref:[Operating, Managing, Observability](../additional/operations.md)
## Dependency
To use Akka Cluster, add the following dependency in your project:
@@dependency[sbt,Maven,Gradle] {
group=com.typesafe.akka
@@ -10,16 +20,17 @@ To use Akka Cluster Typed, you must add the following dependency in your project
version=$akka.version$
}
## Introduction
For an introduction to Akka Cluster concepts see @ref:[Cluster Specification](../common/cluster.md). This documentation shows how to use the typed
Cluster API.
You need to enable @ref:[serialization](../serialization.md) for your actor messages.
@ref:[Serialization with Jackson](../serialization-jackson.md) is a good choice in many cases and our
recommendation if you don't have another preference.
## Examples
## Cluster API Extension
The Cluster extension gives you access to management tasks such as @ref:[Joining, Leaving and Downing](cluster-membership.md#user-actions)
and subscription of cluster membership events such as @ref:[MemberUp, MemberRemoved and UnreachableMember](cluster-membership.md#membership-lifecycle),
which are exposed as event APIs.
It does this through these references on the `Cluster` extension:
* manager: An @scala[`ActorRef[akka.cluster.typed.ClusterCommand]`]@java[`ActorRef<akka.cluster.typed.ClusterCommand>`] where a `ClusterCommand` is a command such as: `Join`, `Leave` and `Down`
* subscriptions: An @scala[`ActorRef[akka.cluster.typed.ClusterStateSubscription]`]@java[`ActorRef<akka.cluster.typed.ClusterStateSubscription>`] where a `ClusterStateSubscription` is one of `GetCurrentState` or `Subscribe` and `Unsubscribe` to cluster events like `MemberRemoved`
* state: The current `CurrentClusterState`
All of the examples below assume the following imports:
@@ -29,18 +40,12 @@ Scala
Java
: @@snip [BasicClusterExampleTest.java](/akka-cluster-typed/src/test/java/jdocs/akka/cluster/typed/BasicClusterExampleTest.java) { #cluster-imports }
<a id="basic-cluster-configuration"></a>
And the minimum configuration required is to set a host/port for remoting and the `akka.actor.provider = "cluster"`. And the minimum configuration required is to set a host/port for remoting and the `akka.actor.provider = "cluster"`.
@@snip [BasicClusterExampleSpec.scala](/akka-cluster-typed/src/test/scala/docs/akka/cluster/typed/BasicClusterExampleSpec.scala) { #config-seeds } @@snip [BasicClusterExampleSpec.scala](/akka-cluster-typed/src/test/scala/docs/akka/cluster/typed/BasicClusterExampleSpec.scala) { #config-seeds }
Starting the cluster on each node:
## Cluster API extension
The typed Cluster extension gives access to management tasks (Joining, Leaving, Downing, …) and subscription of
cluster membership events (MemberUp, MemberRemoved, UnreachableMember, etc). Those are exposed as two different actor
references, i.e. its a message based API.
The references are on the `Cluster` extension:
Scala Scala
: @@snip [BasicClusterExampleSpec.scala](/akka-cluster-typed/src/test/scala/docs/akka/cluster/typed/BasicClusterExampleSpec.scala) { #cluster-create } : @@snip [BasicClusterExampleSpec.scala](/akka-cluster-typed/src/test/scala/docs/akka/cluster/typed/BasicClusterExampleSpec.scala) { #cluster-create }
@ -48,16 +53,15 @@ Scala
Java Java
: @@snip [BasicClusterExampleTest.java](/akka-cluster-typed/src/test/java/jdocs/akka/cluster/typed/BasicClusterExampleTest.java) { #cluster-create } : @@snip [BasicClusterExampleTest.java](/akka-cluster-typed/src/test/java/jdocs/akka/cluster/typed/BasicClusterExampleTest.java) { #cluster-create }
The Cluster extensions gives you access to: @@@ note
The name of the cluster's `ActorSystem` must be the same for all members; it is passed in when you start the `ActorSystem`.
* manager: An @scala[`ActorRef[ClusterCommand]`]@java[`ActorRef<ClusterCommand>`] where a `ClusterCommand` is a command such as: `Join`, `Leave` and `Down` @@@
* subscriptions: An `ActorRef[ClusterStateSubscription]` where a `ClusterStateSubscription` is one of `GetCurrentState` or `Subscribe` and `Unsubscribe` to cluster events like `MemberRemoved`
* state: The current `CurrentClusterState`
### Joining and Leaving a Cluster
### Cluster Management If not using configuration to specify [seed nodes to join](#joining-to-seed-nodes), joining the cluster can be done programmatically via the `manager`.
If not using configuration to specify seeds joining the cluster can be done programmatically via the `manager`.
Scala Scala
: @@snip [BasicClusterExampleSpec.scala](/akka-cluster-typed/src/test/scala/docs/akka/cluster/typed/BasicClusterExampleSpec.scala) { #cluster-join } : @@snip [BasicClusterExampleSpec.scala](/akka-cluster-typed/src/test/scala/docs/akka/cluster/typed/BasicClusterExampleSpec.scala) { #cluster-join }
@ -65,7 +69,7 @@ Scala
Java Java
: @@snip [BasicClusterExampleTest.java](/akka-cluster-typed/src/test/java/jdocs/akka/cluster/typed/BasicClusterExampleTest.java) { #cluster-join } : @@snip [BasicClusterExampleTest.java](/akka-cluster-typed/src/test/java/jdocs/akka/cluster/typed/BasicClusterExampleTest.java) { #cluster-join }
Leaving and downing are similar e.g. [Leaving](#leaving) the cluster and [downing](#downing) a node are similar:
Scala Scala
: @@snip [BasicClusterExampleSpec.scala](/akka-cluster-typed/src/test/scala/docs/akka/cluster/typed/BasicClusterExampleSpec.scala) { #cluster-leave } : @@snip [BasicClusterExampleSpec.scala](/akka-cluster-typed/src/test/scala/docs/akka/cluster/typed/BasicClusterExampleSpec.scala) { #cluster-leave }
@ -73,13 +77,13 @@ Scala
Java Java
: @@snip [BasicClusterExampleTest.java](/akka-cluster-typed/src/test/java/jdocs/akka/cluster/typed/BasicClusterExampleTest.java) { #cluster-leave } : @@snip [BasicClusterExampleTest.java](/akka-cluster-typed/src/test/java/jdocs/akka/cluster/typed/BasicClusterExampleTest.java) { #cluster-leave }
### Cluster subscriptions ### Cluster Subscriptions
Cluster `subscriptions` can be used to receive messages when cluster state changes. For example, registering Cluster `subscriptions` can be used to receive messages when cluster state changes. For example, registering
for all `MemberEvent`s, then using the `manager` to have a node leave the cluster will result in events for all `MemberEvent`s, then using the `manager` to have a node leave the cluster will result in events
for the node going through the lifecycle described in @ref:[Cluster Specification](../common/cluster.md). for the node going through the @ref:[Membership Lifecycle](cluster-membership.md#membership-lifecycle).
This example subscribes with a @scala[`subscriber: ActorRef[MemberEvent]`]@java[`ActorRef<MemberEvent> subscriber`]: This example subscribes to a @scala[`subscriber: ActorRef[MemberEvent]`]@java[`ActorRef<MemberEvent> subscriber`]:
Scala Scala
: @@snip [BasicClusterExampleSpec.scala](/akka-cluster-typed/src/test/scala/docs/akka/cluster/typed/BasicClusterExampleSpec.scala) { #cluster-subscribe } : @@snip [BasicClusterExampleSpec.scala](/akka-cluster-typed/src/test/scala/docs/akka/cluster/typed/BasicClusterExampleSpec.scala) { #cluster-subscribe }
@ -95,15 +99,328 @@ Scala
Java Java
: @@snip [BasicClusterExampleTest.java](/akka-cluster-typed/src/test/java/jdocs/akka/cluster/typed/BasicClusterExampleTest.java) { #cluster-leave-example } : @@snip [BasicClusterExampleTest.java](/akka-cluster-typed/src/test/java/jdocs/akka/cluster/typed/BasicClusterExampleTest.java) { #cluster-leave-example }
## Serialization
See [serialization](https://doc.akka.io/docs/akka/current/scala/serialization.html) for how messages are sent between ### Cluster State
ActorSystems. Actor references are typically included in the messages,
since there is no `sender`. To serialize actor references to/from string representation you will use the `ActorRefResolver`. Instead of subscribing to cluster events it can sometimes be convenient to only get the full membership state with
For example here's how a serializer could look for the `Ping` and `Pong` messages above: @scala[`Cluster(system).state`]@java[`Cluster.get(system).state()`]. Note that this state is not necessarily in sync with the events published to a
cluster subscription.
See [Cluster Membership](cluster-membership.md#member-events) for more information on member events specifically.
There are more types of change events; consult the API documentation
of classes that extend `akka.cluster.ClusterEvent.ClusterDomainEvent` for details about the events.
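For instance, a node could inspect the current membership snapshot like this (a small sketch; filtering on `Up` members is just an example):

```scala
import akka.actor.typed.ActorSystem
import akka.cluster.MemberStatus
import akka.cluster.typed.Cluster

// Logs the addresses of all members currently in the Up state, using the
// (possibly slightly stale) snapshot returned by Cluster(system).state.
def logUpMembers(system: ActorSystem[_]): Unit = {
  val state = Cluster(system).state
  state.members.filter(_.status == MemberStatus.Up).foreach { member =>
    system.log.info("Up member: {}", member.address)
  }
}
```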
## Cluster Membership API
The `akka.cluster.seed-nodes` are initial contact points for [automatically](#joining-automatically-to-seed-nodes-with-cluster-bootstrap)
or [manually](#joining-configured-seed-nodes) joining a cluster.
After the joining process the seed nodes are not special and they participate in the cluster in exactly the same
way as other nodes.
### Joining configured seed nodes
When a new node is started it sends a message to all seed nodes and then sends a join command to the one that
answers first. If none of the seed nodes replies (they might not be started yet)
it retries this procedure until successful or shutdown.
You define the seed nodes in the [configuration](#configuration) file (application.conf):
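(A sketch; the system name `ClusterSystem` and the host names are placeholders, mirroring the system-property form shown next.)

```
akka.cluster.seed-nodes = [
  "akka://ClusterSystem@host1:2552",
  "akka://ClusterSystem@host2:2552"]
```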
This can also be defined as Java system properties when starting the JVM using the following syntax:
```
-Dakka.cluster.seed-nodes.0=akka://ClusterSystem@host1:2552
-Dakka.cluster.seed-nodes.1=akka://ClusterSystem@host2:2552
```
The seed nodes can be started in any order and it is not necessary to have all
seed nodes running, but the node configured as the **first element** in the `seed-nodes`
configuration list must be started when initially starting a cluster, otherwise the
other seed-nodes will not become initialized and no other node can join the cluster.
The reason for the special first seed node is to avoid forming separated islands when
starting from an empty cluster.
It is quickest to start all configured seed nodes at the same time (order doesn't matter),
otherwise it can take up to the configured `seed-node-timeout` until the nodes
can join.
Once more than two seed nodes have been started it is no problem to shut down the first
seed node. If the first seed node is restarted, it will first try to join the other
seed nodes in the existing cluster. Note that if you stop all seed nodes at the same time
and restart them with the same `seed-nodes` configuration they will join themselves and
form a new cluster instead of joining the remaining nodes of the existing cluster. That is
likely not desired and should be avoided by listing several nodes as seed nodes for redundancy,
and by not stopping all of them at the same time.
Note that if you are going to start the nodes on different machines you need to specify the
IP addresses or host names of the machines in `application.conf` instead of `127.0.0.1`.
### Joining automatically to seed nodes with Cluster Bootstrap
Automatic discovery of nodes for the joining process is available
using the open source Akka Management project's module,
@ref:[Cluster Bootstrap](../additional/operations.md#cluster-bootstrap).
Please refer to its documentation for more details.
### Joining programmatically to seed nodes
@@include[cluster.md](../includes/cluster.md) { #join-seeds-programmatic }
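In the typed API this amounts to sending a `JoinSeedNodes` command to the `manager`; a minimal sketch (the addresses are placeholders and would normally come from configuration or service discovery):

```scala
import akka.actor.AddressFromURIString
import akka.actor.typed.ActorSystem
import akka.cluster.typed.{ Cluster, JoinSeedNodes }

// Joins the cluster via the given seed nodes, programmatically instead of via configuration.
def joinSeedNodes(system: ActorSystem[_]): Unit = {
  val seedNodes = List(
    AddressFromURIString("akka://ClusterSystem@host1:2552"),
    AddressFromURIString("akka://ClusterSystem@host2:2552"))
  Cluster(system).manager ! JoinSeedNodes(seedNodes)
}
```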
### Tuning joins
Unsuccessful attempts to contact seed nodes are automatically retried after the time period defined in
the configuration property `seed-node-timeout`. An unsuccessful attempt to join a specific seed node is
automatically retried after the configured `retry-unsuccessful-join-after`. Retrying means that it
tries to contact all seed nodes and then joins the node that answers first. The first node in the list
of seed nodes will join itself if it cannot contact any of the other seed nodes within the
configured `seed-node-timeout`.
The joining of given seed nodes will by default be retried indefinitely until
a successful join. That process can be aborted if unsuccessful by configuring a
timeout. When aborted it will run @ref:[Coordinated Shutdown](../actors.md#coordinated-shutdown),
which by default will terminate the ActorSystem. CoordinatedShutdown can also be configured to exit
the JVM. It is useful to define this timeout if the `seed-nodes` are assembled
dynamically and a restart with new seed-nodes should be tried after unsuccessful
attempts.
```
akka.cluster.shutdown-after-unsuccessful-join-seed-nodes = 20s
akka.coordinated-shutdown.terminate-actor-system = on
```
If you don't configure seed nodes or use one of the join seed node functions you need to join the cluster manually,
which can be performed by using [JMX](#jmx) or [HTTP](#http).
You can join any node in the cluster. It does not have to be configured as a seed node.
Note that you can only join an existing cluster member, which means that for bootstrapping some
node must join itself, and then the following nodes can join it to make up a cluster.
An actor system can only join a cluster once. Additional attempts will be ignored.
When it has successfully joined it must be restarted to be able to join another
cluster or to join the same cluster again. It can use the same host name and port
after the restart, when it come up as new incarnation of existing member in the cluster,
trying to join in, then the existing one will be removed from the cluster and then it will
be allowed to join.
<a id="automatic-vs-manual-downing"></a>
### Downing
When a member is considered by the failure detector to be `unreachable` the
leader is not allowed to perform its duties, such as changing status of
new joining members to 'Up'. The node must first become `reachable` again, or the
status of the unreachable member must be changed to 'Down'. Changing status to 'Down'
can be performed automatically or manually. By default it must be done manually, using
@ref:[JMX](../additional/operations.md#jmx) or @ref:[HTTP](../additional/operations.md#http).
It can also be performed programmatically with @scala[`Cluster(system).down(address)`]@java[`Cluster.get(system).down(address)`].
If a node is still running and sees itself as Down it will shut down. @ref:[Coordinated Shutdown](../actors.md#coordinated-shutdown) will automatically
run if `run-coordinated-shutdown-when-down` is set to `on` (the default); however, the node will not try
to leave the cluster gracefully, so sharding and singleton migration will not occur.
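With the typed `Cluster` extension the same manual downing can be expressed by sending a `Down` command to the `manager`; a minimal sketch (the address would typically be taken from an `UnreachableMember` event):

```scala
import akka.actor.Address
import akka.actor.typed.ActorSystem
import akka.cluster.typed.{ Cluster, Down }

// Marks the member at `unreachableAddress` as Down so that the leader can remove it.
def downMember(system: ActorSystem[_], unreachableAddress: Address): Unit =
  Cluster(system).manager ! Down(unreachableAddress)
```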
A production solution for the downing problem is provided by
[Split Brain Resolver](http://developer.lightbend.com/docs/akka-commercial-addons/current/split-brain-resolver.html),
which is part of the [Lightbend Reactive Platform](http://www.lightbend.com/platform).
If you don't use RP, you should anyway carefully read the [documentation](http://developer.lightbend.com/docs/akka-commercial-addons/current/split-brain-resolver.html)
of the Split Brain Resolver and make sure that the solution you are using handles the concerns
described there.
### Auto-downing (DO NOT USE)
There is an automatic downing feature that you should not use in production. For testing you can enable it with configuration:
```
akka.cluster.auto-down-unreachable-after = 120s
```
This means that the cluster leader member will change the `unreachable` node
status to `down` automatically after the configured time of unreachability.
This is a naïve approach to remove unreachable nodes from the cluster membership.
It can be useful during development but in a production environment it will eventually break down the cluster.
When a network partition occurs, both sides of the partition will see the other side as unreachable and remove it from the cluster.
This results in the formation of two separate, disconnected, clusters (known as *Split Brain*).
This behaviour is not limited to network partitions. It can also occur if a node
in the cluster is overloaded, or experiences a long GC pause.
@@@ warning
We recommend against using the auto-down feature of Akka Cluster in production. It
has multiple undesirable consequences for production systems.
If you are using @ref:[Cluster Singleton](cluster-singleton.md) or @ref:[Cluster Sharding](cluster-sharding.md) it can break the contract provided by
those features. Both provide a guarantee that an actor will be unique in a cluster.
With the auto-down feature enabled, it is possible for multiple independent clusters
to form (*Split Brain*). When this happens the guaranteed uniqueness will no
longer be true resulting in undesirable behaviour in the system.
This is even more severe when @ref:[Akka Persistence](persistence.md) is used in
conjunction with Cluster Sharding. In this case, the lack of unique actors can
cause multiple actors to write to the same journal. Akka Persistence operates on a
single writer principle. Having multiple writers will corrupt the journal
and make it unusable.
Finally, even if you don't use features such as Persistence, Sharding, or Singletons,
auto-downing can lead the system to form multiple small clusters. These small
clusters will be independent from each other. They will be unable to communicate
and as a result you may experience performance degradation. Once this condition
occurs, it will require manual intervention in order to reform the cluster.
Because of these issues, auto-downing should **never** be used in a production environment.
@@@
### Leaving
There are two ways to remove a member from the cluster.
1. The recommended way to leave a cluster is a graceful exit, informing the cluster that a node shall leave.
This can be performed using @ref:[JMX](../additional/operations.md#jmx) or @ref:[HTTP](../additional/operations.md#http).
This method will offer faster hand off to peer nodes during node shutdown.
1. When a graceful exit is not possible, you can stop the actor system (or the JVM process, for example a SIGTERM sent from the environment). It will be detected
as unreachable and removed after the automatic or manual downing.
The @ref:[Coordinated Shutdown](../actors.md#coordinated-shutdown) will automatically run when the cluster node sees itself as
`Exiting`, i.e. leaving from another node will trigger the shutdown process on the leaving node.
Tasks for graceful leaving of cluster including graceful shutdown of Cluster Singletons and
Cluster Sharding are added automatically when Akka Cluster is used, i.e. running the shutdown
process will also trigger the graceful leaving if it's not already in progress.
Normally this is handled automatically, but in case of network failures during this process it might still
be necessary to set the node's status to `Down` in order to complete the removal. For handling network failures
see [Split Brain Resolver](http://developer.lightbend.com/docs/akka-commercial-addons/current/split-brain-resolver.html),
part of the [Lightbend Reactive Platform](http://www.lightbend.com/platform).
## Serialization
Enable @ref:[serialization](../serialization.md) to send events between ActorSystems.
@ref:[Serialization with Jackson](../serialization-jackson.md) is a good choice in many cases, and our
recommendation if you don't have other preferences or constraints.
Actor references are typically included in the messages, since there is no `sender`.
To serialize actor references to/from string representation you would use the `ActorRefResolver`.
For example here's how a serializer could look for `Ping` and `Pong` messages:
Scala Scala
: @@snip [PingSerializer.scala](/akka-cluster-typed/src/test/scala/docs/akka/cluster/typed/PingSerializer.scala) { #serializer } : @@snip [PingSerializer.scala](/akka-cluster-typed/src/test/scala/docs/akka/cluster/typed/PingSerializer.scala) { #serializer }
Java Java
: @@snip [PingSerializerExampleTest.java](/akka-cluster-typed/src/test/java/jdocs/akka/cluster/typed/PingSerializerExampleTest.java) { #serializer } : @@snip [PingSerializerExampleTest.java](/akka-cluster-typed/src/test/java/jdocs/akka/cluster/typed/PingSerializerExampleTest.java) { #serializer }
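The essential part of such a serializer is the round trip through `ActorRefResolver`; a minimal sketch of just that part (using `ActorRef[String]` as a stand-in for the reply type carried by `Ping`/`Pong`):

```scala
import java.nio.charset.StandardCharsets

import akka.actor.typed.{ ActorRef, ActorRefResolver, ActorSystem }

// Converts actor references to and from their string representation, as a
// message serializer for Ping/Pong style messages needs to do.
class ActorRefCodec(system: ActorSystem[_]) {
  private val resolver = ActorRefResolver(system)

  def toBinary(ref: ActorRef[String]): Array[Byte] =
    resolver.toSerializationFormat(ref).getBytes(StandardCharsets.UTF_8)

  def fromBinary(bytes: Array[Byte]): ActorRef[String] =
    resolver.resolveActorRef[String](new String(bytes, StandardCharsets.UTF_8))
}
```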
You can look at the
@java[@extref[Cluster example project](samples:akka-samples-cluster-java)]
@scala[@extref[Cluster example project](samples:akka-samples-cluster-scala)]
to see what this looks like in practice.
## Node Roles
Not all nodes of a cluster need to perform the same function: there might be one sub-set which runs the web front-end,
one which runs the data access layer and one for the number-crunching. Deployment of actors, for example by cluster-aware
routers, can take node roles into account to achieve this distribution of responsibilities.
The node roles are defined in the configuration property named `akka.cluster.roles`
and typically defined in the start script as a system property or environment variable.
The roles are part of the membership information in `MemberEvent` that you can subscribe to.
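For example, a node might start certain actors only when it carries a given role (the role name `backend` is just an example):

```scala
import akka.actor.typed.ActorSystem
import akka.cluster.typed.Cluster

// Only nodes started with `akka.cluster.roles = ["backend"]`
// (or -Dakka.cluster.roles.0=backend) run the backend-specific initialization.
def initBackendIfApplicable(system: ActorSystem[_]): Unit =
  if (Cluster(system).selfMember.hasRole("backend"))
    system.log.info("Starting backend-specific actors on {}", Cluster(system).selfMember.address)
```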
## Failure Detector
The nodes in the cluster monitor each other by sending heartbeats to detect if a node is
unreachable from the rest of the cluster. Please see:
* @ref:[Failure Detector specification](cluster-specification.md#failure-detector)
* @ref:[Phi Accrual Failure Detector](failure-detector.md) implementation
* [Using the Failure Detector](#using-the-failure-detector)
### Using the Failure Detector
Cluster uses the `akka.remote.PhiAccrualFailureDetector` failure detector by default, or you can provide your own by
implementing the `akka.remote.FailureDetector` and configuring it:
```
akka.cluster.failure-detector.implementation-class = "com.example.CustomFailureDetector"
```
In the [Cluster Configuration](#configuration) you may want to adjust these
depending on your environment (a configuration sketch follows the list):
* When a *phi* value is considered to be a failure `akka.cluster.failure-detector.threshold`
* Margin of error for sudden abnormalities `akka.cluster.failure-detector.acceptable-heartbeat-pause`
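A sketch of what such tuning could look like (the values are only illustrative; the defaults are 8 for the threshold and 3 seconds for the acceptable heartbeat pause):

```
akka.cluster.failure-detector {
  # phi value above which a node is considered unreachable
  threshold = 12
  # margin for sudden abnormalities such as GC pauses or transient network failures
  acceptable-heartbeat-pause = 5s
}
```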
## How to test
Akka comes with and uses several types of testing strategies:
* @ref:[Testing](testing.md)
* @ref:[Multi Node Testing](../multi-node-testing.md)
* @ref:[Multi JVM Testing](../multi-jvm-testing.md)
## Configuration
There are several configuration properties for the cluster. Refer to the
@ref:[reference configuration](../general/configuration.md#config-akka-cluster) for full
configuration descriptions, default values and options.
### Cluster Info Logging
You can silence the logging of cluster events at info level with configuration property:
```
akka.cluster.log-info = off
```
You can enable verbose logging of cluster events at info level, e.g. for temporary troubleshooting, with configuration property:
```
akka.cluster.log-info-verbose = on
```
### Cluster Dispatcher
The cluster extension is implemented with actors. To protect them against
disturbance from user actors they are by default run on the internal dispatcher configured
under `akka.actor.internal-dispatcher`. The cluster actors can potentially be isolated even
further, onto their own dispatcher using the setting `akka.cluster.use-dispatcher`
or made to run on the same dispatcher to keep the number of threads down.
### Configuration Compatibility Check
Creating a cluster is about deploying two or more nodes and making them behave as if they were one single application. Therefore it's extremely important that all nodes in a cluster are configured with compatible settings.
The Configuration Compatibility Check feature ensures that all nodes in a cluster have a compatible configuration. Whenever a new node is joining an existing cluster, a subset of its configuration settings (only those that are required to be checked) is sent to the nodes in the cluster for verification. Once the configuration is checked on the cluster side, the cluster sends back its own set of required configuration settings. The joining node will then verify if it's compliant with the cluster configuration. The joining node will only proceed if all checks pass, on both sides.
New custom checkers can be added by extending `akka.cluster.JoinConfigCompatChecker` and including them in the configuration. Each checker must be associated with a unique key:
```
akka.cluster.configuration-compatibility-check.checkers {
my-custom-config = "com.company.MyCustomJoinConfigCompatChecker"
}
```
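A corresponding checker could look roughly like this (the class name and the checked key are examples, and the exact `JoinConfigCompatChecker` signatures are assumed from recent Akka versions):

```scala
import scala.collection.{ immutable => im }

import akka.cluster.{ ConfigValidation, JoinConfigCompatChecker }
import com.typesafe.config.Config

// Requires the joining node's value of `my-app.important-setting`
// to be present and equal to the value used by the cluster.
class MyCustomJoinConfigCompatChecker extends JoinConfigCompatChecker {

  override val requiredKeys: im.Seq[String] = im.Seq("my-app.important-setting")

  override def check(toCheck: Config, actualConfig: Config): ConfigValidation =
    JoinConfigCompatChecker.fullMatch(requiredKeys, toCheck, actualConfig)
}
```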
@@@ note
Configuration Compatibility Check is enabled by default, but can be disabled by setting `akka.cluster.configuration-compatibility-check.enforce-on-join = off`. This is especially useful when performing rolling updates. Obviously this should only be done if a complete cluster shutdown isn't an option. A cluster whose nodes have different configuration settings may suffer data loss or data corruption.
This setting should only be disabled on the joining nodes. The checks are always performed on both sides, and warnings are logged. In case of incompatibilities, it is the responsibility of the joining node to decide if the process should be interrupted or not.
If you are performing a rolling update on cluster using Akka 2.5.9 or prior (thus, not supporting this feature), the checks will not be performed because the running cluster has no means to verify the configuration sent by the joining node, nor to send back its own configuration.
@@@
## Higher level Cluster tools
@@include[cluster.md](../includes/cluster.md) { #cluster-singleton }
See @ref:[Cluster Singleton](cluster-singleton.md).
@@include[cluster.md](../includes/cluster.md) { #cluster-sharding }
See @ref:[Cluster Sharding](cluster-sharding.md).
@@include[cluster.md](../includes/cluster.md) { #cluster-ddata }
See @ref:[Distributed Data](distributed-data.md).
@@include[cluster.md](../includes/cluster.md) { #cluster-pubsub }
Classic Pub Sub can be used by leveraging the `.toClassic` adapters.
See @ref:[Distributed Publish Subscribe in Cluster](../distributed-pub-sub.md). A typed API is tracked in @github[#26338](#26338).
@@include[cluster.md](../includes/cluster.md) { #cluster-multidc }
See @ref:[Cluster Multi-DC](../cluster-dc.md). A typed API for multi-dc sharding is tracked in @github[#27705](#27705).

View file

@ -0,0 +1,72 @@
# Phi Accrual Failure Detector
## Introduction
Remote DeathWatch uses heartbeat messages and the failure detector to
* detect network failures and JVM crashes
* generate the `Terminated` message to the watching actor on failure
* gracefully terminate watched actors
The heartbeat arrival times are interpreted by an implementation of
[The Phi Accrual Failure Detector](https://pdfs.semanticscholar.org/11ae/4c0c0d0c36dc177c1fff5eb84fa49aa3e1a8.pdf) by Hayashibara et al.
## Failure Detector Heartbeats
Heartbeats are sent every second by default, which is configurable. They are performed in a request/reply handshake, and the replies are input to the failure detector.
The suspicion level of failure is represented by a value called *phi*.
The basic idea of the phi failure detector is to express the value of *phi* on a scale that
is dynamically adjusted to reflect current network conditions.
The value of *phi* is calculated as:
```
phi = -log10(1 - F(timeSinceLastHeartbeat))
```
where F is the cumulative distribution function of a normal distribution with mean
and standard deviation estimated from historical heartbeat inter-arrival times.
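As a concrete illustration of that formula, phi can be computed along the following lines (a sketch; the logistic approximation of the normal CDF and its constants mirror what Akka's `PhiAccrualFailureDetector` uses internally, but are an assumption here):

```scala
import scala.math.{ exp, log10 }

// phi = -log10(1 - F(timeSinceLastHeartbeat)), with F approximated by a logistic function.
// mean and stdDeviation are estimated from the history of heartbeat inter-arrival times.
def phi(timeSinceLastHeartbeatMillis: Double, meanMillis: Double, stdDeviationMillis: Double): Double = {
  val y = (timeSinceLastHeartbeatMillis - meanMillis) / stdDeviationMillis
  val e = exp(-y * (1.5976 + 0.070566 * y * y))
  if (timeSinceLastHeartbeatMillis > meanMillis) -log10(e / (1.0 + e))
  else -log10(1.0 - 1.0 / (1.0 + e))
}

// phi is about 0.3 when the silence equals the mean inter-arrival time
// and grows rapidly as the silence lengthens.
```

Note that both branches compute the same value; they only differ in how the intermediate result is evaluated to stay numerically stable.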
An accrual failure detector decouples monitoring and interpretation. That makes
it applicable to a wider range of scenarios and more adequate for building generic
failure detection services. The idea is that it keeps a history of failure
statistics, calculated from heartbeats received from other nodes, and
tries to make educated guesses by taking multiple factors, and how they
accumulate over time, into account in order to come up with a better guess of whether a
specific node is up or down. Rather than only answering "yes" or "no" to the
question "is the node down?" it returns a `phi` value representing the
likelihood that the node is down.
The following chart illustrates how *phi* increases with increasing time since the
previous heartbeat.
![phi1.png](../images/phi1.png)
Phi is calculated from the mean and standard deviation of historical
inter-arrival times. The previous chart is an example for a standard deviation
of 200 ms. If the heartbeats arrive with less deviation the curve becomes steeper,
i.e. it is possible to determine failure more quickly. The curve looks like this for
a standard deviation of 100 ms.
![phi2.png](../images/phi2.png)
To be able to survive sudden abnormalities, such as garbage collection pauses and
transient network failures, the failure detector is configured with a margin, which
you may want to adjust depending on your environment.
This is how the curve looks for `failure-detector.acceptable-heartbeat-pause` configured to
3 seconds.
![phi3.png](../images/phi3.png)
## Failure Detector Threshold
The `threshold` that is the basis for the calculation is configurable by the
user.
* A low `threshold` is prone to generate many false positives but ensures
a quick detection in the event of a real crash.
* Conversely, a high `threshold` generates fewer mistakes but needs more time to detect actual crashes.
* The default `threshold` is 8 and is appropriate for most situations. However in
cloud environments, such as Amazon EC2, the value could be increased to 12 in
order to account for network issues that sometimes occur on such platforms.

View file

@ -4,8 +4,9 @@
@@@ index @@@ index
* [cluster-specification](../common/cluster.md)
* [cluster](cluster.md) * [cluster](cluster.md)
* [cluster-specification](cluster-specification.md)
* [cluster-membership](cluster-membership.md)
* [distributed-data](distributed-data.md) * [distributed-data](distributed-data.md)
* [cluster-singleton](cluster-singleton.md) * [cluster-singleton](cluster-singleton.md)
* [cluster-sharding](cluster-sharding.md) * [cluster-sharding](cluster-sharding.md)
@ -17,5 +18,6 @@
* [remoting-artery](../remoting-artery.md) * [remoting-artery](../remoting-artery.md)
* [remoting](../remoting.md) * [remoting](../remoting.md)
* [coordination](../coordination.md) * [coordination](../coordination.md)
* [choosing-cluster](choosing-cluster.md)
@@@ @@@

View file

@ -47,6 +47,8 @@ object Paradox {
"index.html", "index.html",
// Page that recommends Alpakka: // Page that recommends Alpakka:
"camel.html", "camel.html",
// Page linked to from many others, but not in a TOC
"typed/failure-detector.html",
// TODO page not linked to // TODO page not linked to
"fault-tolerance-sample.html")) "fault-tolerance-sample.html"))