=cls #17447 Split Cluster Sharding docs into java/scala

Patrik Nordwall 2015-06-30 11:43:37 +02:00
parent 202e64722c
commit 89f17ddfd0
9 changed files with 338 additions and 88 deletions

View file

@@ -152,4 +152,12 @@ You can plug-in your own metrics collector instead of built-in
Look at those two implementations for inspiration.
Custom metrics collector implementation class must be specified in the :ref:`cluster_metrics_configuration_scala`.
Custom metrics collector implementation class must be specified in the
``akka.cluster.metrics.collector.provider`` configuration property.
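For illustration only (not part of this commit), a hypothetical custom collector class could be
configured like this::

  # "com.example.CustomMetricsCollector" is a made-up class name for illustration
  akka.cluster.metrics.collector.provider = "com.example.CustomMetricsCollector"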
Configuration
-------------
The Cluster metrics extension can be configured with the following properties:
.. includecode:: ../../../akka-cluster-metrics/src/main/resources/reference.conf

View file

@@ -1,4 +1,4 @@
.. _cluster-sharding:
.. _cluster_sharding_scala:
Cluster Sharding
================
@@ -22,66 +22,8 @@ the sender to know the location of the destination actor. This is achieved by se
the messages via a ``ShardRegion`` actor provided by this extension, which knows how
to route the message with the entity id to the final destination.
An Example in Java
------------------
This is what an entity actor may look like:
.. includecode:: ../../../akka-cluster-sharding/src/test/java/akka/cluster/sharding/ClusterShardingTest.java#counter-actor
The above actor uses event sourcing and the support provided in ``UntypedPersistentActor`` to store its state.
It does not have to be a persistent actor, but in case of failure or migration of entities between nodes it must be able to recover
its state if that state is valuable.
Note how the ``persistenceId`` is defined. You may define it another way, but it must be unique.
When using the sharding extension you should first register the supported entity types with the
``ClusterSharding.start`` method, typically at system startup on each node in the cluster.
``ClusterSharding.start`` returns a reference which you can pass along.
.. includecode:: ../../../akka-cluster-sharding/src/test/java/akka/cluster/sharding/ClusterShardingTest.java#counter-start
The ``messageExtractor`` defines application-specific methods to extract the entity
identifier and the shard identifier from incoming messages.
.. includecode:: ../../../akka-cluster-sharding/src/test/java/akka/cluster/sharding/ClusterShardingTest.java#counter-extractor
This example illustrates two different ways to define the entity identifier in the messages:
* The ``Get`` message includes the identifier itself.
* The ``EntityEnvelope`` holds the identifier, and the actual message that is
sent to the entity actor is wrapped in the envelope.
Note how these two message types are handled in the ``entityId`` and ``entityMessage`` methods shown above.
The message sent to the entity actor is what ``entityMessage`` returns, which makes it possible to unwrap envelopes
if needed.
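As an aside (not part of this commit), here is a rough Scala sketch of the same idea; the Java example
above expresses it with a ``messageExtractor``, and the ``Get`` and ``EntityEnvelope`` types below are
illustrative stand-ins for the ones in the linked code::

  import akka.cluster.sharding.ShardRegion

  // illustrative message types, mirroring the ones described above
  final case class Get(counterId: Long)
  final case class EntityEnvelope(id: Long, payload: Any)

  val extractEntityId: ShardRegion.ExtractEntityId = {
    // the Get message carries the entity identifier itself
    case msg @ Get(counterId)        => (counterId.toString, msg)
    // the envelope holds the identifier; only the payload reaches the entity actor
    case EntityEnvelope(id, payload) => (id.toString, payload)
  }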
A shard is a group of entities that will be managed together. The grouping is defined by the
``shardId`` method shown above. For a specific entity identifier the shard identifier must always
be the same. Otherwise the entity actor might accidentally be started in several places at the same time.
Creating a good sharding algorithm is an interesting challenge in itself. Try to produce a uniform distribution,
i.e. the same number of entities in each shard. As a rule of thumb, the number of shards should be a factor of ten greater
than the planned maximum number of cluster nodes. Fewer shards than the number of nodes will result in some nodes
not hosting any shards. Too many shards will result in less efficient management of the shards, e.g. rebalancing
overhead, and increased latency because the coordinator is involved in the routing of the first message for each
shard. The sharding algorithm must be the same on all nodes in a running cluster. It can be changed after stopping
all nodes in the cluster.
A simple sharding algorithm that works fine in most cases is to take the absolute value of the ``hashCode`` of
the entity identifier modulo the number of shards. As a convenience this is provided by the
``ShardRegion.HashCodeMessageExtractor``.
Messages to the entities are always sent via the local ``ShardRegion``. The ``ShardRegion`` actor reference for a
named entity type is returned by ``ClusterSharding.start`` and it can also be retrieved with ``ClusterSharding.shardRegion``.
The ``ShardRegion`` will look up the location of the shard for the entity if it does not already know its location. It will
delegate the message to the right node and it will create the entity actor on demand, i.e. when the
first message for a specific entity is delivered.
.. includecode:: ../../../akka-cluster-sharding/src/test/java/akka/cluster/sharding/ClusterShardingTest.java#counter-usage
An Example in Scala
-------------------
An Example
----------
This is what an entity actor may look like:
@@ -91,7 +33,8 @@ The above actor uses event sourcing and the support provided in ``PersistentActo
It does not have to be a persistent actor, but in case of failure or migration of entities between nodes it must be able to recover
its state if that state is valuable.
Note how the ``persistenceId`` is defined. You may define it another way, but it must be unique.
Note how the ``persistenceId`` is defined. The name of the actor is the entity identifier (UTF-8 URL-encoded).
You may define it another way, but it must be unique.
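A minimal sketch (not part of this commit) of deriving the ``persistenceId`` from the actor name;
the ``Counter-`` prefix and the event handling are illustrative::

  import akka.persistence.PersistentActor

  class Counter extends PersistentActor {
    // the shard region names each entity actor after its (URL-encoded) entity identifier,
    // so the actor name plus a type prefix yields a unique persistenceId
    override def persistenceId: String = "Counter-" + self.path.name

    var count = 0

    override def receiveRecover: Receive = {
      case delta: Int => count += delta
    }

    override def receiveCommand: Receive = {
      case delta: Int => persist(delta) { d => count += d }
    }
  }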
When using the sharding extension you should first, typically at system startup on each node
in the cluster, register the supported entity types with the ``ClusterSharding.start``
@@ -126,8 +69,9 @@ overhead, and increased latency because the coordinator is involved in the routi
shard. The sharding algorithm must be the same on all nodes in a running cluster. It can be changed after stopping
all nodes in the cluster.
A simple sharding algorithm that works fine in most cases is to take the ``hashCode`` of the entity identifier modulo
the number of shards.
A simple sharding algorithm that works fine in most cases is to take the absolute value of the ``hashCode`` of
the entity identifier modulo the number of shards. As a convenience this is provided by the
``ShardRegion.HashCodeMessageExtractor``.
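Not part of this commit, but as a concrete sketch of that algorithm (reusing the illustrative ``Get``
and ``EntityEnvelope`` messages from above, with an assumed ``numberOfShards``)::

  import akka.cluster.sharding.ShardRegion

  val numberOfShards = 100 // roughly a factor of ten above the planned maximum number of nodes

  val extractShardId: ShardRegion.ExtractShardId = {
    // the same entity identifier must always map to the same shard identifier
    case Get(counterId)        => (math.abs(counterId.hashCode) % numberOfShards).toString
    case EntityEnvelope(id, _) => (math.abs(id.hashCode) % numberOfShards).toString
  }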
Messages to the entities are always sent via the local ``ShardRegion``. The ``ShardRegion`` actor reference for a
named entity type is returned by ``ClusterSharding.start`` and it can also be retrieved with ``ClusterSharding.shardRegion``.
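For illustration (not part of this commit; ``system``, the ``"Counter"`` type name and the ``Get``
message are assumptions carried over from the sketches above), looking up the region and sending a
message could look like::

  import akka.actor.ActorRef
  import akka.cluster.sharding.ClusterSharding

  // the region routes on the entity id and creates the entity actor on demand
  val counterRegion: ActorRef = ClusterSharding(system).shardRegion("Counter")
  counterRegion ! Get(100)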
@@ -205,7 +149,7 @@ Thereafter the coordinator will reply to requests for the location of
the shard and thereby allocate a new home for the shard, and then buffered messages in the
``ShardRegion`` actors are delivered to the new location. This means that the state of the entities
is not transferred or migrated. If the state of the entities is of importance it should be
persistent (durable), e.g. with ``akka-persistence``, so that it can be recovered at the new
persistent (durable), e.g. with :ref:`persistence-scala`, so that it can be recovered at the new
location.
The logic that decides which shards to rebalance is defined in a pluggable shard
@@ -217,7 +161,7 @@ must be to begin the rebalancing. This strategy can be replaced by an applicatio
implementation.
The state of shard locations in the ``ShardCoordinator`` is persistent (durable) with
``akka-persistence`` to survive failures. Since it is running in a cluster, ``akka-persistence``
:ref:`persistence-scala` to survive failures. Since it is running in a cluster, :ref:`persistence-scala`
must be configured with a distributed journal. When a crashed or unreachable coordinator
node has been removed (via down) from the cluster, a new ``ShardCoordinator`` singleton
actor will take over and the state is recovered. During such a failure period shards
@@ -228,7 +172,7 @@ As long as a sender uses the same ``ShardRegion`` actor to deliver messages to a
actor, the order of the messages is preserved. As long as the buffer limit is not reached,
messages are delivered on a best-effort basis, with at-most-once delivery semantics,
in the same way as ordinary message sending. Reliable end-to-end messaging, with
at-least-once semantics can be added by using ``AtLeastOnceDelivery`` in ``akka-persistence``.
at-least-once semantics can be added by using ``AtLeastOnceDelivery`` in :ref:`persistence-scala`.
Some additional latency is introduced for messages targeted to new or previously
unused shards due to the round-trip to the coordinator. Rebalancing of shards may
@@ -275,7 +219,7 @@ for that entity has been received in the ``Shard``. Entities will not be restart
using a ``Passivate``.
Note that the state of the entities themselves will not be restored unless they have been made persistent,
e.g. with ``akka-persistence``.
e.g. with :ref:`persistence-scala`.
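A small sketch of passivation from inside an entity actor (not part of this commit; the idle timeout
and message handling are illustrative)::

  import akka.actor.{ Actor, PoisonPill, ReceiveTimeout }
  import akka.cluster.sharding.ShardRegion.Passivate
  import scala.concurrent.duration._

  class IdleAwareEntity extends Actor {
    // ask the parent Shard to passivate this entity after two minutes of inactivity
    context.setReceiveTimeout(2.minutes)

    def receive = {
      case ReceiveTimeout => context.parent ! Passivate(stopMessage = PoisonPill)
      case msg            => sender() ! msg // echo, as a stand-in for real message handling
    }
  }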
Graceful Shutdown
-----------------
@@ -288,11 +232,7 @@ triggered by the coordinator. When the shards have been stopped the coordinator
When the ``ShardRegion`` has terminated you probably want to ``leave`` the cluster, and shut down the ``ActorSystem``.
This is how to do it in Java:
.. includecode:: ../../../akka-cluster-sharding/src/test/java/akka/cluster/sharding/ClusterShardingTest.java#graceful-shutdown
This is how to do it in Scala:
This is how to do that:
.. includecode:: ../../../akka-cluster-sharding/src/multi-jvm/scala/akka/cluster/sharding/ClusterShardingGracefulShutdownSpec.scala#graceful-shutdown
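Not part of this commit, but roughly the sequence the included snippet demonstrates: trigger
``ShardRegion.GracefulShutdown``, wait for the region to terminate, then leave the cluster.
A sketch with an assumed watcher actor::

  import akka.actor.{ Actor, ActorRef, Terminated }
  import akka.cluster.Cluster
  import akka.cluster.sharding.ShardRegion

  // illustrative watcher that leaves the cluster once the region has shut down
  class IllustrativeShutdown(region: ActorRef) extends Actor {
    context.watch(region)
    region ! ShardRegion.GracefulShutdown

    def receive = {
      case Terminated(`region`) =>
        val cluster = Cluster(context.system)
        cluster.leave(cluster.selfAddress)
        context.stop(self)
    }
  }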
@@ -316,11 +256,15 @@ maven::
Configuration
-------------
The ``ClusterSharding`` extension can be configured with the following properties:
The ``ClusterSharding`` extension can be configured with the following properties. These configuration
properties are read by the ``ClusterShardingSettings`` when created with an ``ActorSystem`` parameter.
It is also possible to amend the ``ClusterShardingSettings`` or create it from another config section
with the same layout as below. ``ClusterShardingSettings`` is a parameter to the ``start`` method of
the ``ClusterSharding`` extension, i.e. each entity type can be configured with different settings
if needed.
.. includecode:: ../../../akka-cluster-sharding/src/main/resources/reference.conf#sharding-ext-config
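As a sketch of the settings handling described above (not part of this commit; the ``counter-sharding``
section name, the ``Counter`` props and the extractors are assumptions from the earlier sketches),
creating the settings from another config section and passing them to ``start`` might look like::

  import akka.actor.Props
  import akka.cluster.sharding.{ ClusterSharding, ClusterShardingSettings }

  // a custom config section with the same layout as akka.cluster.sharding
  val settings = ClusterShardingSettings(
    system.settings.config.getConfig("counter-sharding"))

  val counterRegion = ClusterSharding(system).start(
    typeName        = "Counter",
    entityProps     = Props[Counter],
    settings        = settings,
    extractEntityId = extractEntityId,
    extractShardId  = extractShardId)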
A custom shard allocation strategy can be defined in an optional parameter to
``ClusterSharding.start``. See the API documentation of ``ShardAllocationStrategy``
(Scala) or ``AbstractShardAllocationStrategy`` (Java) for details of how to implement a custom
shard allocation strategy.
``ClusterSharding.start``. See the API documentation of ``ShardAllocationStrategy`` for details of
how to implement a custom shard allocation strategy.
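Purely as a sketch (not part of this commit, and the method signatures below are my reading of the
``ShardCoordinator.ShardAllocationStrategy`` API rather than a verbatim copy), a toy strategy that
allocates to the region with the fewest shards and never rebalances::

  import scala.collection.immutable
  import scala.concurrent.Future
  import akka.actor.ActorRef
  import akka.cluster.sharding.{ ShardCoordinator, ShardRegion }

  class FewestShardsAllocationStrategy extends ShardCoordinator.ShardAllocationStrategy {

    override def allocateShard(
        requester: ActorRef,
        shardId: ShardRegion.ShardId,
        currentShardAllocations: Map[ActorRef, immutable.IndexedSeq[ShardRegion.ShardId]]): Future[ActorRef] =
      // pick the region that currently hosts the fewest shards
      Future.successful(currentShardAllocations.minBy { case (_, shards) => shards.size }._1)

    override def rebalance(
        currentShardAllocations: Map[ActorRef, immutable.IndexedSeq[ShardRegion.ShardId]],
        rebalanceInProgress: Set[ShardRegion.ShardId]): Future[Set[ShardRegion.ShardId]] =
      // never trigger a rebalance in this toy example
      Future.successful(Set.empty)
  }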

View file

@@ -316,7 +316,7 @@ Distributes actors across several nodes in the cluster and supports interaction
with the actors using their logical identifier, but without having to care about
their physical location in the cluster.
See :ref:`cluster-sharding` in the contrib module.
See :ref:`cluster_sharding_scala`.
Distributed Publish Subscribe
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -336,6 +336,14 @@ actor is running.
See :ref:`cluster-client` in the contrib module.
Distributed Data
^^^^^^^^^^^^^^^^
*Akka Distributed Data* is useful when you need to share data between nodes in an
Akka Cluster. The data is accessed with an actor providing an API similar to that of a key-value store.
See :ref:`distributed_data_scala`.
Failure Detector
^^^^^^^^^^^^^^^^