rememberingEntities with ddata mode, #22154

* one Replicator per configured role
* log LMDB directory at startup
* clarify the importance of the LMDB directory
* use more than one key to support many entities
Patrik Nordwall 2017-01-18 16:28:24 +01:00
parent 8fd5b7e53e
commit 37679d307e
23 changed files with 713 additions and 337 deletions


@@ -185,6 +185,8 @@ unused shards due to the round-trip to the coordinator. Rebalancing of shards may
also add latency. This should be considered when designing the application specific
shard resolution, e.g. to avoid too fine grained shards.
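For example, a minimal sketch of a message extractor that maps many entities onto a bounded
number of shards (``Envelope`` and the shard count are illustrative, not part of the API)::

    import akka.cluster.sharding.ShardRegion;

    public class CounterMessageExtractor implements ShardRegion.MessageExtractor {
      // hypothetical message wrapper carrying the entity identifier
      public static final class Envelope {
        public final String entityId;
        public final Object payload;
        public Envelope(String entityId, Object payload) {
          this.entityId = entityId;
          this.payload = payload;
        }
      }

      // ballpark figure: roughly ten times the expected number of nodes
      private static final int NUMBER_OF_SHARDS = 100;

      @Override public String entityId(Object message) {
        return (message instanceof Envelope) ? ((Envelope) message).entityId : null;
      }

      @Override public Object entityMessage(Object message) {
        return ((Envelope) message).payload;
      }

      @Override public String shardId(Object message) {
        // a stable, coarse mapping avoids too fine grained shards
        return String.valueOf(
            Math.floorMod(((Envelope) message).entityId.hashCode(), NUMBER_OF_SHARDS));
      }
    }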
.. _cluster_sharding_ddata_java:
Distributed Data Mode
---------------------
@@ -197,19 +199,18 @@ This mode can be enabled by setting configuration property::
akka.cluster.sharding.state-store-mode = ddata
It is using the Distributed Data extension, which must be running on all nodes in the cluster.
Therefore you should add that extension to the configuration to make sure that it is started
on all nodes::
akka.extensions += "akka.cluster.ddata.DistributedData"
Cluster Sharding uses its own Distributed Data ``Replicator`` per node role. In this way you can use a subset of
all nodes for some entity types and another subset for other entity types. Each such replicator has a name
that contains the node role, and therefore the role configuration must be the same on all nodes in the
cluster, i.e. you can't change the roles when performing a rolling upgrade.
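For example, a sketch of starting an entity type only on nodes with a given role (the role name,
type name, ``Counter`` actor class and ``messageExtractor`` are illustrative)::

    import akka.actor.ActorRef;
    import akka.actor.Props;
    import akka.cluster.sharding.ClusterSharding;
    import akka.cluster.sharding.ClusterShardingSettings;

    // run this entity type only on nodes with the "counter" role; its
    // sharding state is managed by the replicator for that role
    ClusterShardingSettings settings =
        ClusterShardingSettings.create(system).withRole("counter");

    ActorRef counterRegion = ClusterSharding.get(system).start(
        "Counter", Props.create(Counter.class), settings, messageExtractor);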
The settings for Distributed Data are configured in the section
``akka.cluster.sharding.distributed-data``. It's not possible to have different
``distributed-data`` settings for different sharding entity types.
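For example, a sketch of overriding one of those settings (the value is illustrative; the
available keys mirror ``akka.cluster.distributed-data``)::

    akka.cluster.sharding.distributed-data {
      # how often the sharding replicators gossip their state
      gossip-interval = 2 s
    }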
You must explicitly add the ``akka-distributed-data-experimental`` dependency to your build if
you use this mode. It is possible to remove the ``akka-persistence`` dependency from a project if it
is not used in user code.
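For example, with Maven (the Scala binary version and the Akka version shown are placeholders;
use the ones matching your project)::

    <dependency>
      <groupId>com.typesafe.akka</groupId>
      <artifactId>akka-distributed-data-experimental_2.11</artifactId>
      <version>2.4.16</version>
    </dependency>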
When used together with ``Remember Entities``, shards will be recreated after rebalancing, but they will
not be recreated after a clean cluster start, as the Sharding Coordinator state is empty after a clean cluster
start when using ddata mode. When ``Remember Entities`` is ``on``, the Sharding Region always keeps data using persistence,
no matter how ``State Store Mode`` is set.
.. warning::
@@ -261,6 +262,13 @@ a ``Passivate`` message must be sent to the parent of the entity actor, otherwise
the entity will be automatically restarted after the entity restart backoff specified in
the configuration.
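For example, from inside the entity actor (a sketch using ``PoisonPill`` as the stop message)::

    import akka.actor.PoisonPill;
    import akka.cluster.sharding.ShardRegion;

    // ask the parent Shard to passivate this entity; the Shard buffers
    // incoming messages for it until the entity has stopped
    getContext().parent().tell(
        new ShardRegion.Passivate(PoisonPill.getInstance()), getSelf());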
When :ref:`cluster_sharding_ddata_java` is used the identifiers of the entities are
stored in :ref:`ddata_durable_java` of Distributed Data. You may want to change the
configuration of ``akka.cluster.sharding.distributed-data.durable.lmdb.dir``, since
the default directory contains the remote port of the actor system. If a dynamically
assigned port (0) is used, the directory will be different each time and the previously
stored data will not be loaded.
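For example (the path is illustrative)::

    # a fixed directory keeps the remembered entities readable across full
    # restarts, independent of the remote port of the actor system
    akka.cluster.sharding.distributed-data.durable.lmdb.dir = "/var/lib/myapp/sharding-ddata"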
When ``rememberEntities`` is set to false, a ``Shard`` will not automatically restart any entities
after a rebalance or recovering from a crash. Entities will only be started once the first message
for that entity has been received in the ``Shard``. Entities will not be restarted if they stop without


@@ -451,7 +451,9 @@ works with any type that has a registered Akka serializer. This is how such a
serializer could look for the ``TwoPhaseSet``:
.. includecode:: code/docs/ddata/japi/protobuf/TwoPhaseSetSerializer2.java#serializer
.. _ddata_durable_java:
Durable Storage
---------------
@@ -487,6 +489,12 @@ The location of the files for the data is configured with::
# a directory.
akka.cluster.distributed-data.lmdb.dir = "ddata"
When running in production you may want to configure the directory to a specific
path (alt 2), since the default directory contains the remote port of the
actor system to make the name unique. If a dynamically assigned
port (0) is used, the directory will be different each time and the previously
stored data will not be loaded.
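For example (the path is illustrative)::

    # alt 2: an explicit directory that is stable across restarts
    akka.cluster.distributed-data.lmdb.dir = "/var/lib/myapp/ddata"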
Making the data durable of course has a performance cost. By default, each update is flushed
to disk before the ``UpdateSuccess`` reply is sent. For better performance, but with the risk of losing
the last writes if the JVM crashes, you can enable write behind mode. Changes are then accumulated during