.. _distributed_data_scala:

##################
 Distributed Data
##################

*Akka Distributed Data* is useful when you need to share data between nodes in an
Akka Cluster. The data is accessed with an actor providing a key-value store like API.
The keys are unique identifiers with type information of the data values. The values
are *Conflict Free Replicated Data Types* (CRDTs).

All data entries are spread to all nodes, or nodes with a certain role, in the cluster
via direct replication and gossip based dissemination. You have fine grained control
of the consistency level for reads and writes.

The nature of CRDTs makes it possible to perform updates from any node without coordination.
Concurrent updates from different nodes will automatically be resolved by the monotonic
merge function, which all data types must provide. The state changes always converge.
Several useful data types for counters, sets, maps and registers are provided and
you can also implement your own custom data types.

It is eventually consistent and geared toward providing high read and write availability
(partition tolerance), with low latency. Note that in an eventually consistent system a read may return an
out-of-date value.

.. warning::

  This module is marked as **“experimental”** as of its introduction in Akka 2.4.0. We will continue to
  improve this API based on our users’ feedback, which implies that while we try to keep incompatible
  changes to a minimum the binary compatibility guarantee for maintenance releases does not apply to the
  contents of the ``akka.cluster.ddata`` package.

Using the Replicator
====================

The ``akka.cluster.ddata.Replicator`` actor provides the API for interacting with the data.
The ``Replicator`` actor must be started on each node in the cluster, or group of nodes tagged
with a specific role. It communicates with other ``Replicator`` instances with the same path
(without address) that are running on other nodes. For convenience it can be used with the
``akka.cluster.ddata.DistributedData`` extension.
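
A minimal sketch of obtaining the ``Replicator``'s ``ActorRef`` through the extension and
defining a typed key (the ``system`` value and the ``"planets"`` key name are illustrative
assumptions)::

  import akka.actor.ActorSystem
  import akka.cluster.ddata.{ DistributedData, ORSetKey }

  val system = ActorSystem("ClusterSystem")

  // ActorRef of the Replicator started by the DistributedData extension
  val replicator = DistributedData(system).replicator

  // a typed key identifying one top level ORSet[String] entry
  val PlanetsKey = ORSetKey[String]("planets")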

Cluster members with status :ref:`WeaklyUp <weakly_up_scala>`, if that feature is enabled,
will currently not participate in Distributed Data, but that is something that should be possible to
add in a future release.

Below is an example of an actor that schedules tick messages to itself and for each tick
adds or removes elements from an ``ORSet`` (observed-remove set). It also subscribes to
changes of this set.

.. includecode:: code/docs/ddata/DistributedDataDocSpec.scala#data-bot

.. _replicator_update_scala:

Update
------

To modify and replicate a data value you send a ``Replicator.Update`` message to the local
``Replicator``.

The current data value for the ``key`` of the ``Update`` is passed as parameter to the ``modify``
function of the ``Update``. The function is supposed to return the new value of the data, which
will then be replicated according to the given consistency level.

The ``modify`` function is called by the ``Replicator`` actor and must therefore be a pure
function that only uses the data parameter and stable fields from the enclosing scope. It must
for example not access the ``sender()`` reference of an enclosing actor.

``Update`` is intended to only be sent from an actor running in the same local ``ActorSystem`` as
the ``Replicator``, because the ``modify`` function is typically not serializable.

You supply a write consistency level which has the following meaning:

* ``WriteLocal`` the value will immediately only be written to the local replica,
  and later disseminated with gossip
* ``WriteTo(n)`` the value will immediately be written to at least ``n`` replicas,
  including the local replica
* ``WriteMajority`` the value will immediately be written to a majority of replicas, i.e.
  at least **N/2 + 1** replicas, where N is the number of nodes in the cluster
  (or cluster role group)
* ``WriteAll`` the value will immediately be written to all nodes in the cluster
  (or all nodes in the cluster role group)

.. includecode:: code/docs/ddata/DistributedDataDocSpec.scala#update

As a reply to the ``Update`` a ``Replicator.UpdateSuccess`` is sent to the sender of the
``Update`` if the value was successfully replicated according to the supplied consistency
level within the supplied timeout. Otherwise a ``Replicator.UpdateFailure`` subclass is
sent back. Note that a ``Replicator.UpdateTimeout`` reply does not mean that the update completely failed
or was rolled back. It may still have been replicated to some nodes, and will eventually
be replicated to all nodes with the gossip protocol.

.. includecode:: code/docs/ddata/DistributedDataDocSpec.scala#update-response1

.. includecode:: code/docs/ddata/DistributedDataDocSpec.scala#update-response2

You will always see your own writes. For example if you send two ``Update`` messages
changing the value of the same ``key``, the ``modify`` function of the second message will
see the change that was performed by the first ``Update`` message.

In the ``Update`` message you can pass an optional request context, which the ``Replicator``
does not care about, but which is included in the reply messages. This is a convenient
way to pass contextual information (e.g. original sender) without having to use ``ask``
or maintain local correlation data structures.

.. includecode:: code/docs/ddata/DistributedDataDocSpec.scala#update-request-context

.. _replicator_get_scala:

Get
---

To retrieve the current value of a data entry you send a ``Replicator.Get`` message to the
``Replicator``. You supply a consistency level which has the following meaning:

* ``ReadLocal`` the value will only be read from the local replica
* ``ReadFrom(n)`` the value will be read and merged from ``n`` replicas,
  including the local replica
* ``ReadMajority`` the value will be read and merged from a majority of replicas, i.e.
  at least **N/2 + 1** replicas, where N is the number of nodes in the cluster
  (or cluster role group)
* ``ReadAll`` the value will be read and merged from all nodes in the cluster
  (or all nodes in the cluster role group)

.. includecode:: code/docs/ddata/DistributedDataDocSpec.scala#get

As a reply to the ``Get`` a ``Replicator.GetSuccess`` is sent to the sender of the
``Get`` if the value was successfully retrieved according to the supplied consistency
level within the supplied timeout. Otherwise a ``Replicator.GetFailure`` is sent.
If the key does not exist the reply will be ``Replicator.NotFound``.

.. includecode:: code/docs/ddata/DistributedDataDocSpec.scala#get-response1

.. includecode:: code/docs/ddata/DistributedDataDocSpec.scala#get-response2

You will always read your own writes. For example if you send an ``Update`` message
followed by a ``Get`` of the same ``key`` the ``Get`` will retrieve the change that was
performed by the preceding ``Update`` message. However, the order of the reply messages is
not defined, i.e. in the previous example you may receive the ``GetSuccess`` before
the ``UpdateSuccess``.

In the ``Get`` message you can pass an optional request context in the same way as for the
``Update`` message, described above. For example the original sender can be passed and replied
to after receiving and transforming ``GetSuccess``.

.. includecode:: code/docs/ddata/DistributedDataDocSpec.scala#get-request-context

Consistency
-----------

The consistency level that is supplied in the :ref:`replicator_update_scala` and :ref:`replicator_get_scala`
specifies per request how many replicas must respond successfully to a write and read request.

For low latency reads you use ``ReadLocal``, with the risk of retrieving stale data, i.e. updates
from other nodes might not be visible yet.

When using ``WriteLocal`` the update is only written to the local replica and then disseminated
in the background with the gossip protocol, which can take a few seconds to spread to all nodes.

``WriteAll`` and ``ReadAll`` are the strongest consistency levels, but also the slowest and have the
lowest availability. For example, it is enough that one node is unavailable for a ``Get`` request
to fail and not return the value.

If consistency is important, you can ensure that a read always reflects the most recent
write by using the following formula::

  (nodes_written + nodes_read) > N

where N is the total number of nodes in the cluster, or the number of nodes with the role that is
used for the ``Replicator``.

For example, in a 7 node cluster these consistency properties are achieved by writing to 4 nodes
and reading from 4 nodes, or writing to 5 nodes and reading from 3 nodes.
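
For instance, the first combination above (write to 4 nodes, read from 4 nodes) can be
expressed with the ``WriteTo`` and ``ReadFrom`` consistency levels. This is a minimal sketch;
the timeout values are arbitrary assumptions::

  import scala.concurrent.duration._
  import akka.cluster.ddata.Replicator.{ ReadFrom, WriteTo }

  // 4 + 4 > 7, so a successful read overlaps with the latest successful write
  val writeFour = WriteTo(n = 4, timeout = 3.seconds)
  val readFour = ReadFrom(n = 4, timeout = 3.seconds)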

By combining ``WriteMajority`` and ``ReadMajority`` levels a read always reflects the most recent write.
The ``Replicator`` writes and reads to a majority of replicas, i.e. **N / 2 + 1**. For example,
in a 5 node cluster it writes to 3 nodes and reads from 3 nodes. In a 6 node cluster it writes
to 4 nodes and reads from 4 nodes.

Here is an example of using ``WriteMajority`` and ``ReadMajority``:

.. includecode:: ../../../akka-samples/akka-sample-distributed-data-scala/src/main/scala/sample/distributeddata/ShoppingCart.scala#read-write-majority

.. includecode:: ../../../akka-samples/akka-sample-distributed-data-scala/src/main/scala/sample/distributeddata/ShoppingCart.scala#get-cart

.. includecode:: ../../../akka-samples/akka-sample-distributed-data-scala/src/main/scala/sample/distributeddata/ShoppingCart.scala#add-item

In some rare cases, when performing an ``Update``, it is necessary to first try to fetch the latest data from
other nodes. That can be done by first sending a ``Get`` with ``ReadMajority`` and then continuing with
the ``Update`` when the ``GetSuccess``, ``GetFailure`` or ``NotFound`` reply is received. This might be
needed when you need to base a decision on the latest information or when removing entries from an ``ORSet``
or ``ORMap``. If an entry is added to an ``ORSet`` or ``ORMap`` from one node and removed from another
node the entry will only be removed if the added entry is visible on the node where the removal is
performed (hence the name observed-remove set).

The following example illustrates how to do that:

.. includecode:: ../../../akka-samples/akka-sample-distributed-data-scala/src/main/scala/sample/distributeddata/ShoppingCart.scala#remove-item

.. warning::

  *Caveat:* Even if you use ``WriteMajority`` and ``ReadMajority`` there is a small risk that you may
  read stale data if the cluster membership has changed between the ``Update`` and the ``Get``.
  For example, in a cluster of 5 nodes when you ``Update`` and that change is written to 3 nodes:
  n1, n2, n3. Then 2 more nodes are added and a ``Get`` request is reading from 4 nodes, which
  happen to be n4, n5, n6, n7, i.e. the value on n1, n2, n3 is not seen in the response of the
  ``Get`` request.

Subscribe
---------

You may also register interest in change notifications by sending a ``Replicator.Subscribe``
message to the ``Replicator``. It will send ``Replicator.Changed`` messages to the registered
subscriber when the data for the subscribed key is updated. Subscribers will be notified
periodically with the configured ``notify-subscribers-interval``, and it is also possible to
send an explicit ``Replicator.FlushChanges`` message to the ``Replicator`` to notify the subscribers
immediately.

A subscriber is automatically removed if it is terminated. A subscriber can
also be deregistered with the ``Replicator.Unsubscribe`` message.

.. includecode:: code/docs/ddata/DistributedDataDocSpec.scala#subscribe

Delete
------

A data entry can be deleted by sending a ``Replicator.Delete`` message to the local
``Replicator``. As a reply to the ``Delete`` a ``Replicator.DeleteSuccess`` is sent to
the sender of the ``Delete`` if the value was successfully deleted according to the supplied
consistency level within the supplied timeout. Otherwise a ``Replicator.ReplicationDeleteFailure``
is sent. Note that ``ReplicationDeleteFailure`` does not mean that the delete completely failed or
was rolled back. It may still have been replicated to some nodes, and may eventually be replicated
to all nodes.

A deleted key cannot be reused, but it is still recommended to delete unused
data entries because that reduces the replication overhead when new nodes join the cluster.
Subsequent ``Delete``, ``Update`` and ``Get`` requests will be answered with ``Replicator.DataDeleted``.
Subscribers will receive ``Replicator.DataDeleted``.

.. includecode:: code/docs/ddata/DistributedDataDocSpec.scala#delete

Data Types
==========

The data types must be convergent (stateful) CRDTs and implement the ``ReplicatedData`` trait,
i.e. they provide a monotonic merge function and the state changes always converge.

You can use your own custom ``ReplicatedData`` types, and several types are provided
by this package, such as:

* Counters: ``GCounter``, ``PNCounter``
* Sets: ``GSet``, ``ORSet``
* Maps: ``ORMap``, ``ORMultiMap``, ``LWWMap``, ``PNCounterMap``
* Registers: ``LWWRegister``, ``Flag``

Counters
--------

``GCounter`` is a "grow only counter". It only supports increments, no decrements.

It works in a similar way to a vector clock. It keeps track of one counter per node and the total
value is the sum of these counters. The ``merge`` is implemented by taking the maximum count for
each node.
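
A minimal sketch of incrementing a ``GCounter`` (assuming an ``ActorSystem`` named ``system``
that is part of the cluster, and that the ``+`` operator takes the local ``Cluster`` as an
implicit parameter, as in this Akka version)::

  import akka.cluster.Cluster
  import akka.cluster.ddata.GCounter

  implicit val node = Cluster(system)

  val c0 = GCounter.empty
  val c1 = c0 + 1    // increment this node's entry by 1
  val c2 = c1 + 7    // and by 7 more
  println(c2.value)  // 8, the sum of all per-node counters

  // merging with an older copy is harmless, because merge keeps
  // the maximum count per node
  println(c2.merge(c1).value) // still 8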

If you need both increments and decrements you can use the ``PNCounter`` (positive/negative counter).

It tracks the increments (P) separately from the decrements (N). Both P and N are represented
as two internal ``GCounter`` instances. Merge is handled by merging the internal P and N counters.
The value of the counter is the value of the P counter minus the value of the N counter.

.. includecode:: code/docs/ddata/DistributedDataDocSpec.scala#pncounter

Several related counters can be managed in a map with the ``PNCounterMap`` data type.
When the counters are placed in a ``PNCounterMap``, as opposed to placing them as separate top level
values, they are guaranteed to be replicated together as one unit, which is sometimes necessary for
related data.

.. includecode:: code/docs/ddata/DistributedDataDocSpec.scala#pncountermap

Sets
----

If you only need to add elements to a set and not remove elements the ``GSet`` (grow-only set) is
the data type to use. The elements can be any type of values that can be serialized.
Merge is simply the union of the two sets.

.. includecode:: code/docs/ddata/DistributedDataDocSpec.scala#gset

If you need add and remove operations you should use the ``ORSet`` (observed-remove set).
Elements can be added and removed any number of times. If an element is concurrently added and
removed, the add will win. You cannot remove an element that you have not seen.

The ``ORSet`` has a version vector that is incremented when an element is added to the set.
The version for the node that added the element is also tracked for each element in a
so-called "birth dot". The version vector and the dots are used by the ``merge`` function to
track causality of the operations and resolve concurrent updates.

.. includecode:: code/docs/ddata/DistributedDataDocSpec.scala#orset

Maps
----

``ORMap`` (observed-remove map) is a map with ``String`` keys and the values are ``ReplicatedData``
types themselves. It supports add, remove and delete any number of times for a map entry.

If an entry is concurrently added and removed, the add will win. You cannot remove an entry that
you have not seen. These are the same semantics as for the ``ORSet``.

If an entry is concurrently updated to different values the values will be merged, hence the
requirement that the values must be ``ReplicatedData`` types.

It is rather inconvenient to use the ``ORMap`` directly since it does not expose specific types
of the values. The ``ORMap`` is intended as a low level tool for building more specific maps,
such as the following specialized maps.

``ORMultiMap`` (observed-remove multi-map) is a multi-map implementation that wraps an
``ORMap`` with an ``ORSet`` for the map's values.

``PNCounterMap`` (positive negative counter map) is a map of named counters. It is a specialized
``ORMap`` with ``PNCounter`` values.

``LWWMap`` (last writer wins map) is a specialized ``ORMap`` with ``LWWRegister`` (last writer wins register)
values.

.. includecode:: code/docs/ddata/DistributedDataDocSpec.scala#ormultimap

Note that ``LWWRegister``, and therefore ``LWWMap``, relies on synchronized clocks and should only be used
when the choice of value is not important for concurrent updates occurring within the clock skew.

Instead of using timestamps based on ``System.currentTimeMillis()`` time it is possible to
use a timestamp value based on something else, for example an increasing version number
from a database record that is used for optimistic concurrency control.

When a data entry is changed the full state of that entry is replicated to other nodes, i.e.
when you update a map the whole map is replicated. Therefore, instead of using one ``ORMap``
with 1000 elements it is more efficient to split it up into 10 top level ``ORMap`` entries
with 100 elements each. Top level entries are replicated individually, which has the
trade-off that different entries may not be replicated at the same time and you may see
inconsistencies between related entries. Separate top level entries cannot be updated atomically
together. A sketch of this idea is shown below.
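
A minimal sketch of splitting data over several top level entries (the key names and the
routing scheme are illustrative assumptions, and ``ORSet`` entries are used here for brevity)::

  import akka.cluster.ddata.ORSetKey

  // ten top level entries, each replicated independently
  val shardKeys: Vector[ORSetKey[String]] =
    (0 until 10).map(i => ORSetKey[String](s"users-shard-$i")).toVector

  // route an element to one of the ten entries
  def keyFor(userId: String): ORSetKey[String] =
    shardKeys((userId.hashCode % 10 + 10) % 10)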

Flags and Registers
-------------------

``Flag`` is a data type for a boolean value that is initialized to ``false`` and can be switched
to ``true``. Thereafter it cannot be changed. ``true`` wins over ``false`` in merge.

.. includecode:: code/docs/ddata/DistributedDataDocSpec.scala#flag

``LWWRegister`` (last writer wins register) can hold any (serializable) value.

Merge of a ``LWWRegister`` takes the register with the highest timestamp. Note that this
relies on synchronized clocks. ``LWWRegister`` should only be used when the choice of
value is not important for concurrent updates occurring within the clock skew.

Merge takes the register updated by the node with the lowest address (``UniqueAddress`` is ordered)
if the timestamps are exactly the same.

.. includecode:: code/docs/ddata/DistributedDataDocSpec.scala#lwwregister

Instead of using timestamps based on ``System.currentTimeMillis()`` time it is possible to
use a timestamp value based on something else, for example an increasing version number
from a database record that is used for optimistic concurrency control.

.. includecode:: code/docs/ddata/DistributedDataDocSpec.scala#lwwregister-custom-clock

For first-write-wins semantics you can use ``LWWRegister#reverseClock`` instead of
``LWWRegister#defaultClock``.

Custom Data Type
----------------

You can rather easily implement your own data types. The only requirement is that they implement
the ``merge`` function of the ``ReplicatedData`` trait.
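
The trait is small; this is a simplified sketch of its shape (see the API documentation for the
exact definition)::

  trait ReplicatedData {
    // the concrete type of the implementing data type
    type T <: ReplicatedData

    // monotonic merge of this replica with another replica of the same type
    def merge(that: T): T
  }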

A nice property of stateful CRDTs is that they typically compose nicely, i.e. you can combine several
smaller data types to build richer data structures. For example, the ``PNCounter`` is composed of
two internal ``GCounter`` instances to keep track of increments and decrements separately.

Here is a simple implementation of a custom ``TwoPhaseSet`` that uses two internal ``GSet`` types
to keep track of additions and removals. A ``TwoPhaseSet`` is a set where an element may be added and
removed, but never added again thereafter.

.. includecode:: code/docs/ddata/TwoPhaseSet.scala#twophaseset

Data types should be immutable, i.e. "modifying" methods should return a new instance.

Serialization
^^^^^^^^^^^^^

The data types must be serializable with an :ref:`Akka Serializer <serialization-scala>`.
It is highly recommended that you implement efficient serialization with Protobuf or similar
for your custom data types. The built in data types are marked with ``ReplicatedDataSerialization``
and serialized with ``akka.cluster.ddata.protobuf.ReplicatedDataSerializer``.

Serialization of the data types is used in remote messages and also for creating message
digests (SHA-1) to detect changes. Therefore it is important that the serialization is efficient
and produces the same bytes for the same content. For example sets and maps should be sorted
deterministically in the serialization.

This is a protobuf representation of the above ``TwoPhaseSet``:

.. includecode:: ../../src/main/protobuf/TwoPhaseSetMessages.proto#twophaseset

The serializer for the ``TwoPhaseSet``:

.. includecode:: code/docs/ddata/protobuf/TwoPhaseSetSerializer.scala#serializer

Note that the elements of the sets are sorted so the SHA-1 digests are the same
for the same elements.

You register the serializer in configuration:

.. includecode:: code/docs/ddata/DistributedDataDocSpec.scala#serializer-config

Using compression can sometimes be a good idea to reduce the data size. Gzip compression is
provided by the ``akka.cluster.ddata.protobuf.SerializationSupport`` trait:

.. includecode:: code/docs/ddata/protobuf/TwoPhaseSetSerializer.scala#compression

The two embedded ``GSet`` instances can be serialized as illustrated above, but in general when composing
new data types from the existing built in types it is better to make use of the existing
serializer for those types. This can be done by declaring those as bytes fields in protobuf:

.. includecode:: ../../src/main/protobuf/TwoPhaseSetMessages.proto#twophaseset2

and using the methods ``otherMessageToProto`` and ``otherMessageFromBinary`` that are provided
by the ``SerializationSupport`` trait to serialize and deserialize the ``GSet`` instances. This
works with any type that has a registered Akka serializer. This is how such a serializer would
look for the ``TwoPhaseSet``:

.. includecode:: code/docs/ddata/protobuf/TwoPhaseSetSerializer2.scala#serializer

CRDT Garbage
------------

One thing that can be problematic with CRDTs is that some data types accumulate history (garbage).
For example a ``GCounter`` keeps track of one counter per node. If a ``GCounter`` has been updated
from one node it will associate the identifier of that node forever. That can become a problem
for long running systems with many cluster nodes being added and removed. To solve this problem
the ``Replicator`` performs pruning of data associated with nodes that have been removed from the
cluster. Data types that need pruning have to implement the ``RemovedNodePruning`` trait.

Samples
=======

Several interesting samples are included and described in the `Lightbend Activator <http://www.lightbend.com/platform/getstarted>`_
tutorial named `Akka Distributed Data Samples with Scala <http://www.lightbend.com/activator/template/akka-sample-distributed-data-scala>`_.

* Low Latency Voting Service
* Highly Available Shopping Cart
* Distributed Service Registry
* Replicated Cache
* Replicated Metrics

Limitations
===========

There are some limitations that you should be aware of.

CRDTs cannot be used for all types of problems, and eventual consistency does not fit
all domains. Sometimes you need strong consistency.

It is not intended for *Big Data*. The number of top level entries should not exceed 100000.
When a new node is added to the cluster all these entries are transferred (gossiped) to the
new node. The entries are split up in chunks and all existing nodes collaborate in the gossip,
but it will take a while (tens of seconds) to transfer all entries and this means that you
cannot have too many top level entries. The current recommended limit is 100000. We will
be able to improve this if needed, but the design is still not intended for billions of entries.

All data is held in memory, which is another reason why it is not intended for *Big Data*.

When a data entry is changed the full state of that entry is replicated to other nodes. For example,
if you add one element to a Set with 100 existing elements, all 101 elements are transferred to
other nodes. This means that you cannot have too large data entries, because then the remote message
size will be too large. We might be able to make this more efficient by implementing
`Efficient State-based CRDTs by Delta-Mutation <http://gsd.di.uminho.pt/members/cbm/ps/delta-crdt-draft16may2014.pdf>`_.

The data is only kept in memory. It is redundant since it is replicated to other nodes
in the cluster, but if you stop all nodes the data is lost, unless you have saved it
elsewhere. Making the data durable is a possible future feature, but even if we implement that
it is not intended to be a full featured database.

Learn More about CRDTs
======================

* `The Final Causal Frontier <http://www.ustream.tv/recorded/61448875>`_
  talk by Sean Cribbs
* `Eventually Consistent Data Structures <https://vimeo.com/43903960>`_
  talk by Sean Cribbs
* `Strong Eventual Consistency and Conflict-free Replicated Data Types <http://research.microsoft.com/apps/video/default.aspx?id=153540&r=1>`_
  talk by Mark Shapiro
* `A comprehensive study of Convergent and Commutative Replicated Data Types <http://hal.upmc.fr/file/index/docid/555588/filename/techreport.pdf>`_
  paper by Mark Shapiro et al.

Dependencies
============

To use Distributed Data you must add the following dependency to your project.

sbt::

    "com.typesafe.akka" %% "akka-distributed-data-experimental" % "@version@" @crossString@

maven::

  <dependency>
    <groupId>com.typesafe.akka</groupId>
    <artifactId>akka-distributed-data-experimental_@binVersion@</artifactId>
    <version>@version@</version>
  </dependency>

Configuration
=============

The ``DistributedData`` extension can be configured with the following properties:

.. includecode:: ../../../akka-distributed-data/src/main/resources/reference.conf#distributed-data