Add support for delta-CRDT, #21875
* delta GCounter and PNCounter * first stab at delta propagation protocol * send delta in the direct write * possibility to turn off delta propagation * tests * protobuf serializer for DeltaPropagation * documentation
This commit is contained in:
parent
2a9fa234a1
commit
3e7ffd6b96
18 changed files with 2408 additions and 98 deletions
|
|
@ -38,7 +38,9 @@ The ``akka.cluster.ddata.Replicator`` actor provides the API for interacting wit
|
|||
The ``Replicator`` actor must be started on each node in the cluster, or group of nodes tagged
|
||||
with a specific role. It communicates with other ``Replicator`` instances with the same path
|
||||
(without address) that are running on other nodes . For convenience it can be used with the
|
||||
``akka.cluster.ddata.DistributedData`` extension.
|
||||
``akka.cluster.ddata.DistributedData`` extension but it can also be started as an ordinary
|
||||
actor using the ``Replicator.props``. If it is started as an ordinary actor it is important
|
||||
that it is given the same name, started on same path, on all nodes.
|
||||
|
||||
Cluster members with status :ref:`WeaklyUp <weakly_up_java>`, if that feature is enabled,
|
||||
will participate in Distributed Data. This means that the data will be replicated to the
|
||||
|
|
@ -256,14 +258,38 @@ Subscribers will receive ``Replicator.DataDeleted``.
|
|||
where frequent adds and removes are required, you should use a fixed number of top-level data
|
||||
types that support both updates and removals, for example ``ORMap`` or ``ORSet``.
|
||||
|
||||
.. _delta_crdt_java:
|
||||
|
||||
delta-CRDT
|
||||
----------
|
||||
|
||||
`Delta State Replicated Data Types <http://arxiv.org/abs/1603.01529>`_
|
||||
are supported. delta-CRDT is a way to reduce the need for sending the full state
|
||||
for updates. For example adding element ``'c'`` and ``'d'`` to set ``{'a', 'b'}`` would
|
||||
result in sending the delta ``{'c', 'd'}`` and merge that with the state on the
|
||||
receiving side, resulting in set ``{'a', 'b', 'c', 'd'}``.
|
||||
|
||||
Current protocol for replicating the deltas does not support causal consistency.
|
||||
It is only eventually consistent. This means that if elements ``'c'`` and ``'d'`` are
|
||||
added in two separate `Update` operations these deltas may occasionally be propagated
|
||||
to nodes in different order than the causal order of the updates. For this example it
|
||||
can result in that set ``{'a', 'b', 'd'}`` can be seen before element 'c' is seen. Eventually
|
||||
it will be ``{'a', 'b', 'c', 'd'}``. If causal consistency is needed the delta propagation
|
||||
should be disabled with configuration property
|
||||
``akka.cluster.distributed-data.delta-crdt.enabled=off``.
|
||||
|
||||
Note that the full state is occasionally also replicated for delta-CRDTs, for example when
|
||||
new nodes are added to the cluster or when deltas could not be propagated because
|
||||
of network partitions or similar problems.
|
||||
|
||||
Data Types
|
||||
==========
|
||||
|
||||
The data types must be convergent (stateful) CRDTs and implement the ``ReplicatedData`` trait,
|
||||
i.e. they provide a monotonic merge function and the state changes always converge.
|
||||
|
||||
You can use your own custom ``ReplicatedData`` types, and several types are provided
|
||||
by this package, such as:
|
||||
You can use your own custom ``AbstractReplicatedData`` or ``AbstractDeltaReplicatedData`` types,
|
||||
and several types are provided by this package, such as:
|
||||
|
||||
* Counters: ``GCounter``, ``PNCounter``
|
||||
* Sets: ``GSet``, ``ORSet``
|
||||
|
|
@ -287,6 +313,8 @@ The value of the counter is the value of the P counter minus the value of the N
|
|||
|
||||
.. includecode:: code/docs/ddata/DistributedDataDocTest.java#pncounter
|
||||
|
||||
``GCounter`` and ``PNCounter`` have support for :ref:`delta_crdt_java`.
|
||||
|
||||
Several related counters can be managed in a map with the ``PNCounterMap`` data type.
|
||||
When the counters are placed in a ``PNCounterMap`` as opposed to placing them as separate top level
|
||||
values they are guaranteed to be replicated together as one unit, which is sometimes necessary for
|
||||
|
|
@ -406,6 +434,8 @@ removed, but never added again thereafter.
|
|||
|
||||
Data types should be immutable, i.e. "modifying" methods should return a new instance.
|
||||
|
||||
Implement the additional methods of ``AbstractDeltaReplicatedData`` if it has support for delta-CRDT replication.
|
||||
|
||||
Serialization
|
||||
^^^^^^^^^^^^^
|
||||
|
||||
|
|
@ -563,11 +593,11 @@ be able to improve this if needed, but the design is still not intended for bill
|
|||
|
||||
All data is held in memory, which is another reason why it is not intended for *Big Data*.
|
||||
|
||||
When a data entry is changed the full state of that entry is replicated to other nodes. For example,
|
||||
if you add one element to a Set with 100 existing elements, all 101 elements are transferred to
|
||||
other nodes. This means that you cannot have too large data entries, because then the remote message
|
||||
size will be too large. We might be able to make this more efficient by implementing
|
||||
`Efficient State-based CRDTs by Delta-Mutation <http://gsd.di.uminho.pt/members/cbm/ps/delta-crdt-draft16may2014.pdf>`_.
|
||||
When a data entry is changed the full state of that entry may be replicated to other nodes
|
||||
if it doesn't support :ref:`delta_crdt_java`. The full state is also replicated for delta-CRDTs,
|
||||
for example when new nodes are added to the cluster or when deltas could not be propagated because
|
||||
of network partitions or similar problems. This means that you cannot have too large
|
||||
data entries, because then the remote message size will be too large.
|
||||
|
||||
Learn More about CRDTs
|
||||
======================
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue