<html>
<head>
<title>Akka Distributed Data Samples with Scala</title>
</head>
<body>
<div>
<p>
This tutorial contains 5 samples illustrating how to use
<a href="http://doc.akka.io/docs/akka/2.4-SNAPSHOT/scala/distributed-data.html" target="_blank">Akka Distributed Data</a>.
</p>
<ul>
<li>Low Latency Voting Service</li>
<li>Highly Available Shopping Cart</li>
<li>Distributed Service Registry</li>
<li>Replicated Cache</li>
<li>Replicated Metrics</li>
</ul>
<p>
<b>Akka Distributed Data</b> is useful when you need to share data between nodes in an
Akka Cluster. The data is accessed with an actor providing a key-value store like API.
The keys are unique identifiers with type information of the data values. The values
are <i>Conflict Free Replicated Data Types</i> (CRDTs).
</p>
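<p>
As a minimal sketch of this API (the key id below is illustrative, not taken from the samples),
a typed key pairs an identifier with the type of the value it stores:
</p>
<pre><code>
import akka.cluster.ddata._

// a typed key: the identifier "counter" plus the value type GCounter
val CounterKey: Key[GCounter] = GCounterKey("counter")
</code></pre>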
<p>
All data entries are spread to all nodes, or nodes with a certain role, in the cluster
via direct replication and gossip-based dissemination. You have fine-grained control
of the consistency level for reads and writes.
</p>
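<p>
The available consistency levels form a spectrum from cheapest to strongest. A brief sketch
(the 3-second timeouts are illustrative):
</p>
<pre><code>
import scala.concurrent.duration._
import akka.cluster.ddata.Replicator._

// write consistency, from cheapest to strongest
val w1 = WriteLocal               // local replica only; gossip spreads it later
val w2 = WriteMajority(3.seconds) // at least a majority of the nodes
val w3 = WriteAll(3.seconds)      // all nodes (or all nodes with the role)

// read consistency, analogous
val r1 = ReadLocal
val r2 = ReadMajority(3.seconds)
val r3 = ReadAll(3.seconds)
</code></pre>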
<p>
The nature of CRDTs makes it possible to perform updates from any node without coordination.
Concurrent updates from different nodes will automatically be resolved by the monotonic
merge function, which all data types must provide. The state changes always converge.
Several useful data types for counters, sets, maps and registers are provided and
you can also implement your own custom data types.
</p>
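<p>
A hedged sketch of the merge semantics, using <code>GCounter</code> (obtaining two distinct
node identities normally requires two actor systems, so here they are passed in as parameters):
</p>
<pre><code>
import akka.cluster.Cluster
import akka.cluster.ddata.GCounter

def converge(nodeA: Cluster, nodeB: Cluster): Unit = {
  val a = GCounter.empty.increment(nodeA, 2) // update performed on node A
  val b = GCounter.empty.increment(nodeB, 3) // concurrent update on node B
  // merge is commutative, associative and idempotent,
  // so every node ends up with the same value
  assert(a.merge(b).value == b.merge(a).value)
  assert(a.merge(b).value == 5)
}
</code></pre>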
<p>
It is eventually consistent and geared toward providing high read and write availability
(partition tolerance), with low latency. Note that in an eventually consistent system a read may return an
out-of-date value.
</p>
<p>
Note that there are some
<a href="http://doc.akka.io/docs/akka/2.4-SNAPSHOT/scala/distributed-data.html#Limitations" target="_blank">Limitations</a>
that you should be aware of. For example, Akka Distributed Data is not intended for <i>Big Data</i>.
</p>
</div>
<div>
<h2>Low Latency Voting Service</h2>
<p>
Distributed Data is great for low latency services, since you can update or get data from the local replica
without immediate communication with other nodes.
</p>
<p>
Open <a href="#code/src/main/scala/sample/distributeddata/VotingService.scala" class="shortcut">VotingService.scala</a>.
</p>
<p>
<code>VotingService</code> is an actor for low latency counting of votes on several cluster nodes and aggregation
of the grand total number of votes. The actor is started on each cluster node. First it expects an
<code>Open</code> message on one or several nodes. After that the counting can begin. The open
signal is immediately replicated to all nodes with a boolean
<a href="http://doc.akka.io/docs/akka/2.4-SNAPSHOT/scala/distributed-data.html#Flags_and_Registers" target="_blank">Flag</a>.
Note the <code>WriteAll</code> consistency level:
</p>
<pre><code>
replicator ! Update(OpenedKey, Flag(), WriteAll(5.seconds))(_.switchOn)
</code></pre>
<p>
The actor subscribes to changes of the <code>OpenedKey</code>, so other instances of this actor,
including those on other nodes, are notified when the flag changes.
</p>
<pre><code>
replicator ! Subscribe(OpenedKey, self)
</code></pre>
<pre><code>
case c @ Changed(OpenedKey) if c.get(OpenedKey).enabled ⇒
</code></pre>
<p>
The counters are kept in a
<a href="http://doc.akka.io/docs/akka/2.4-SNAPSHOT/scala/distributed-data.html#Counters" target="_blank">PNCounterMap</a>
and updated with:
</p>
<pre><code>
val update = Update(CountersKey, PNCounterMap(), WriteLocal, request = Some(v)) {
  _.increment(participant, 1)
}
replicator ! update
</code></pre>
<p>
Incrementing the counter is very fast, since it only involves communication with the local
<code>Replicator</code> actor. Note <code>WriteLocal</code>. Those updates are also spread
to other nodes, but that is performed in the background.
</p>
<p>
The total number of votes is retrieved with:
</p>
<pre><code>
case GetVotes ⇒
  replicator ! Get(CountersKey, ReadAll(3.seconds), Some(GetVotesReq(sender())))

case g @ GetSuccess(CountersKey, Some(GetVotesReq(replyTo))) ⇒
  val data = g.get(CountersKey)
  replyTo ! Votes(data.entries, open)
</code></pre>
<p>
The multi-node test for the <code>VotingService</code> can be found in
<a href="#code/src/multi-jvm/scala/sample/distributeddata/VotingServiceSpec.scala" class="shortcut">VotingServiceSpec.scala</a>.
</p>
<p>
Read the
<a href="http://doc.akka.io/docs/akka/2.4-SNAPSHOT/scala/distributed-data.html#Using_the_Replicator" target="_blank">Using the Replicator</a>
documentation for more details of how to use <code>Get</code>, <code>Update</code>, and <code>Subscribe</code>.
</p>
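<p>
A hedged sketch of the replies an updating actor can expect from the <code>Replicator</code>
(the key id below is illustrative; see <code>VotingService.scala</code> for the real definitions):
</p>
<pre><code>
import akka.cluster.ddata.FlagKey
import akka.cluster.ddata.Replicator._

val OpenedKey = FlagKey("contestOpened") // illustrative key id

// what an updating actor can expect back from the Replicator
val updateReplies: PartialFunction[Any, Unit] = {
  case UpdateSuccess(OpenedKey, _) ⇒
    // the write reached the required number of nodes in time
  case UpdateTimeout(OpenedKey, _) ⇒
    // WriteAll could not be confirmed in time, but the update was applied
    // locally and will still be disseminated by gossip eventually
}
</code></pre>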
</div>
<div>
<h2>Highly Available Shopping Cart</h2>
<p>
Distributed Data is great for highly available services, since it is possible to perform
updates to the local node (or currently available nodes) during a network partition.
</p>
<p>
Open <a href="#code/src/main/scala/sample/distributeddata/ShoppingCart.scala" class="shortcut">ShoppingCart.scala</a>.
</p>
<p>
<code>ShoppingCart</code> is an actor that holds the selected items to buy for a user.
The actor instance for a specific user may be started wherever needed in the cluster, i.e. several
instances may be started on different nodes and used at the same time.
</p>
<p>
Each product in the cart is represented by a <code>LineItem</code> and all items in the cart
are collected in a <a href="http://doc.akka.io/docs/akka/2.4-SNAPSHOT/scala/distributed-data.html#Maps" target="_blank">LWWMap</a>.
</p>
<p>
The actor handles the commands <code>GetCart</code>, <code>AddItem</code> and <code>RemoveItem</code>.
To get the latest updates in case the same shopping cart is used from several nodes it uses the
consistency levels <code>ReadMajority</code> and <code>WriteMajority</code>, but that is only
done to reduce the risk of seeing old data. If such reads and writes cannot be completed due to a
network partition it falls back to reading/writing from the local replica (see <code>GetFailure</code>).
Local reads and writes will always be successful and when the network partition heals the updated
shopping carts will be disseminated by the
<a href="https://en.wikipedia.org/wiki/Gossip_protocol" target="_blank">gossip protocol</a>
and the <code>LWWMap</code> CRDTs are merged, i.e. it is a highly available shopping cart.
</p>
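<p>
A sketch of that fallback pattern (<code>DataKey</code> and <code>replicator</code> are the cart's
key and the local <code>Replicator</code>, passed in here so the fragment stands on its own; see
<code>ShoppingCart.scala</code> for the actual handling):
</p>
<pre><code>
import akka.actor.ActorRef
import akka.cluster.ddata.{ Key, LWWMap }
import akka.cluster.ddata.Replicator._

def readFallback(replicator: ActorRef, DataKey: Key[LWWMap[Any]]): PartialFunction[Any, Unit] = {
  case GetFailure(DataKey, Some(replyTo: ActorRef)) ⇒
    // the ReadMajority read could not be completed in time, most likely
    // because of a network partition; retry against the local replica,
    // which always succeeds
    replicator ! Get(DataKey, ReadLocal, Some(replyTo))
}
</code></pre>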
<p>
The multi-node test for the <code>ShoppingCart</code> can be found in
<a href="#code/src/multi-jvm/scala/sample/distributeddata/ShoppingCartSpec.scala" class="shortcut">ShoppingCartSpec.scala</a>.
</p>
<p>
Read the
<a href="http://doc.akka.io/docs/akka/2.4-SNAPSHOT/scala/distributed-data.html#Consistency" target="_blank">Consistency</a>
section in the documentation to understand the consistency considerations.
</p>
</div>
<div>
<h2>Distributed Service Registry</h2>
<p>
Have you ever had the need to look up actors by name in an Akka Cluster?
This example illustrates how you could implement such a registry. It is probably not
feature complete, but should be a good starting point.
</p>
<p>
Open <a href="#code/src/main/scala/sample/distributeddata/ServiceRegistry.scala" class="shortcut">ServiceRegistry.scala</a>.
</p>
<p>
<code>ServiceRegistry</code> is an actor that is started on each node in the cluster.
It supports two basic commands (sketched in code below the list):
</p>
<ul>
<li><code>Register</code> to bind an <code>ActorRef</code> to a name;
several actors can be bound to the same name</li>
<li><code>Lookup</code> to get the currently bound services for a given name</li>
</ul>
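<p>
A sketch of this command protocol (the field names are assumptions, not necessarily the
exact ones in <code>ServiceRegistry.scala</code>):
</p>
<pre><code>
import akka.actor.ActorRef

final case class Register(name: String, service: ActorRef)
final case class Lookup(name: String)
// reply to Lookup: all ActorRefs currently bound to the name
final case class Bindings(name: String, services: Set[ActorRef])
</code></pre>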
<p>
For each named service it is using an
<a href="http://doc.akka.io/docs/akka/2.4-SNAPSHOT/scala/distributed-data.html#Sets" target="_blank">ORSet</a>.
Here we are using top level <code>ORSet</code> entries. An alternative would have been to use an
<code>ORMultiMap</code> holding all services. That would have a disadvantage if we have many services.
When a data entry is changed the full state of that entry is replicated to other nodes, i.e. when you
update a map the whole map is replicated.
</p>
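<p>
A minimal sketch of the per-service key scheme (the id prefix is an assumption):
</p>
<pre><code>
import akka.actor.ActorRef
import akka.cluster.ddata.ORSetKey

// one top level ORSet entry per registered service name
def serviceKey(serviceName: String): ORSetKey[ActorRef] =
  ORSetKey[ActorRef]("service:" + serviceName)
</code></pre>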
<p>
The <code>ServiceRegistry</code> is subscribing to changes of a <code>GSet</code> where we add
the names of all services. It is also subscribing to all such service keys to get notifications when
actors are added to or removed from a named service.
</p>
<p>
The multi-node test for the <code>ServiceRegistry</code> can be found in
<a href="#code/src/multi-jvm/scala/sample/distributeddata/ServiceRegistrySpec.scala" class="shortcut">ServiceRegistrySpec.scala</a>.
</p>
</div>
<div>
<h2>Replicated Cache</h2>
<p>
This example illustrates a simple key-value cache.
</p>
<p>
Open <a href="#code/src/main/scala/sample/distributeddata/ReplicatedCache.scala" class="shortcut">ReplicatedCache.scala</a>.
</p>
<p>
<code>ReplicatedCache</code> is an actor that is started on each node in the cluster.
It supports three commands: <code>PutInCache</code>, <code>GetFromCache</code> and <code>Evict</code>.
</p>
<p>
It splits up the key space into 100 top level keys, each holding a <code>LWWMap</code>.
When a data entry is changed the full state of that entry is replicated to other nodes, i.e. when you
update a map the whole map is replicated. Therefore, instead of using one <code>ORMap</code> with 1000 elements it
is more efficient to split it up into 100 top level <code>ORMap</code> entries with 10 elements each. Top level
entries are replicated individually, which has the trade-off that different entries may not be
replicated at the same time and you may see inconsistencies between related entries.
Separate top level entries cannot be updated atomically together.
</p>
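<p>
One straightforward way to implement the split described above, as a sketch (the exact
hashing scheme in <code>ReplicatedCache.scala</code> may differ):
</p>
<pre><code>
import akka.cluster.ddata.LWWMapKey

// route every cache key to one of 100 top level LWWMap entries
def dataKey(entryKey: String): LWWMapKey[Any] =
  LWWMapKey("cache-" + math.abs(entryKey.hashCode) % 100)
</code></pre>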
<p>
The multi-node test for the <code>ReplicatedCache</code> can be found in
<a href="#code/src/multi-jvm/scala/sample/distributeddata/ReplicatedCacheSpec.scala" class="shortcut">ReplicatedCacheSpec.scala</a>.
</p>
</div>
<div>
<h2>Replicated Metrics</h2>
<p>
This example illustrates how to spread metrics data to all nodes in an Akka cluster.
</p>
<p>
Open <a href="#code/src/main/scala/sample/distributeddata/ReplicatedMetrics.scala" class="shortcut">ReplicatedMetrics.scala</a>.
</p>
<p>
<code>ReplicatedMetrics</code> is an actor that is started on each node in the cluster.
Periodically it collects some metrics, in this case used and max heap size.
Each metrics type is stored in a <code>LWWMap</code> where the key in the map is the address of
the node. The values are disseminated to other nodes with the gossip protocol.
</p>
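<p>
A sketch of how such heap metrics can be collected and keyed (the key ids are illustrative):
</p>
<pre><code>
import java.lang.management.ManagementFactory
import akka.cluster.ddata.LWWMapKey

// one LWWMap per metrics type, keyed inside the map by node address
val UsedHeapKey = LWWMapKey[Long]("usedHeap")
val MaxHeapKey  = LWWMapKey[Long]("maxHeap")

// the JVM heap readings that go into the maps
val heap = ManagementFactory.getMemoryMXBean.getHeapMemoryUsage
val used = heap.getUsed // bytes currently used
val max  = heap.getMax  // configured maximum, -1 if undefined
</code></pre>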
<p>
The multi-node test for the <code>ReplicatedMetrics</code> can be found in
<a href="#code/src/multi-jvm/scala/sample/distributeddata/ReplicatedMetricsSpec.scala" class="shortcut">ReplicatedMetricsSpec.scala</a>.
</p>
</div>
</body>
</html>