=doc minor fixes in stream quickstart
This commit is contained in:
parent
b4f367e46a
commit
db353504ab
2 changed files with 34 additions and 37 deletions
@@ -10,7 +10,7 @@ We will also consider the problem inherent to all non-blocking streaming
 solutions: *"What if the subscriber is too slow to consume the live stream of
 data?"*. Traditionally the solution is often to buffer the elements, but this
 can—and usually will—cause eventual buffer overflows and instability of such
 systems. Instead Akka Streams depend on internal backpressure signals that
 allow control over what should happen in such scenarios.

 Here's the data model we'll be working with throughout the quickstart examples:

@@ -34,9 +34,9 @@ which will be responsible for materializing and running the streams we are about

 The :class:`ActorMaterializer` can optionally take :class:`ActorMaterializerSettings` which can be used to define
 materialization properties, such as default buffer sizes (see also :ref:`stream-buffers-scala`), the dispatcher to
-be used by the pipeline etc. These can be overridden ``withAttributes`` on :class:`Flow`, :class:`Source`, :class:`Sink` and :class:`Graph`.
+be used by the pipeline etc. These can be overridden with ``withAttributes`` on :class:`Flow`, :class:`Source`, :class:`Sink` and :class:`Graph`.

-Let's assume we have a stream of tweets readily available, in Akka this is expressed as a :class:`Source[Out, M]`:
+Let's assume we have a stream of tweets readily available. In Akka this is expressed as a :class:`Source[Out, M]`:

 .. includecode:: code/docs/stream/TwitterStreamQuickstartDocSpec.scala#tweet-source

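The setup the hunk above describes can be sketched as follows. This is a minimal, hypothetical sketch, not the quickstart's actual code: the data model (``Author``, ``Hashtag``, ``Tweet``) follows the names used in ``TwitterStreamQuickstartDocSpec``, the sample tweet is invented, and the materialized type of ``Source(...)`` is ``NotUsed`` only in newer Akka versions (older ones used ``Unit``):

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.{ ActorMaterializer, ActorMaterializerSettings }
import akka.stream.scaladsl.Source

// Data model as named in the quickstart's TwitterStreamQuickstartDocSpec
final case class Author(handle: String)
final case class Hashtag(name: String)
final case class Tweet(author: Author, timestamp: Long, body: String) {
  def hashtags: Set[Hashtag] =
    body.split(" ").collect { case t if t.startsWith("#") => Hashtag(t) }.toSet
}

implicit val system = ActorSystem("reactive-tweets")

// Settings are optional; here we also tune the default input buffer sizes
implicit val materializer = ActorMaterializer(
  ActorMaterializerSettings(system).withInputBuffer(initialSize = 4, maxSize = 16))

// A static sample standing in for a live tweet feed
val tweets: Source[Tweet, NotUsed] =
  Source(List(Tweet(Author("rolandkuhn"), System.currentTimeMillis, "#akka rocks!")))
```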
@@ -52,14 +52,14 @@ only make sense in streaming and vice versa):
 .. includecode:: code/docs/stream/TwitterStreamQuickstartDocSpec.scala#authors-filter-map

 Finally in order to :ref:`materialize <stream-materialization-scala>` and run the stream computation we need to attach
-the Flow to a :class:`Sink` that will get the flow running. The simplest way to do this is to call
+the Flow to a :class:`Sink` that will get the Flow running. The simplest way to do this is to call
 ``runWith(sink)`` on a ``Source``. For convenience a number of common Sinks are predefined and collected as methods on
 the :class:`Sink` `companion object <http://doc.akka.io/api/akka-stream-and-http-experimental/@version@/#akka.stream.scaladsl.Sink$>`_.
 For now let's simply print each author:

 .. includecode:: code/docs/stream/TwitterStreamQuickstartDocSpec.scala#authors-foreachsink-println

-or by using the shorthand version (which are defined only for the most popular sinks such as ``Sink.fold`` and ``Sink.foreach``):
+or by using the shorthand version (which is defined only for the most popular Sinks such as ``Sink.fold`` and ``Sink.foreach``):

 .. includecode:: code/docs/stream/TwitterStreamQuickstartDocSpec.scala#authors-foreach-println

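The two ways of attaching a Sink discussed above can be sketched like this (a hedged sketch assuming the ``tweets`` source and ``Author`` type from earlier, plus an implicit materializer in scope):

```scala
import akka.stream.scaladsl.Sink

// Extract each tweet's author
val authors = tweets.map(_.author)

// Attach a Sink explicitly and run the stream:
authors.runWith(Sink.foreach(println))

// ...or use the shorthand defined for the most popular Sinks:
authors.runForeach(println)
```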
@@ -85,15 +85,14 @@ combinator:
 due to the risk of deadlock (with merge being the preferred strategy), and second, the monad laws would not hold for
 our implementation of flatMap (due to the liveness issues).

-Please note that the mapConcat requires the supplied function to return a strict collection (``f:Out=>immutable.Seq[T]``),
+Please note that ``mapConcat`` requires the supplied function to return a strict collection (``f:Out=>immutable.Seq[T]``),
 whereas ``flatMap`` would have to operate on streams all the way through.


 Broadcasting a stream
 ---------------------
 Now let's say we want to persist all hashtags, as well as all author names from this one live stream.
 For example we'd like to write all author handles into one file, and all hashtags into another file on disk.
-This means we have to split the source stream into 2 streams which will handle the writing to these different files.
+This means we have to split the source stream into two streams which will handle the writing to these different files.

 Elements that can be used to form such "fan-out" (or "fan-in") structures are referred to as "junctions" in Akka Streams.
 One of these that we'll be using in this example is called :class:`Broadcast`, and it simply emits elements from its

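A sketch of such a ``Broadcast`` fan-out, using the ``GraphDSL`` of newer Akka versions (older releases used ``FlowGraph``); the ``writeAuthors`` and ``writeHashtags`` Sinks are hypothetical stand-ins for the file-writing Sinks, and ``tweets``/``Tweet`` are assumed from earlier:

```scala
import akka.stream.ClosedShape
import akka.stream.scaladsl.{ Broadcast, Flow, GraphDSL, RunnableGraph, Sink }

// Hypothetical stand-ins for the file-writing Sinks
val writeAuthors: Sink[Author, _]   = Sink.ignore
val writeHashtags: Sink[Hashtag, _] = Sink.ignore

// Broadcast emits each incoming Tweet to both of its outputs
val g = RunnableGraph.fromGraph(GraphDSL.create() { implicit b =>
  import GraphDSL.Implicits._
  val bcast = b.add(Broadcast[Tweet](2))
  tweets ~> bcast.in
  bcast.out(0) ~> Flow[Tweet].map(_.author)                ~> writeAuthors
  // mapConcat flattens each tweet into a strict collection of its hashtags
  bcast.out(1) ~> Flow[Tweet].mapConcat(_.hashtags.toList) ~> writeHashtags
  ClosedShape
})
g.run()
```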
@@ -120,14 +119,12 @@ Both :class:`Graph` and :class:`RunnableGraph` are *immutable, thread-safe, and

 A graph can also have one of several other shapes, with one or more unconnected ports. Having unconnected ports
-expresses a grapth that is a *partial graph*. Concepts around composing and nesting graphs in large structures are
-explained explained in detail in :ref:`composition-scala`. It is also possible to wrap complex computation graphs
+expresses a graph that is a *partial graph*. Concepts around composing and nesting graphs in large structures are
+explained in detail in :ref:`composition-scala`. It is also possible to wrap complex computation graphs
 as Flows, Sinks or Sources, which will be explained in detail in
 :ref:`constructing-sources-sinks-flows-from-partial-graphs-scala`.


 Back-pressure in action
 -----------------------

 One of the main advantages of Akka Streams is that they *always* propagate back-pressure information from stream Sinks
 (Subscribers) to their Sources (Publishers). It is not an optional feature, and is enabled at all times. To learn more
 about the back-pressure protocol used by Akka Streams and all other Reactive Streams compatible implementations read

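A sketch of how a slow consumer can be handled explicitly instead of relying on implicit buffering; ``slowComputation`` is a hypothetical expensive function, and ``tweets`` is assumed from earlier:

```scala
import akka.stream.OverflowStrategy
import akka.stream.scaladsl.Sink

// Hypothetical slow per-element processing
def slowComputation(t: Tweet): Tweet = { Thread.sleep(100); t }

tweets
  .buffer(10, OverflowStrategy.dropHead) // when full, drop the oldest buffered tweet
  .map(slowComputation)
  .runWith(Sink.ignore)
```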
@@ -153,30 +150,30 @@ So far we've been only processing data using Flows and consuming it into some ki
 values or storing them in some external system. However sometimes we may be interested in some value that can be
 obtained from the materialized processing pipeline. For example, we want to know how many tweets we have processed.
 While this question is not as obvious to give an answer to in case of an infinite stream of tweets (one way to answer
-this question in a streaming setting would to create a stream of counts described as "*up until now*, we've processed N tweets"),
+this question in a streaming setting would be to create a stream of counts described as "*up until now*, we've processed N tweets"),
 but in general it is possible to deal with finite streams and come up with a nice result such as a total count of elements.

 First, let's write such an element counter using ``Sink.fold`` and see what the types look like:

 .. includecode:: code/docs/stream/TwitterStreamQuickstartDocSpec.scala#tweets-fold-count

-First we prepare a reusable ``Flow`` that will change each incoming tweet into an integer of value ``1``.
-We'll use this in order to combine those ones with a ``Sink.fold`` will sum all ``Int`` elements of the stream
-and make its result available as a ``Future[Int]``. Next we connect the ``tweets`` stream though a ``map`` step which
-converts each tweet into the number ``1``, finally we connect the flow using ``toMat`` the previously prepared Sink.
+First we prepare a reusable ``Flow`` that will change each incoming tweet into an integer of value ``1``. We'll use this in
+order to combine those with a ``Sink.fold`` that will sum all ``Int`` elements of the stream and make its result available as
+a ``Future[Int]``. Next we connect the ``tweets`` stream to ``count`` with ``via``. Finally we connect the Flow to the previously
+prepared Sink using ``toMat``.

 Remember those mysterious ``Mat`` type parameters on ``Source[+Out, +Mat]``, ``Flow[-In, +Out, +Mat]`` and ``Sink[-In, +Mat]``?
 They represent the type of values these processing parts return when materialized. When you chain these together,
-you can explicitly combine their materialized values: in our example we used the ``Keep.right`` predefined function,
+you can explicitly combine their materialized values. In our example we used the ``Keep.right`` predefined function,
 which tells the implementation to only care about the materialized type of the stage currently appended to the right.
-As you can notice, the materialized type of sumSink is ``Future[Int]`` and because of using ``Keep.right``,
-the resulting :class:`RunnableGraph` has also a type parameter of ``Future[Int]``.
+The materialized type of ``sumSink`` is ``Future[Int]`` and because of using ``Keep.right``, the resulting :class:`RunnableGraph`
+also has a type parameter of ``Future[Int]``.

 This step does *not* yet materialize the
 processing pipeline, it merely prepares the description of the Flow, which is now connected to a Sink, and therefore can
 be ``run()``, as indicated by its type: ``RunnableGraph[Future[Int]]``. Next we call ``run()`` which uses the implicit :class:`ActorMaterializer`
-to materialize and run the flow. The value returned by calling ``run()`` on a ``RunnableGraph[T]`` is of type ``T``.
-In our case this type is ``Future[Int]`` which, when completed, will contain the total length of our tweets stream.
+to materialize and run the Flow. The value returned by calling ``run()`` on a ``RunnableGraph[T]`` is of type ``T``.
+In our case this type is ``Future[Int]`` which, when completed, will contain the total length of our ``tweets`` stream.
 In case of the stream failing, this future would complete with a Failure.

 A :class:`RunnableGraph` may be reused

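The counting pipeline described in the hunk above can be sketched as follows (a hedged sketch mirroring the quickstart's ``tweets-fold-count`` snippet; ``tweets`` and ``Tweet`` are assumed from earlier, and an implicit materializer must be in scope):

```scala
import scala.concurrent.Future
import akka.stream.scaladsl.{ Flow, Keep, RunnableGraph, Sink }

// Map every tweet to 1, then sum with Sink.fold
val count: Flow[Tweet, Int, _]       = Flow[Tweet].map(_ => 1)
val sumSink: Sink[Int, Future[Int]]  = Sink.fold[Int, Int](0)(_ + _)

// Keep.right keeps the materialized value of the right-hand stage, the Sink
val counterGraph: RunnableGraph[Future[Int]] =
  tweets.via(count).toMat(sumSink)(Keep.right)

// run() materializes the graph; the result is the Sink's Future[Int]
val sum: Future[Int] = counterGraph.run()
sum.foreach(c => println(s"Total tweets processed: $c"))(system.dispatcher)
```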