rework java/stream-graph.rst

This commit is contained in:
Roland Kuhn 2015-02-27 13:11:23 +01:00
parent 0c33e64340
commit 46a13f9f2b
2 changed files with 54 additions and 36 deletions

View file

@ -27,19 +27,19 @@ Akka Streams currently provide these junctions:
* **Fan-out**
- ``Broadcast<T>`` (1 input, n outputs) signals each output given an input signal,
- ``Balance<T>`` (1 input => n outputs), signals one of its output ports given an input signal,
- ``UnZip<A,B>`` (1 input => 2 outputs), which is a specialized element which is able to split a stream of ``Pair<A,B>`` into two streams one type ``A`` and one of type ``B``,
- ``FlexiRoute<In>`` (1 input, n outputs), which enables writing custom fan out elements using a simple DSL,
- ``Broadcast[T]`` *(1 input, N outputs)* given an input element emits to each output
- ``Balance[T]`` *(1 input, N outputs)* given an input element emits to one of its output ports
- ``UnZip[A,B]`` *(1 input, 2 outputs)* splits a stream of ``(A,B)`` tuples into two streams, one of type ``A`` and one of type ``B``
- ``FlexiRoute[In]`` *(1 input, N outputs)* enables writing custom fan out elements using a simple DSL
* **Fan-in**
- ``Merge<In>`` (n inputs , 1 output), picks signals randomly from inputs pushing them one by one to its output,
- ``MergePreferred<In>`` like :class:`Merge` but if elements are available on ``preferred`` port, it picks from it, otherwise randomly from ``others``,
- ``ZipWith<A,B,...,Out>`` (n inputs (defined upfront), 1 output), which takes a function of n inputs that, given all inputs are signalled, transforms and emits 1 output,
- ``Zip<A,B>`` (2 inputs, 1 output), which is a :class:`ZipWith` specialised to zipping input streams of ``A`` and ``B`` into a ``Pair<A,B>`` stream,
- ``Concat<A>`` (2 inputs, 1 output), which enables to concatenate streams (first consume one, then the second one), thus the order of which stream is ``first`` and which ``second`` matters,
- ``FlexiMerge<Out>`` (n inputs, 1 output), which enables writing custom fan-in elements using a simple DSL.
- ``Merge[In]`` *(N inputs , 1 output)* picks randomly from inputs pushing them one by one to its output
- ``MergePreferred[In]`` like :class:`Merge` but if elements are available on ``preferred`` port, it picks from it, otherwise randomly from ``others``
- ``ZipWith[A,B,...,Out]`` *(N inputs, 1 output)* which takes a function of N inputs that given a value for each input emits 1 output element
- ``Zip[A,B]`` *(2 inputs, 1 output)* is a :class:`ZipWith` specialised to zipping input streams of ``A`` and ``B`` into an ``(A,B)`` tuple stream
- ``Concat[A]`` *(2 inputs, 1 output)* concatenates two streams (first consume one, then the second one)
- ``FlexiMerge[Out]`` *(N inputs, 1 output)* enables writing custom fan-in elements using a simple DSL
One of the goals of the FlowGraph DSL is to look similar to how one would draw a graph on a whiteboard, so that it is
simple to translate a design from whiteboard to code and be able to relate those two. Let's illustrate this by translating
@ -49,7 +49,7 @@ the below hand drawn graph into Akka Streams:
Such graph is simple to translate to the Graph DSL since each linear element corresponds to a :class:`Flow`,
and each circle corresponds to either a :class:`Junction` or a :class:`Source` or :class:`Sink` if it is beginning
or ending a :class:`Flow`. Those are connected with the ``addEdge`` method of the :class:`FlowGraphBuilder`.
or ending a :class:`Flow`.
.. includecode:: ../../../akka-samples/akka-docs-java-lambda/src/test/java/docs/stream/FlowGraphDocTest.java#simple-flow-graph
@ -58,14 +58,23 @@ or ending a :class:`Flow`. Those are connected with the ``addEdge`` method of th
refers to the same location in the resulting graph).
By looking at the snippets above, it should be apparent that the :class:`FlowGraphBuilder` object is *mutable*.
By looking at the snippets above, it should be apparent that the ``builder`` object is *mutable*.
The reason for this design choice is to enable simpler creation of complex graphs, which may even contain cycles.
Once the FlowGraph has been constructed though, the :class:`FlowGraph` instance *is immutable, thread-safe, and freely shareable*.
Once the FlowGraph has been constructed though, the :class:`RunnableFlow` instance *is immutable, thread-safe, and freely shareable*.
The same is true of all flow pieces—sources, sinks, and flows—once they are constructed.
This means that you can safely re-use one given Flow in multiple places in a processing graph.
Linear Flows however are always immutable and appending an operation to a Flow always returns a new Flow instance.
This means that you can safely re-use one given Flow in multiple places in a processing graph. In the example below
we prepare a graph that consists of two parallel streams, in which we re-use the same instance of :class:`Flow`,
yet it will properly be materialized as two connections between the corresponding Sources and Sinks:
We have seen examples of such re-use already above: the merge and broadcast junctions were imported
into the graph using ``builder.graph(...)``, an operation that will make a copy of the blueprint that
is passed to it and return the inlets and outlets of the resulting copy so that they can be wired up.
Another alternative is to pass existing graphs—of any shape—into the factory method that produces a
new graph. The difference between these approaches is that importing using ``b.graph(...)`` ignores the
materialized value of the imported graph while importing via the factory method allows its inclusion;
for more details see :ref:`stream-materialization-scala`.
In the example below we prepare a graph that consists of two parallel streams,
in which we re-use the same instance of :class:`Flow`, yet it will properly be
materialized as two connections between the corresponding Sources and Sinks:
.. includecode:: ../../../akka-samples/akka-docs-java-lambda/src/test/java/docs/stream/FlowGraphDocTest.java#flow-graph-reusing-a-flow
@ -73,55 +82,62 @@ yet it will properly be materialized as two connections between the correspondin
Constructing and combining Partial Flow Graphs
----------------------------------------------
Sometimes it is not possible (or needed) to construct the entire computation graph in one place, but instead construct
all of its different phases in different places and in the end connect them all into a complete graph and run it.
This can be achieved using :class:`PartialFlowGraph`. The reason of representing it as a different type is that a
:class:`FlowGraph` requires all ports to be connected, and if they are not it will throw an exception at construction
time, which helps to avoid simple wiring errors while working with graphs. A partial flow graph however does not perform
this validation, and allows graphs that are not yet fully connected.
This can be achieved using ``FlowGraph.factory().partial()`` instead of
``FlowGraph.factory().closed()``, which will return a ``Graph`` instead of a
``RunnableFlow``. The reason of representing it as a different type is that a
:class:`RunnableFlow` requires all ports to be connected, and if they are not
it will throw an exception at construction time, which helps to avoid simple
wiring errors while working with graphs. A partial flow graph however allows
you to return the set of yet to be connected ports from the code block that
performs the internal wiring.
A :class:`PartialFlowGraph` is defined as a :class:`FlowGraph` which contains so called "undefined elements",
such as ``UndefinedSink<T>`` or ``UndefinedSource<T>``, which can be reused and plugged into by consumers of that
partial flow graph. Let's imagine we want to provide users with a specialized element that given 3 inputs will pick
the greatest int value of each zipped triple. We'll want to expose 3 input ports (undefined sources) and one output port
(undefined sink).
Let's imagine we want to provide users with a specialized element that given 3 inputs will pick
the greatest int value of each zipped triple. We'll want to expose 3 input ports (unconnected sources) and one output port
(unconnected sink).
.. includecode:: ../../../akka-samples/akka-docs-java-lambda/src/test/java/docs/stream/StreamPartialFlowGraphDocTest.java#simple-partial-flow-graph
As you can see, first we construct the partial graph that contains all the zipping and comparing of stream
elements, then we import it (all of its nodes and connections) explicitly to the :class:`FlowGraph` instance in which all
As you can see, first we construct the partial graph that describes how to compute the maximum of two input streams, then
we reuse that twice while constructing the partial graph that extends this to three input streams,
then we import it (all of its nodes and connections) explicitly to the :class:`FlowGraph` instance in which all
the undefined elements are rewired to real sources and sinks. The graph can then be run and yields the expected result.
.. warning::
Please note that a :class:`FlowGraph` is not able to provide compile time type-safety about whether or not all
elements have been properly connected - this validation is performed as a runtime check during the graph's instantiation.
elements have been properly connected—this validation is performed as a runtime check during the graph's instantiation.
A partial flow graph also verifies that all ports are either connected or part of the returned :class:`Shape`.
.. _constructing-sources-sinks-flows-from-partial-graphs-java:
Constructing Sources, Sinks and Flows from Partial Graphs
---------------------------------------------------------
Instead of treating a :class:`PartialFlowGraph` as simply a collection of flows and junctions which may not yet all be
connected it is sometimes useful to expose such a complex graph as a simpler structure,
such as a :class:`Source`, :class:`Sink` or :class:`Flow`.
In fact, these concepts can be easily expressed as special cases of a partially connected graph:
* :class:`Source` is a partial flow graph with *exactly one* :class:`UndefinedSink`,
* :class:`Sink` is a partial flow graph with *exactly one* :class:`UndefinedSource`,
* :class:`Flow` is a partial flow graph with *exactly one* :class:`UndefinedSource` and *exactly one* :class:`UndefinedSource`.
* :class:`Source` is a partial flow graph with *exactly one* output, that is it returns a :class:`SourceShape`.
* :class:`Sink` is a partial flow graph with *exactly one* input, that is it returns a :class:`SinkShape`.
* :class:`Flow` is a partial flow graph with *exactly one* input and *exactly one* output, that is it returns a :class:`FlowShape`.
Being able to hide complex graphs inside of simple elements such as Sink / Source / Flow enables you to easily create one
complex element and from there on treat it as simple compound stage for linear computations.
In order to create a Source from a partial flow graph ``Source`` provides a special apply method that takes a function
that must return an ``UndefinedSink``. This undefined sink will become "the sink that must be attached before this Source
can run". Refer to the example below, in which we create a Source that zips together two numbers, to see this graph
that must return an :class:`Outlet<T>`. This unconnected sink will become “the sink that must be attached before this Source
can run. Refer to the example below, in which we create a Source that zips together two numbers, to see this graph
construction in action:
.. includecode:: ../../../akka-samples/akka-docs-java-lambda/src/test/java/docs/stream/StreamPartialFlowGraphDocTest.java#source-from-partial-flow-graph
Similarly the same can be done for a ``Sink<T>``, in which case the returned value must be an ``UndefinedSource<T>``.
Similarly the same can be done for a ``Sink<T>``, in which case the returned value must be an ``Inlet<T>``.
For defining a ``Flow<T>`` we need to expose both an undefined source and sink:
.. includecode:: ../../../akka-samples/akka-docs-java-lambda/src/test/java/docs/stream/StreamPartialFlowGraphDocTest.java#flow-from-partial-flow-graph