Merge pull request #16945 from akka/wip-stream-design-docs-1

overhaul stream-design-rst after recent changes
2015-02-26 17:16:55 +01:00 · 2015-02-26 17:16:55 +01:00 · 364774cd6c
commit 364774cd6c
parent 81fb5c14cb 1b90aae878
1 changed files with 9 additions and 10 deletions
--- a/akka-docs-dev/rst/stream-design.rst
+++ b/akka-docs-dev/rst/stream-design.rst
@ -25,7 +25,7 @@ This means that we provide all the tools necessary to express any stream process
 Resulting Implementation Constraints
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-Compositionality entails reusability of partial stream topologies, which led us to the lifted approach of describing data flows as (Partial)FlowGraphs that can act as composite sources, flows and sinks of data. These building blocks shall then be freely shareable, with the ability to combine them freely to form larger flows. The representation of these pieces must therefore be an immutable blueprint that is materialized in an explicit step in order to start the stream processing; the resulting stream processing engine is then also immutable in the sense of having a fixed topology that is prescribed by the blueprint; dynamic networks need to be modeled by explicitly using the Reactive Streams interfaces for plugging different engines together.
+Compositionality entails reusability of partial stream topologies, which led us to the lifted approach of describing data flows as (partial) FlowGraphs that can act as composite sources, flows (a.k.a. pipes) and sinks of data. These building blocks shall then be freely shareable, with the ability to combine them freely to form larger flows. The representation of these pieces must therefore be an immutable blueprint that is materialized in an explicit step in order to start the stream processing. The resulting stream processing engine is then also immutable in the sense of having a fixed topology that is prescribed by the blueprint. Dynamic networks need to be modeled by explicitly using the Reactive Streams interfaces for plugging different engines together.

 The process of materialization may be parameterized, e.g. instantiating a blueprint for handling a TCP connection’s data with specific information about the connection’s address and port information. Additionally, materialization will often create specific objects that are useful to interact with the processing engine once it is running, for example for shutting it down or for extracting metrics. This means that the materialization function takes a set of parameters from the outside and it produces a set of results. Compositionality demands that these two sets cannot interact, because that would establish a covert channel by which different pieces could communicate, leading to problems of initialization order and inscrutable runtime failures.

@ -34,9 +34,9 @@ Another aspect of materialization is that we want to support distributed stream
 Interoperation with other Reactive Streams implementations
 ----------------------------------------------------------

-Akka Streams fully implement the Reactive Streams specification and interoperate with all other conformant implementations. We chose to completely separate the Reactive Streams interfaces (which we regard to be an SPI) from the user-level API. In order to obtain a :class:`Publisher` or :class:`Subscriber` from an Akka Stream topology, a corresponding :class:`PublisherSink` or :class:`SubscriberSource` must be used.
+Akka Streams fully implement the Reactive Streams specification and interoperate with all other conformant implementations. We chose to completely separate the Reactive Streams interfaces (which we regard to be an SPI) from the user-level API. In order to obtain a :class:`Publisher` or :class:`Subscriber` from an Akka Stream topology, a corresponding ``Sink.publisher`` or ``Source.subscriber`` element must be used.

-All stream Processors produced by the default materialization of Akka Streams are restricted to having a single Subscriber, additional Subscribers will be rejected. The reason for this is that the stream topologies described using our DSL never require fan-out behavior from the Publisher sides of the elements, all fan-out is done using explicit elements like ``Broadcast[T]``.
+All stream Processors produced by the default materialization of Akka Streams are restricted to having a single Subscriber, additional Subscribers will be rejected. The reason for this is that the stream topologies described using our DSL never require fan-out behavior from the Publisher sides of the elements, all fan-out is done using explicit elements like :class:`Broadcast[T]`.

 This means that ``Sink.fanoutPublisher`` must be used where multicast behavior is needed for interoperation with other Reactive Streams implementations.

@ -64,9 +64,8 @@ Akka Streams must enable a library to express any stream processing utility in t
  * Source: something with exactly one output stream
  * Sink: something with exactly one input stream
  * Flow: something with exactly one input and one output stream
-  * BidirectionalFlow: something with exactly two input streams and two output streams that behave like two Flows of opposite direction
-
-Other topologies can always be expressed as a combination of a PartialFlowGraph with a set of inputs and a set of outputs. The preferred form of such an expression is an object that combines these three elements, favoring object composition over class inheritance.
+  * BidirectionalFlow: something with exactly two input streams and two output streams that conceptually behave like two Flows of opposite direction
+  * Graph: a packaged stream processing topology that exposes a certain set of input and output ports, characterized by an object of type :class:`Shape`.

 .. note::

@ -101,10 +100,10 @@ The finer points of stream materialization

 It is commonly necessary to parameterize a flow so that it can be materialized for different arguments, an example would be the handler Flow that is given to a server socket implementation and materialized for each incoming connection with information about the peer’s address. On the other hand it is frequently necessary to retrieve specific objects that result from materialization, for example a ``Future[Unit]`` that signals the completion of a ``ForeachSink``.

-It might be tempting to allow different pieces of a flow topology to access the materialization results of other pieces in order to customize their behavior, but that would violate composability and reusability as argued above. Therefore stream materialization is instead split into three phases:
+It might be tempting to allow different pieces of a flow topology to access the materialization results of other pieces in order to customize their behavior, but that would violate composability and reusability as argued above. Therefore the arguments and results of materialization need to be segregated:

-  * **Create:** first all execution units (Actors) are created, having access to the set of input parameters for the current materialization and producing key–value pairs that are placed in the MaterializedMap,
-  * **Resolve:** each flow element may declare derived keys that are calculated from other keys and added to the MaterializedMap; a derived key cannot depend on another derived key,
-  * **Initialize:** each flow element is finally initialized with the full MaterializedMap from the previous two phases; this will usually not do anything, but it allows certain elements to calculate their real behavior at this late stage.
+  * The FlowMaterializer is configured with a (type-safe) mapping from keys to values, which is exposed to the processing stages during their materialization.
+  * The values in this mapping may act as channels, for example by using a Promise/Future pair to communicate a value; another possibility for such information-passing is of course to explicitly model it as a stream of configuration data elements within the graph itself.
+  * The materialized values obtained from the processing stages are combined as prescribed by the user, but can of course be dependent on the values in the argument mapping.

 To avoid having to use ``Future`` values as key bindings, materialization itself may become fully asynchronous. This would allow for example the use of the bound server port within the rest of the flow, and only if the port was actually bound successfully. The downside is that some APIs will then return ``Future[MaterializedMap]``, which means that others will have to accept this in turn in order to keep the usage burden as low as possible.