update/improve design docs as background info for IO layer, see #2890

2013-01-11 10:03:39 +01:00 · ab228c0941
commit ab228c0941
parent b99c5dc1ed
1 changed files with 60 additions and 51 deletions
--- a/akka-docs/rst/dev/io-layer.rst
+++ b/akka-docs/rst/dev/io-layer.rst
@@ -1,73 +1,83 @@
 .. _io-layer:

-#######################
-Design of the I/O Layer
-#######################
+################
+I/O Layer Design
+################

 The ``akka.io`` package has been developed in collaboration between the Akka
-team and Mathias Doenitz & Johannes Rudolph from the `Spray framework`_. It has
-been influenced by the experiences with the ``spray-io`` module and adapted for
+and `spray.io`_ teams. Its design incorporates the experiences with the
+``spray-io`` module along with improvements that were jointly developed for
 more general consumption as an actor-based service.

-The Underlying Requirements
-===========================
+Requirements
+============

-In order to be suitable as the basic IO layer for Spray’s HTTP handling as well
-as for Akka remoting, the following requirements were driving the design:
+In order to form a general and extensible IO layer basis for a wide range of
+applications, with Akka remoting and spray HTTP being the initial ones, the
+following requirements were established as key drivers for the design:

 * scalability to millions of concurrent connections

-* lowest possible latency in getting data from the input channel into the
+* lowest possible latency in getting data from an input channel into the
  target actor’s mailbox

-* maximize throughput at the same time
+* maximal throughput

 * optional back-pressure in both directions (i.e. throttling local senders as
-  well as allowing local readers to throttle remote senders where the protocol
-  allows this)
+  well as allowing local readers to throttle remote senders, where allowed by
+  the protocol)

-* a purely actor-based API with immutable representation of data
+* a purely actor-based API with immutable data representation

 * extensibility for integrating new transports by way of a very lean SPI; the
  goal is to not force I/O mechanisms into a lowest common denominator but
  instead allow completely protocol-specific user-level APIs.

-The Basic Principle
-===================
+Basic Architecture
+==================

-Each transport implementation will be a separate Akka extension, offering an
-:class:`ActorRef` representing the main point of entry for client code: this
-manager accepts requests for establishing a communications channel (e.g.
-connect or listen on a TCP socket). Each communications channel is represented
-as one actor which is exposed to the client code for all interaction with this
-channel.
+Each transport implementation will be made available as a separate Akka
+extension, offering an :class:`ActorRef` representing the initial point of
+contact for client code. This "manager" accepts requests for establishing a
+communications channel (e.g. connect or listen on a TCP socket). Each
+communications channel is represented by one dedicated actor, which is exposed
+to client code for all interaction with this channel over its entire lifetime.

-The core piece of the implementation is the transport-specific “selector” actor;
-in the example of TCP this would wrap a :class:`java.nio.channels.Selector`.
-The channel actors register their interest in readability or writability of the
-underlying channel by sending corresponding messages to their assigned selector
-actor. An important point for achieving low latency is to hand off the actual
-reading and writing to the channel actor, so that the selector actor’s only
-responsibility is the management of the underlying selector’s key set and the
-actual select operation (which is typically blocking).
+The central element of the implementation is the transport-specific “selector”
+actor; in the case of TCP this would wrap a :class:`java.nio.channels.Selector`.
+The channel actors register their interest in readability or writability of
+their channel by sending corresponding messages to their assigned selector
+actor. However, the actual channel reading and writing is performed by the
+channel actors themselves, which frees the selector actors from time-consuming
+tasks and thereby ensures low latency. The selector actor's only responsibility
+is the management of the underlying selector's key set and the actual select
+operation, which is typically the only operation that blocks.

-The assignment of channels to selectors is done for the lifetime of a channel
-by the manager actor; the natural choice is to have the manager supervise the
-selectors, which in turn supervise their channels. In order to allow the
-manager to make informed decisions, the selectors keep the manager updated
-about their fill level by sending a message every time a channel is terminated.
+The assignment of channels to selectors is performed by the manager actor and
+remains unchanged for the entire lifetime of a channel. The manager actor
+"stripes" new channels across one or more selector actors based on some
+implementation-specific distribution logic. This logic may be delegated (in
+part) to the selector actors, which could, for example, choose to reject the
+assignment of a new channel when they consider themselves to be at capacity.

-Back-pressure for output is enabled by allowing the writer to specify within
-the :class:`Write` messages whether it wants to receive an acknowledgement for
-enqueuing that write to the O/S kernel.  Back-pressure for input is propagated
-by back sending a message to the channel actor which will take the underlying
-channel out of the selector until a corresponding resume command is received.
-In the case of transports with flow control—like TCP—the act of not consuming
-data from the stream at the receiving end is propagated back to the sender,
-linking these two mechanisms across the network.
+The manager actor creates (and therefore supervises) the selector actors, which
+in turn create and supervise their channel actors. The actor hierarchy of one
+single transport implementation therefore consists of three distinct actor
+levels, with the manager actor at the top, the selector actors in the middle
+and the channel actors at the leaf level.

-Benefits Resulting from this Design
-===================================
+Back-pressure for output is enabled by allowing the user to specify within its
+:class:`Write` messages whether it wants to receive an acknowledgement for
+enqueuing that write to the O/S kernel. Back-pressure for input is enabled by
+sending the channel actor a message which temporarily disables read interest
+for the channel until reading is re-enabled with a corresponding resume command.
+In the case of transports with flow control—like TCP—the act of not
+consuming data at the receiving end (thereby causing it to remain in the
+kernel's read buffers) is propagated back to the sender, linking these two
+mechanisms across the network.
+
+Design Benefits
+===============

 Staying within the actor model for the whole implementation allows us to remove
 the need for explicit thread handling logic, and it also means that there are
@@ -81,7 +91,7 @@ traditional solutions with explicit thread management and synchronization.
 Another benefit of supervision hierarchies is that clean-up of resources comes
 naturally: shutting down a selector actor will automatically clean up all
 channel actors, allowing proper closing of the channels and sending the
-appropriate messages to user-level client actors. DeathWatch allow the channel
+appropriate messages to user-level client actors. DeathWatch allows the channel
 actors to notice the demise of their user-level handler actors and terminate in
 an orderly fashion in that case as well; this naturally reduces the chances of
 leaking open channels.
@@ -95,13 +105,12 @@ How to go about Adding a New Transport
 ======================================

 The best start is to study the TCP reference implementation to get a good grip
-on the basic working principle and then design an implementation which is
+on the basic working principle and then design an implementation, which is
 similar in spirit, but adapted to the new protocol in question. There are vast
 differences between I/O mechanisms (e.g. compare file I/O to a message broker)
 and the goal of this I/O layer is explicitly **not** to shoehorn all of them
-into a uniform API, which is why only the basic working principle is documented
-here.
+into a uniform API, which is why only the basic architecture ideas are
+documented here.

-
-.. _Spray framework: http://spray.io
+.. _spray.io: http://spray.io
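
The revised text describes input back-pressure as temporarily disabling read interest for a channel until a resume command arrives. The following is a minimal sketch of that idea at the plain ``java.nio`` level, not the actual ``akka.io`` implementation: the class ``ReadInterestSketch`` and its methods are hypothetical names, there are no actors involved, and a ``Pipe`` stands in for a TCP socket so that the example is self-contained.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of akka.io.
final class ReadInterestSketch {

    /** Performs one select and reports whether the key was selected as readable. */
    static boolean selectedReadable(Selector selector, SelectionKey key, long timeoutMillis)
            throws IOException {
        selector.selectedKeys().clear();
        if (timeoutMillis > 0) {
            selector.select(timeoutMillis); // waits at most timeoutMillis
        } else {
            selector.selectNow();           // never blocks
        }
        return selector.selectedKeys().contains(key) && key.isReadable();
    }

    /** Returns { readable while suspended, readable after resume }. */
    static boolean[] demo() throws IOException {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try (Pipe.SourceChannel source = pipe.source();
                 Pipe.SinkChannel sink = pipe.sink()) {
                source.configureBlocking(false);
                SelectionKey key = source.register(selector, SelectionKey.OP_READ);

                // "Suspend reading": clear read interest. Data written from now on
                // stays in the kernel buffer and the selector does not report it.
                key.interestOps(key.interestOps() & ~SelectionKey.OP_READ);
                sink.write(ByteBuffer.wrap("ping".getBytes(StandardCharsets.UTF_8)));
                boolean whileSuspended = selectedReadable(selector, key, 0);

                // "Resume reading": restore read interest. The pending data is
                // now reported by the next select.
                key.interestOps(key.interestOps() | SelectionKey.OP_READ);
                boolean afterResume = selectedReadable(selector, key, 1000);

                return new boolean[] { whileSuspended, afterResume };
            }
        }
    }

    public static void main(String[] args) throws IOException {
        boolean[] result = demo();
        System.out.println("readable while suspended: " + result[0]);
        System.out.println("readable after resume:    " + result[1]);
    }
}
```

In this sketch the unread bytes simply wait in the pipe's buffer while read interest is cleared. With a TCP socket the same situation would eventually fill the receive buffer, which is how the flow control mentioned in the text reaches the remote sender.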