254 lines
12 KiB
ReStructuredText
254 lines
12 KiB
ReStructuredText
.. _io-java-codec:
|
||
|
||
Encoding and decoding binary data
|
||
=================================
|
||
|
||
.. warning::
|
||
|
||
The IO implementation is marked as **“experimental”** as of its introduction
|
||
in Akka 2.2.0. We will continue to improve this API based on our users’
|
||
feedback, which implies that while we try to keep incompatible changes to a
|
||
minimum the binary compatibility guarantee for maintenance releases does not
|
||
apply to the contents of the `akka.io` package.
|
||
|
||
Akka adopted and adapted the implementation of data processing pipelines found
|
||
in the ``spray-io`` module. The idea is that encoding and decoding often
|
||
go hand in hand and keeping the code pertaining to one protocol layer together
|
||
is deemed more important than writing down the complete read side—say—in the
|
||
iteratee style in one go; pipelines encourage packaging the stages in a form
|
||
which lends itself better to reuse in a protocol stack. Another reason for
|
||
choosing this abstraction is that it is at times necessary to change the
|
||
behavior of encoding and decoding within a stage based on a message stream’s
|
||
state, and pipeline stages allow communication between the read and write
|
||
halves quite naturally.
|
||
|
||
The actual byte-fiddling can be done within pipeline stages, for example using
|
||
the rich API of :class:`ByteIterator` and :class:`ByteStringBuilder` as shown
|
||
below. All these activities are synchronous transformations which benefit
|
||
greatly from CPU affinity to make good use of those data caches. Therefore the
|
||
design of the pipeline infrastructure is completely synchronous, every stage’s
|
||
handler code can only directly return the events and/or commands resulting from
|
||
an input, there are no callbacks. Exceptions thrown within a pipeline stage
|
||
will abort processing of the whole pipeline under the assumption that
|
||
recoverable error conditions will be signaled in-band to the next stage instead
|
||
of raising an exception.
|
||
|
||
An overall “logical” pipeline can span multiple execution contexts, for example
|
||
starting with the low-level protocol layers directly within an actor handling
|
||
the reads and writes to a TCP connection and then being passed to a number of
|
||
higher-level actors which do the costly application level processing. This is
|
||
supported by feeding the generated events into a sink which sends them to
|
||
another actor, and that other actor will then upon reception feed them into its
|
||
own pipeline.
|
||
|
||
Introducing the Sample Protocol
|
||
-------------------------------
|
||
|
||
In the following the process of implementing a protocol stack using pipelines
|
||
is demonstrated on the following simple example:
|
||
|
||
.. code-block:: text
|
||
|
||
frameLen: Int
|
||
persons: Int
|
||
persons times {
|
||
first: String
|
||
last: String
|
||
}
|
||
points: Int
|
||
points times Double
|
||
|
||
mapping to the following data type:
|
||
|
||
.. includecode:: code/docs/io/japi/Message.java#message
|
||
|
||
We will split the handling of this protocol into two parts: the frame-length
|
||
encoding handles the buffering necessary on the read side and the actual
|
||
encoding of the frame contents is done in a separate stage.
|
||
|
||
Building a Pipeline Stage
|
||
-------------------------
|
||
|
||
As a common example, which is also included in the ``akka-actor`` package, let
|
||
us look at a framing protocol which works by prepending a length field to each
|
||
message (the following is a simplified version for demonstration purposes, the
|
||
real implementation is more configurable and implemented in Scala).
|
||
|
||
.. includecode:: code/docs/io/japi/LengthFieldFrame.java
|
||
:include: frame
|
||
|
||
In the end a pipeline stage is nothing more than a set of three methods: one
|
||
transforming commands arriving from above, one transforming events arriving
|
||
from below and the third transforming incoming management commands (not shown
|
||
here, see below for more information). The result of the transformation can in
|
||
either case be a sequence of commands flowing downwards or events flowing
|
||
upwards (or a combination thereof).
|
||
|
||
In the case above the data type for commands and events are equal as both
|
||
functions operate only on ``ByteString``, and the transformation does not
|
||
change that type because it only adds or removes four octets at the front.
|
||
|
||
The pair of command and event transformation functions is represented by an
|
||
object of type :class:`AbstractPipePair`, or in this case a
|
||
:class:`AbstractSymmetricPipePair`. This object could benefit from knowledge
|
||
about the context it is running in, for example an :class:`Actor`, and this
|
||
context is introduced by making a :class:`PipelineStage` be a factory for
|
||
producing a :class:`PipePair`. The factory method is called :meth:`apply` (a
|
||
Scala tradition) and receives the context object as its argument. The
|
||
implementation of this factory method could now make use of the context in
|
||
whatever way it sees fit, you will see an example further down.
|
||
|
||
Manipulating ByteStrings
|
||
------------------------
|
||
|
||
The second stage of our sample protocol stack illustrates in more depth what
|
||
showed only a little in the pipeline stage built above: constructing and
|
||
deconstructing byte strings. Let us first take a look at the encoder:
|
||
|
||
.. includecode:: code/docs/io/japi/MessageStage.java
|
||
:include: format
|
||
:exclude: decoding-omitted,omitted
|
||
|
||
Note how the byte order to be used by this stage is fixed in exactly one place,
|
||
making it impossible get wrong between commands and events; the way how the
|
||
byte order is passed into the stage demonstrates one possible use for the
|
||
stage’s ``context`` parameter.
|
||
|
||
The basic tool for constucting a :class:`ByteString` is a
|
||
:class:`ByteStringBuilder`. This builder is specialized for concatenating byte
|
||
representations of the primitive data types like ``Int`` and ``Double`` or
|
||
arrays thereof. Encoding a ``String`` requires a bit more work because not
|
||
only the sequence of bytes needs to be encoded but also the length, otherwise
|
||
the decoding stage would not know where the ``String`` terminates. When all
|
||
values making up the :class:`Message` have been appended to the builder, we
|
||
simply pass the resulting :class:`ByteString` on to the next stage as a command
|
||
using the optimized :meth:`singleCommand` facility.
|
||
|
||
.. warning::
|
||
|
||
The :meth:`singleCommand` and :meth:`singleEvent` methods provide a way to
|
||
generate responses which transfer exactly one result from one pipeline stage
|
||
to the next without suffering the overhead of object allocations. This means
|
||
that the returned collection object will not work for anything else (you will
|
||
get :class:`ClassCastExceptions`!) and this facility can only be used *EXACTLY
|
||
ONCE* during the processing of one input (command or event).
|
||
|
||
Now let us look at the decoder side:
|
||
|
||
.. includecode:: code/docs/io/japi/MessageStage.java
|
||
:include: decoding
|
||
|
||
The decoding side does the same things that the encoder does in the same order,
|
||
it just uses a :class:`ByteIterator` to retrieve primitive data types or arrays
|
||
of those from the underlying :class:`ByteString`. And in the end it hands the
|
||
assembled :class:`Message` as an event to the next stage using the optimized
|
||
:meth:`singleEvent` facility (see warning above).
|
||
|
||
Building a Pipeline
|
||
-------------------
|
||
|
||
Given the two pipeline stages introduced in the sections above we can now put
|
||
them to some use. First we define some message to be encoded:
|
||
|
||
.. includecode:: code/docs/io/japi/PipelineTest.java
|
||
:include: message
|
||
|
||
Then we need to create a pipeline context which satisfies our declared needs:
|
||
|
||
.. includecode:: code/docs/io/japi/PipelineTest.java
|
||
:include: byteorder
|
||
|
||
Building the pipeline and encoding this message then is quite simple:
|
||
|
||
.. includecode:: code/docs/io/japi/PipelineTest.java
|
||
:include: build-sink
|
||
|
||
First we *sequence* the two stages, i.e. attach them such that the output of
|
||
one becomes the input of the other. Then we create a :class:`PipelineSink`
|
||
which is essentially a callback interface for what shall happen with the
|
||
encoded commands or decoded events, respectively. Then we build the pipeline
|
||
using the :class:`PipelineFactory`, which returns an interface for feeding
|
||
commands and events into this pipeline instance. As a demonstration of how to
|
||
use this, we simply encode the message shown above and the resulting
|
||
:class:`ByteString` will then be sent to the ``commandHandler`` actor. Decoding
|
||
works in the same way, only using :meth:`injectEvent`.
|
||
|
||
Injecting into a pipeline using a :class:`PipelineInjector` will catch
|
||
exceptions resulting from processing the input, in which case the exception
|
||
(there can only be one per injection) is passed into the respective sink. The
|
||
default implementation of :meth:`onCommandFailure` and :meth:`onEventFailure`
|
||
will re-throw the exception (whence originates the ``throws`` declaration of
|
||
the ``inject*`` method).
|
||
|
||
Using the Pipeline’s Context
|
||
----------------------------
|
||
|
||
Up to this point there was always a parameter ``ctx`` which was used when
|
||
constructing a pipeline, but it was not explained in full. The context is a
|
||
piece of information which is made available to all stages of a pipeline. The
|
||
context may also carry behavior, provide infrastructure or helper methods etc.
|
||
It should be noted that the context is bound to the pipeline and as such must
|
||
not be accessed concurrently from different threads unless care is taken to
|
||
properly synchronize such access. Since the context will in many cases be
|
||
provided by an actor it is not recommended to share this context with code
|
||
executing outside of the actor’s message handling.
|
||
|
||
.. warning::
|
||
|
||
A PipelineContext instance *MUST NOT* be used by two different pipelines
|
||
since it contains mutable fields which are used during message processing.
|
||
|
||
Using Management Commands
|
||
-------------------------
|
||
|
||
Since pipeline stages do not have any reference to the pipeline or even to
|
||
their neighbors they cannot directly effect the injection of commands or events
|
||
outside of their normal processing. But sometimes things need to happen driven
|
||
by a timer, for example. In this case the timer would need to cause sending
|
||
tick messages to the whole pipeline, and those stages which wanted to receive
|
||
them would act upon those. In order to keep the type signatures for events and
|
||
commands useful, such external triggers are sent out-of-band, via a different
|
||
channel—the management port. One example which makes use of this facility is
|
||
the :class:`TickGenerator` which comes included with ``akka-actor`` (this is a
|
||
transcription of the Scala version which is actually included in the
|
||
``akka-actor`` JAR):
|
||
|
||
.. includecode:: code/docs/io/japi/HasActorContext.java#actor-context
|
||
|
||
.. includecode:: code/docs/io/japi/TickGenerator.java#tick-generator
|
||
|
||
This pipeline stage is to be used within an actor, and it will make use of this
|
||
context in order to schedule the delivery of ``Tick`` messages; the actor is
|
||
then supposed to feed these messages into the management port of the pipeline.
|
||
An example could look like this:
|
||
|
||
.. includecode:: code/docs/io/japi/Processor.java
|
||
:include: actor
|
||
:exclude: omitted
|
||
|
||
This actor extends our well-known pipeline with the tick generator and attaches
|
||
the outputs to functions which send commands and events to actors for further
|
||
processing. The pipeline stages will then all receive on ``Tick`` per second
|
||
which can be used like so:
|
||
|
||
.. includecode:: code/docs/io/japi/MessageStage.java
|
||
:include: mgmt-ticks
|
||
:exclude: omitted
|
||
|
||
.. note::
|
||
|
||
Management commands are delivered to all stages of a pipeline “effectively
|
||
parallel”, like on a broadcast medium. No code will actually run concurrently
|
||
since a pipeline is strictly single-threaded, but the order in which these
|
||
commands are processed is not specified.
|
||
|
||
The intended purpose of management commands is for each stage to define its
|
||
special command types and then listen only to those (where the aforementioned
|
||
``Tick`` message is a useful counter-example), exactly like sending packets on
|
||
a wifi network where every station receives all traffic but reacts only to
|
||
those messages which are destined for it.
|
||
|
||
If you need all stages to react upon something in their defined order, then
|
||
this must be modeled either as a command or event, i.e. it will be part of the
|
||
“business” type of the pipeline.
|
||
|