Welcome to the first tutorial on how to get started with Akka and Java. We assume that you already know what Akka and Java are and will now focus on the steps necessary to start your first project.
There are two variations of this first tutorial:
- creating a standalone project and run it from the command line
- creating a Maven project and running it from within Maven
Since they are so similar we will present them both.
The sample application that we will create is using actors to calculate the value of Pi. Calculating Pi is a CPU intensive operation and we will utilize Akka Actors to write a concurrent solution that scales out to multi-core processors. This sample will be extended in future tutorials to use Akka Remote Actors to scale out on multiple machines in a cluster.
We will be using an algorithm that is called "embarrassingly parallel" which just means that each job is completely isolated and not coupled with any other job. Since this algorithm is so parallelizable it suits the actor model very well.
Here is the formula for the algorithm we will use:
..image:: pi-formula.png
In this particular algorithm the master splits the series into chunks which are sent out to each worker actor to be processed. When each worker has processed its chunk it sends a result back to the master which aggregates the total result.
If you want don't want to type in the code and/or set up a Maven project then you can check out the full tutorial from the Akka GitHub repository. It is in the ``akka-tutorials/akka-tutorial-first`` module. You can also browse it online `here <https://github.com/jboner/akka/tree/master/akka-tutorials/akka-tutorial-first>`_, with the actual source code `here <https://github.com/jboner/akka/blob/master/akka-tutorials/akka-tutorial-first/src/main/java/akka/tutorial/first/java/Pi.java>`_.
This tutorial assumes that you have Java 1.6 or later installed on you machine and ``java`` on your ``PATH``. You also need to know how to run commands in a shell (ZSH, Bash, DOS etc.) and a decent text editor or IDE to type in the Java code.
You need to make sure that ``$JAVA_HOME`` environment variable is set to the root of the Java distribution. You also need to make sure that the ``$JAVA_HOME/bin`` is on your ``PATH``::
$ export JAVA_HOME=..root of java distribution..
$ export PATH=$PATH:$JAVA_HOME/bin
You can test your installation by invoking ``java``::
$ java -version
java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07-334-10M3326)
Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02-334, mixed mode)
Downloading and installing Akka
-------------------------------
To build and run the tutorial sample from the command line, you have to download Akka. If you prefer to use SBT to build and run the sample then you can skip this section and jump to the next one.
Let's get the ``akka-1.1`` distribution of Akka core (not Akka Modules) from `http://akka.io/downloads <http://akka.io/downloads/>`_. Once you have downloaded the distribution unzip it in the folder you would like to have Akka installed in, in my case I choose to install it in ``/Users/jboner/tools/``, simply by unzipping it to this directory.
You need to do one more thing in order to install Akka properly: set the ``AKKA_HOME`` environment variable to the root of the distribution. In my case I'm opening up a shell, navigating down to the distribution, and setting the ``AKKA_HOME`` variable::
- In the ``dist`` directory we have the Akka JARs, including sources and docs.
- In the ``lib_managed/compile`` directory we have Akka's dependency JARs.
- In the ``deploy`` directory we have the sample JARs.
- In the ``scripts`` directory we have scripts for running Akka.
- Finally ``scala-library.jar`` is the JAR for the latest Scala distribution that Akka depends on.
The only JAR we will need for this tutorial (apart from the ``scala-library.jar`` JAR) is the ``akka-actor-1.1.jar`` JAR in the ``dist`` directory. This is a self-contained JAR with zero dependencies and contains everything we need to write a system using Actors.
Akka is very modular and has many JARs for containing different features. The core distribution has seven modules:
-``akka-actor-1.1.jar`` -- Standard Actors
-``akka-typed-actor-1.1.jar`` -- Typed Actors
-``akka-remote-1.1.jar`` -- Remote Actors
-``akka-stm-1.1.jar`` -- STM (Software Transactional Memory), transactors and transactional datastructures
-``akka-http-1.1.jar`` -- Akka Mist for continuation-based asynchronous HTTP and also Jersey integration
-``akka-slf4j-1.1.jar`` -- SLF4J Event Handler Listener for logging with SLF4J
-``akka-testkit-1.1.jar`` -- Toolkit for testing Actors
We also have Akka Modules containing add-on modules outside the core of Akka. You can download the Akka Modules distribution from TODO. It contains Akka core as well. We will not be needing any modules there today, but for your information the module JARs are these:
-``akka-kernel-1.1.jar`` -- Akka microkernel for running a bare-bones mini application server (embeds Jetty etc.)
-``akka-amqp-1.1.jar`` -- AMQP integration
-``akka-camel-1.1.jar`` -- Apache Camel Actors integration (it's the best way to have your Akka application communicate with the rest of the world)
-``akka-scalaz-1.1.jar`` -- Support for the Scalaz library
-``akka-spring-1.1.jar`` -- Spring framework integration
-``akka-osgi-dependencies-bundle-1.1.jar`` -- OSGi support
Downloading and installing Maven
--------------------------------
Maven is an excellent build system that can be used to build both Java and Scala projects. If you want to use Maven for this tutorial then follow the following instructions, if not you can skip this section and the next.
To install Maven it is easiest to follow the instructions on `http://maven.apache.org/download.html#Installation <http://maven.apache.org/download.html#Installation>`_.
If you have not already done so, now is the time to create a Maven project for our tutorial. You do that by stepping into the directory you want to create your project in and invoking the ``mvn`` command::
Now we have the basis for our Maven-based Akka project. Let's step into the project directory::
$ cd akka-tutorial-first-java
Here is the layout that Maven created::
akka-tutorial-first-jboner
|-- pom.xml
`-- src
|-- main
| `-- java
| `-- akka
| `-- tutorial
| `-- first
| `-- java
| `-- App.java
As you can see we already have a Java source file called ``App.java``, let's now rename it to ``Pi.java``.
We also need to edit the ``pom.xml`` build file. Let's add the dependency we need as well as the Maven repository it should download it from. It should now look something like this::
We start by creating a ``Pi.java`` file and adding these import statements at the top of the file::
package akka.tutorial.first.java;
import static akka.actor.Actors.actorOf;
import static akka.actor.Actors.poisonPill;
import static java.util.Arrays.asList;
import akka.actor.ActorRef;
import akka.actor.UntypedActor;
import akka.actor.UntypedActorFactory;
import akka.routing.CyclicIterator;
import akka.routing.InfiniteIterator;
import akka.routing.Routing.Broadcast;
import akka.routing.UntypedLoadBalancer;
import java.util.concurrent.CountDownLatch;
If you are using Maven in this tutorial then create the file in the ``src/main/java/akka/tutorial/first/java`` directory.
If you are using the command line tools then create the file wherever you want. I will create it in a directory called ``tutorial`` at the root of the Akka distribution, e.g. in ``$AKKA_HOME/tutorial/akka/tutorial/first/java/Pi.java``.
Creating the messages
---------------------
The design we are aiming for is to have one ``Master`` actor initiating the computation, creating a set of ``Worker`` actors. Then it splits up the work into discrete chunks, and sends these chunks to the different workers in a round-robin fashion. The master waits until all the workers have completed their work and sent back results for aggregation. When computation is completed the master prints out the result, shuts down all workers and then itself.
With this in mind, let's now create the messages that we want to have flowing in the system. We need three different messages:
-``Calculate`` -- sent to the ``Master`` actor to start the calculation
-``Work`` -- sent from the ``Master`` actor to the ``Worker`` actors containing the work assignment
-``Result`` -- sent from the ``Worker`` actors to the ``Master`` actor containing the result from the worker's calculation
Messages sent to actors should always be immutable to avoid sharing mutable state. So let's start by creating three messages as immutable POJOs. We also create a wrapper ``Pi`` class to hold our implementation::
public int getNrOfElements() { return nrOfElements; }
}
static class Result {
private final double value;
public Result(double value) {
this.value = value;
}
public double getValue() { return value; }
}
}
Creating the worker
-------------------
Now we can create the worker actor. This is done by extending in the ``UntypedActor`` base class and defining the ``onReceive`` method. The ``onReceive`` method defines our message handler. We expect it to be able to handle the ``Work`` message so we need to add a handler for this message::
As you can see we have now created an ``UntypedActor`` with a ``onReceive`` method as a handler for the ``Work`` message. In this handler we invoke the ``calculatePiFor(..)`` method, wrap the result in a ``Result`` message and send it back to the original sender using ``getContext().replyUnsafe(..)``. In Akka the sender reference is implicitly passed along with the message so that the receiver can always reply or store away the sender reference for future use.
The only thing missing in our ``Worker`` actor is the implementation on the ``calculatePiFor(..)`` method::
The master actor is a little bit more involved. In its constructor we need to create the workers (the ``Worker`` actors) and start them. We will also wrap them in a load-balancing router to make it easier to spread out the work evenly between the workers. Let's do that first::
static class Master extends UntypedActor {
...
static class PiRouter extends UntypedLoadBalancer {
private final InfiniteIterator<ActorRef> workers;
public PiRouter(ActorRef[] workers) {
this.workers = new CyclicIterator<ActorRef>(asList(workers));
}
public InfiniteIterator<ActorRef> seq() {
return workers;
}
}
public Master(...) {
...
// create the workers
final ActorRef[] workers = new ActorRef[nrOfWorkers];
As you can see we are using the ``actorOf`` factory method to create actors, this method returns as an ``ActorRef`` which is a reference to our newly created actor. This method is available in the ``Actors`` object but is usually imported::
One thing to note is that we used two different versions of the ``actorOf`` method. For creating the ``Worker`` actor we just pass in the class but to create the ``PiRouter`` actor we can't do that since the constructor in the ``PiRouter`` class takes arguments, instead we need to use the ``UntypedActorFactory`` which unfortunately is a bit more verbose.
``actorOf`` is the only way to create an instance of an Actor, this is enforced by Akka runtime. The ``actorOf`` method instantiates the actor and returns, not an instance to the actor, but an instance to an ``ActorRef``. This reference is the handle through which you communicate with the actor. It is immutable, serializable and location-aware meaning that it "remembers" its original actor even if it is sent to other nodes across the network and can be seen as the equivalent to the Erlang actor's PID.
The actor's life-cycle is:
- Created -- ``Actor.actorOf[MyActor]`` -- can **not** receive messages
- Started -- ``actorRef.start()`` -- can receive messages
- Stopped -- ``actorRef.stop()`` -- can **not** receive messages
Once the actor has been stopped it is dead and can not be started again.
Now we have a router that is representing all our workers in a single abstraction. If you paid attention to the code above, you saw that we were using the ``nrOfWorkers`` variable. This variable and others we have to pass to the ``Master`` actor in its constructor. So now let's create the master actor. We have to pass in three integer variables:
-``nrOfWorkers`` -- defining how many workers we should start up
-``nrOfMessages`` -- defining how many number chunks to send out to the workers
-``nrOfElements`` -- defining how big the number chunks sent to each worker should be
Here is the master actor::
static class Master extends UntypedActor {
private final int nrOfMessages;
private final int nrOfElements;
private final CountDownLatch latch;
private double pi;
private int nrOfResults;
private long start;
private ActorRef router;
static class PiRouter extends UntypedLoadBalancer {
private final InfiniteIterator<ActorRef> workers;
public PiRouter(ActorRef[] workers) {
this.workers = new CyclicIterator<ActorRef>(asList(workers));
First, we are passing in a ``java.util.concurrent.CountDownLatch`` to the ``Master`` actor. This latch is only used for plumbing (in this specific tutorial), to have a simple way of letting the outside world knowing when the master can deliver the result and shut down. In more idiomatic Akka code, as we will see in part two of this tutorial series, we would not use a latch but other abstractions and functions like ``Channel``, ``Future`` and ``!!!`` to achieve the same thing in a non-blocking way. But for simplicity let's stick to a ``CountDownLatch`` for now.
Second, we are adding a couple of life-cycle callback methods; ``preStart`` and ``postStop``. In the ``preStart`` callback we are recording the time when the actor is started and in the ``postStop`` callback we are printing out the result (the approximation of Pi) and the time it took to calculate it. In this call we also invoke ``latch.countDown()`` to tell the outside world that we are done.
But we are not done yet. We are missing the message handler for the ``Master`` actor. This message handler needs to be able to react to two different messages:
-``Calculate`` -- which should start the calculation
-``Result`` -- which should aggregate the different results
The ``Calculate`` handler is sending out work to all the ``Worker`` actors and after doing that it also sends a ``new Broadcast(poisonPill())`` message to the router, which will send out the ``PoisonPill`` message to all the actors it is representing (in our case all the ``Worker`` actors). ``PoisonPill`` is a special kind of message that tells the receiver to shut itself down using the normal shutdown method; ``getContext().stop()``, and is created through the ``poisonPill()`` method. We also send a ``PoisonPill`` to the router itself (since it's also an actor that we want to shut down).
The ``Result`` handler is simpler, here we get the value from the ``Result`` message and aggregate it to our ``pi`` member variable. We also keep track of how many results we have received back, and if that matches the number of tasks sent out, the ``Master`` actor considers itself done and shuts down.
Now the only thing that is left to implement is the runner that should bootstrap and run the calculation for us. We do that by adding a ``main`` method to the enclosing ``Pi`` class in which we create a new instance of ``Pi`` and invoke method ``calculate`` in which we start up the ``Master`` actor and wait for it to finish::
public class Pi {
public static void main(String[] args) throws Exception {
return new Master(nrOfWorkers, nrOfMessages, nrOfElements, latch);
}
}).start();
// start the calculation
master.sendOneWay(new Calculate());
// wait for master to shut down
latch.await();
}
}
Run it as a command line application
------------------------------------
To build and run the tutorial from the command line, you need to have the Scala library JAR on the classpath.
Scala can be downloaded from `http://www.scala-lang.org/downloads <http://www.scala-lang.org/downloads>`_. Browse there and download the Scala 2.9.0.RC1 release. If you pick the ``tgz`` or ``zip`` distribution then just unzip it where you want it installed. If you pick the IzPack Installer then double click on it and follow the instructions.
The ``scala-library.jar`` resides in the ``scala-2.9.0.RC1/lib`` directory. Copy that to your project directory.
If you have not typed in (or copied) the code for the tutorial as ``$AKKA_HOME/tutorial/akka/tutorial/first/java/Pi.java`` then now is the time. When that's done open up a shell and step in to the Akka distribution (``cd $AKKA_HOME``).
First we need to compile the source file. That is done with Java's compiler ``javac``. Our application depends on the ``akka-actor-1.1.jar`` and the ``scala-library.jar`` JAR files, so let's add them to the compiler classpath when we compile the source::
When we have compiled the source file we are ready to run the application. This is done with ``java`` but yet again we need to add the ``akka-actor-1.1.jar`` and the ``scala-library.jar`` JAR files to the classpath as well as the classes we compiled ourselves::
If you have not defined the ``AKKA_HOME`` environment variable then Akka can't find the ``akka.conf`` configuration file and will print out a ``Can’t load akka.conf`` warning. This is ok since it will then just use the defaults.
If you have not defined an the ``AKKA_HOME`` environment variable then Akka can't find the ``akka.conf`` configuration file and will print out a ``Can’t load akka.conf`` warning. This is ok since it will then just use the defaults.
Conclusion
----------
We have learned how to create our first Akka project using Akka's actors to speed up a computation-intensive problem by scaling out on multi-core processors (also known as scaling up). We have also learned to compile and run an Akka project using either the tools on the command line or the SBT build system.
If you have a multi-core machine then I encourage you to try out different number of workers (number of working actors) by tweaking the ``nrOfWorkers`` variable to for example; 2, 4, 6, 8 etc. to see performance improvement by scaling up.
Now we are ready to take on more advanced problems. In the next tutorial we will build on this one, refactor it into more idiomatic Akka and Scala code, and introduce a few new concepts and abstractions. Whenever you feel ready, join me in the `Getting Started Tutorial: Second Chapter <TODO>`_.