

.. _getting-started-first-scala:

#################################################
 Getting Started Tutorial (Scala): First Chapter
#################################################


Introduction
============

Welcome to the first tutorial on how to get started with Akka and Scala. We
assume that you already know what Akka and Scala are and will now focus on the
steps necessary to start your first project.

There are two variations of this first tutorial:

- creating a standalone project and running it from the command line
- creating an SBT (Simple Build Tool) project and running it from within SBT

Since they are so similar we will present them both.

The sample application that we will create uses actors to calculate the
value of Pi. Calculating Pi is a CPU-intensive operation and we will utilize
Akka Actors to write a concurrent solution that scales out to multi-core
processors. This sample will be extended in future tutorials to use Akka Remote
Actors to scale out on multiple machines in a cluster.

We will be using an algorithm that is called "embarrassingly parallel" which
just means that each job is completely isolated and not coupled with any other
job. Since this algorithm is so parallelizable it suits the actor model very
well.

Here is the formula for the algorithm we will use:

.. image:: ../images/pi-formula.png

In this particular algorithm the master splits the series into chunks which are
sent out to each worker actor to be processed. When each worker has processed
its chunk it sends a result back to the master which aggregates the total
result.
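
As a rough illustration of this chunking, here is a minimal sequential sketch in
plain Scala, without any actors. It assumes that the formula pictured above is
the Leibniz series (4 times the alternating sum of ``1 / (2n + 1)``). The names
``ChunkedPi``, ``chunk`` and ``estimate`` are made up for this example and are
not part of the tutorial code::

    object ChunkedPi {

      // Sums `nrOfElements` consecutive terms of the series,
      // starting at term index `start`.
      def chunk(start: Int, nrOfElements: Int): Double = {
        var acc = 0.0
        for (i <- start until (start + nrOfElements))
          acc += 4.0 * (1 - (i % 2) * 2) / (2 * i + 1)
        acc
      }

      // Splits the first nrOfChunks * nrOfElements terms into chunks and
      // adds up the partial sums, as the master does with worker results.
      def estimate(nrOfChunks: Int, nrOfElements: Int): Double =
        (0 until nrOfChunks).map(c => chunk(c * nrOfElements, nrOfElements)).sum
    }

Because every chunk depends only on its own ``start`` index and length, the
chunks can be computed in any order, or by different workers, before the partial
sums are added together.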


Prerequisites
=============

This tutorial assumes that you have Java 1.6 or later installed on your machine
and ``java`` on your ``PATH``. You also need to know how to run commands in a
shell (ZSH, Bash, DOS etc.) and have a decent text editor or IDE to type in the
Scala code.

You need to make sure that the ``JAVA_HOME`` environment variable is set to the
root of the Java distribution. You also need to make sure that
``$JAVA_HOME/bin`` is on your ``PATH``::

    $ export JAVA_HOME=..root of java distribution..
    $ export PATH=$PATH:$JAVA_HOME/bin

You can test your installation by invoking ``java``::

    $ java -version
    java version "1.6.0_24"
    Java(TM) SE Runtime Environment (build 1.6.0_24-b07-334-10M3326)
    Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02-334, mixed mode)


Downloading and installing Akka
===============================

To build and run the tutorial sample from the command line, you have to download
Akka. If you prefer to use SBT to build and run the sample then you can skip
this section and jump to the next one.

Let's get the ``akka-actors-2.0-SNAPSHOT.zip`` distribution of Akka from
http://akka.io/downloads/ which includes everything we need for this
tutorial. Once you have downloaded the distribution unzip it in the folder you
would like to have Akka installed in. In my case I choose to install it in
``/Users/jboner/tools/``, simply by unzipping it to this directory.

You need to do one more thing in order to install Akka properly: set the
``AKKA_HOME`` environment variable to the root of the distribution. In my case
I'm opening up a shell, navigating down to the distribution, and setting the
``AKKA_HOME`` variable::

    $ cd /Users/jboner/tools/akka-actors-2.0-SNAPSHOT
    $ export AKKA_HOME=`pwd`
    $ echo $AKKA_HOME
    /Users/jboner/tools/akka-actors-2.0-SNAPSHOT

The distribution looks like this::

    $ ls -1
    config
    doc
    lib
    src

- In the ``config`` directory we have the Akka conf files.
- In the ``doc`` directory we have the documentation, API, doc JARs, and also
  the source files for the tutorials.
- In the ``lib`` directory we have the Scala and Akka JARs.
- In the ``src`` directory we have the source JARs for Akka.


The only JAR we will need for this tutorial (apart from the
``scala-library.jar`` JAR) is the ``akka-actor-2.0-SNAPSHOT.jar`` JAR in the ``lib/akka``
directory. This is a self-contained JAR with zero dependencies and contains
everything we need to write a system using Actors.

Akka is very modular and has many JARs containing different features. The
core distribution has seven modules:

- ``akka-actor-2.0-SNAPSHOT.jar`` -- Standard Actors
- ``akka-typed-actor-2.0-SNAPSHOT.jar`` -- Typed Actors
- ``akka-remote-2.0-SNAPSHOT.jar`` -- Remote Actors
- ``akka-stm-2.0-SNAPSHOT.jar`` -- STM (Software Transactional Memory), transactors and transactional data structures
- ``akka-http-2.0-SNAPSHOT.jar`` -- Akka Mist for continuation-based asynchronous HTTP and also Jersey integration
- ``akka-slf4j-2.0-SNAPSHOT.jar`` -- SLF4J Event Handler Listener for logging with SLF4J
- ``akka-testkit-2.0-SNAPSHOT.jar`` -- Toolkit for testing Actors

The Akka Microkernel distribution also includes these jars:

- ``akka-kernel-2.0-SNAPSHOT.jar`` -- Akka microkernel for running a bare-bones mini application server (embeds Jetty etc.)
- ``akka-camel-2.0-SNAPSHOT.jar`` -- Apache Camel Actors integration (it's the best way to have your Akka application communicate with the rest of the world)
- ``akka-camel-typed-2.0-SNAPSHOT.jar`` -- Apache Camel Typed Actors integration
- ``akka-spring-2.0-SNAPSHOT.jar`` -- Spring framework integration


Downloading and installing Scala
================================

To build and run the tutorial sample from the command line, you have to install
the Scala distribution. If you prefer to use SBT to build and run the sample
then you can skip this section and jump to the next one.

Scala can be downloaded from http://www.scala-lang.org/downloads. Browse there
and download the Scala 2.9.0 release. If you pick the ``tgz`` or ``zip``
distribution then just unzip it where you want it installed. If you pick the
IzPack Installer then double click on it and follow the instructions.

You also need to make sure that ``scala-2.9.0/bin`` (if that is the
directory where you installed Scala) is on your ``PATH``::

    $ export PATH=$PATH:scala-2.9.0/bin

You can test your installation by invoking ``scala``::

    $ scala -version
    Scala code runner version 2.9.0.final -- Copyright 2002-2011, LAMP/EPFL

Looks like we are all good. Finally let's create a source file ``Pi.scala`` for
the tutorial and put it in the root of the Akka distribution in the ``tutorial``
directory (you have to create it first).

Some tools require you to set the ``SCALA_HOME`` environment variable to the
root of the Scala distribution; however, Akka does not require that.

.. _getting-started-first-scala-download-sbt:


Downloading and installing SBT
==============================

SBT, short for 'Simple Build Tool', is an excellent build system written in
Scala. It uses Scala to write the build scripts, which gives you a lot of
power. It has a plugin architecture with many plugins available, something that
we will take advantage of soon. SBT is the preferred way of building software in
Scala and is probably the easiest way of getting through this tutorial. If you
want to use SBT for this tutorial then follow the instructions below; if not,
you can skip this section and the next.

To install SBT and create a project for this tutorial it is easiest to follow
the instructions on https://github.com/harrah/xsbt/wiki/Setup.

Now we need to create our first Akka project. You could add the dependencies
manually to the build script, but the easiest way is to use Akka's SBT Plugin,
covered in the next section.


Creating an Akka SBT project
============================

If you have not already done so, now is the time to create an SBT project for
our tutorial. You do that by adding the following content to a ``build.sbt`` file
in the directory you want to create your project in::

    name := "My Project"

    version := "1.0"

    scalaVersion := "2.9.1"

    resolvers += "Typesafe Repository" at "http://repo.typesafe.com/typesafe/releases/"

    libraryDependencies += "se.scalablesolutions.akka" % "akka-actor" % "2.0-SNAPSHOT"

Create a directory ``src/main/scala`` in which you will store the Scala source
files.

Not needed in this tutorial, but if you would like to use additional Akka
modules beyond ``akka-actor``, you can add these as ``libraryDependencies`` in
``build.sbt``. Note that there must be a blank line between each. Here is an
example adding ``akka-remote`` and ``akka-stm``::

    libraryDependencies += "se.scalablesolutions.akka" % "akka-actor" % "2.0-SNAPSHOT"

    libraryDependencies += "se.scalablesolutions.akka" % "akka-remote" % "2.0-SNAPSHOT"

    libraryDependencies += "se.scalablesolutions.akka" % "akka-stm" % "2.0-SNAPSHOT"
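
If you prefer a more compact form, several dependencies can also be appended
with a single ``++=`` setting. The following is a sketch equivalent to the three
lines above, assuming an SBT version that reads ``build.sbt`` files as shown in
this section. Since it is one setting, it must not contain blank lines::

    // One setting that appends all three modules at once
    libraryDependencies ++= Seq(
      "se.scalablesolutions.akka" % "akka-actor"  % "2.0-SNAPSHOT",
      "se.scalablesolutions.akka" % "akka-remote" % "2.0-SNAPSHOT",
      "se.scalablesolutions.akka" % "akka-stm"    % "2.0-SNAPSHOT"
    )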

So, now we are all set.

SBT itself needs a whole bunch of dependencies but our project will only need
one: ``akka-actor-2.0-SNAPSHOT.jar``. SBT will download that as well.


Start writing the code
======================

Now it's about time to start hacking.

We start by creating a ``Pi.scala`` file and adding these import statements at
the top of the file:

.. includecode:: code/tutorials/first/Pi.scala#imports

If you are using SBT in this tutorial then create the file in the
``src/main/scala`` directory.

If you are using the command line tools then create the file wherever you
want. I will create it in a directory called ``tutorial`` at the root of the
Akka distribution, e.g. in ``$AKKA_HOME/tutorial/Pi.scala``.


Creating the messages
=====================

The design we are aiming for is to have one ``Master`` actor initiating the
computation, creating a set of ``Worker`` actors. Then it splits up the work
into discrete chunks, and sends these chunks to the different workers in a
round-robin fashion. The master waits until all the workers have completed their
work and sent back results for aggregation. When computation is completed the
master prints out the result, shuts down all workers and then itself.

With this in mind, let's now create the messages that we want to have flowing in
the system. We need three different messages:

- ``Calculate`` -- sent to the ``Master`` actor to start the calculation
- ``Work`` -- sent from the ``Master`` actor to the ``Worker`` actors containing
  the work assignment
- ``Result`` -- sent from the ``Worker`` actors to the ``Master`` actor
  containing the result from the worker's calculation

Messages sent to actors should always be immutable to avoid sharing mutable
state. In Scala we have 'case classes' which make excellent messages. So let's
start by creating three messages as case classes.  We also create a common base
trait for our messages (that we define as being ``sealed`` in order to prevent
creating messages outside our control):

.. includecode:: code/tutorials/first/Pi.scala#messages
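
To see why a ``sealed`` base trait with case classes works well for messages,
consider the following standalone sketch. The field names (``start``,
``nrOfElements``, ``value``) and the ``describe`` helper are assumptions made
for this illustration; the included tutorial source above is authoritative::

    object MessagesSketch {
      sealed trait PiMessage
      case object Calculate extends PiMessage
      case class Work(start: Int, nrOfElements: Int) extends PiMessage
      case class Result(value: Double) extends PiMessage

      // Because PiMessage is sealed, the compiler knows every subtype and
      // can warn if a case is missing from this match.
      def describe(msg: PiMessage): String = msg match {
        case Calculate      => "start the calculation"
        case Work(start, n) => "sum " + n + " terms starting at index " + start
        case Result(value)  => "partial result " + value
      }
    }

Case classes have immutable fields by default and support pattern matching
out of the box, which is what makes them convenient as messages.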


Creating the worker
===================

Now we can create the worker actor. This is done by mixing in the ``Actor``
trait and defining the ``receive`` method. The ``receive`` method defines our
message handler. We expect it to be able to handle the ``Work`` message so we
need to add a handler for this message:

.. includecode:: code/tutorials/first/Pi.scala#worker
   :exclude: calculatePiFor

As you can see we have now created an ``Actor`` with a ``receive`` method as a
handler for the ``Work`` message. In this handler we invoke the
``calculatePiFor(..)`` method, wrap the result in a ``Result`` message and send
it back to the original sender using ``self.reply``. In Akka the sender
reference is implicitly passed along with the message so that the receiver can
always reply or store away the sender reference for future use.

The only thing missing in our ``Worker`` actor is the implementation of the
``calculatePiFor(..)`` method. While there are many ways we can implement this
algorithm in Scala, in this introductory tutorial we have chosen an imperative
style using a for comprehension and an accumulator:

.. includecode:: code/tutorials/first/Pi.scala#calculatePiFor


Creating the master
===================

The master actor is a little bit more involved. In its constructor we need to
create the workers (the ``Worker`` actors) and start them. We will also wrap
them in a load-balancing router to make it easier to spread out the work evenly
between the workers. Let's do that first:

.. includecode:: code/tutorials/first/Pi.scala#create-workers

As you can see we are using the ``actorOf`` factory method to create actors.
This method returns an ``ActorRef``, which is a reference to our newly created
actor.  This method is available in the ``Actor`` object but is usually
imported::

    import akka.actor.Actor.actorOf

There are two versions of ``actorOf``: one of them taking an actor type and the
other one an instance of an actor. The former one (``actorOf[MyActor]``) is used
when the actor class has a no-argument constructor while the second one
(``actorOf(new MyActor(..))``) is used when the actor class has a constructor
that takes arguments. This is the only way to create an instance of an Actor and
the ``actorOf`` method ensures this. The latter version is using call-by-name
and lazily creates the actor within the scope of the ``actorOf`` method. The
``actorOf`` method instantiates the actor and returns, not an instance of the
actor, but an instance of an ``ActorRef``. This reference is the handle through
which you communicate with the actor. It is immutable, serializable and
location-aware meaning that it "remembers" its original actor even if it is sent
to other nodes across the network and can be seen as the equivalent to the
Erlang actor's PID.
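
The call-by-name idea can be shown in plain Scala. The sketch below is not
Akka's implementation; ``Handle`` and ``handleOf`` are hypothetical names that
only mimic the roles of ``ActorRef`` and ``actorOf``. It shows that the
constructor expression is evaluated inside the factory method, and that the
caller only ever receives the handle::

    object ByNameSketch {

      // Hypothetical handle type; the wrapped instance stays private.
      final class Handle[A](private val instance: A) {
        def describe: String = "Handle(" + instance + ")"
      }

      // `create` is a by-name parameter: the expression passed in is not
      // evaluated at the call site, only when it is referenced below.
      def handleOf[A](create: => A): Handle[A] = {
        println("entering handleOf")
        val instance = create          // the expression runs here
        new Handle(instance)
      }
    }

Calling ``ByNameSketch.handleOf { println("constructing"); "worker" }`` prints
``entering handleOf`` before ``constructing``, since the block is only evaluated
once the factory method references it.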

The actor's life-cycle is:

- Created & Started -- ``Actor.actorOf[MyActor]`` -- can receive messages
- Stopped -- ``actorRef.stop()`` -- can **not** receive messages

Once the actor has been stopped it is dead and can not be started again.

Now we have a router that is representing all our workers in a single
abstraction. If you paid attention to the code above, you saw that we were using
the ``nrOfWorkers`` variable. This variable and others we have to pass to the
``Master`` actor in its constructor. So now let's create the master actor. We
have to pass in three integer variables:

- ``nrOfWorkers`` -- defining how many workers we should start up
- ``nrOfMessages`` -- defining how many number chunks to send out to the workers
- ``nrOfElements`` -- defining how big the number chunks sent to each worker should be
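
To make the relationship between these three values concrete, here is a small
sketch in plain Scala that lists which chunk would go to which worker under a
simple round-robin scheme. ``WorkPlan`` and ``plan`` are illustrative names, and
the tuple layout is an assumption for this example, not the tutorial's code::

    object WorkPlan {

      // Returns one (workerIndex, start, nrOfElements) triple per message.
      def plan(nrOfWorkers: Int, nrOfMessages: Int, nrOfElements: Int): Seq[(Int, Int, Int)] =
        for (m <- 0 until nrOfMessages)
          yield (m % nrOfWorkers, m * nrOfElements, nrOfElements)
    }

For example, ``WorkPlan.plan(2, 4, 10)`` yields four chunks of ten elements
each, starting at indices 0, 10, 20 and 30 and alternating between worker 0 and
worker 1.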

Here is the master actor:

.. includecode:: code/tutorials/first/Pi.scala#master
   :exclude: handle-messages

A couple of things are worth explaining further.

First, we are passing in a ``java.util.concurrent.CountDownLatch`` to the
``Master`` actor. This latch is only used for plumbing (in this specific
tutorial), to have a simple way of letting the outside world know when the
master can deliver the result and shut down. In more idiomatic Akka code, as we
will see in part two of this tutorial series, we would not use a latch but other
abstractions and functions like ``Channel``, ``Future`` and ``?`` to achieve the
same thing in a non-blocking way. But for simplicity let's stick to a
``CountDownLatch`` for now.

Second, we are adding a couple of life-cycle callback methods: ``preStart`` and
``postStop``. In the ``preStart`` callback we are recording the time when the
actor is started and in the ``postStop`` callback we are printing out the result
(the approximation of Pi) and the time it took to calculate it. In this call we
also invoke ``latch.countDown`` to tell the outside world that we are done.
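
The latch handshake and the timing can be tried out without any actors. The
following is a minimal sketch using a plain thread in place of the ``Master``
actor; ``LatchSketch`` and ``runAndWait`` are names invented for this example::

    import java.util.concurrent.CountDownLatch

    object LatchSketch {

      // Runs `work` on another thread and blocks the caller until it is done.
      // Returns the result together with the elapsed time in milliseconds.
      def runAndWait(work: () => Double): (Double, Long) = {
        val latch = new CountDownLatch(1)
        var result = 0.0
        var elapsed = 0L

        val thread = new Thread(new Runnable {
          def run(): Unit = {
            val start = System.currentTimeMillis        // role of preStart
            result = work()
            elapsed = System.currentTimeMillis - start  // role of postStop
            latch.countDown()                           // tell the caller we are done
          }
        })
        thread.start()

        // Blocks until countDown() has been called. Writes made before
        // countDown() are visible to the caller once await() returns.
        latch.await()
        (result, elapsed)
      }
    }

The caller blocks in ``await``, which is exactly the kind of blocking that the
non-blocking abstractions mentioned above are meant to avoid.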

But we are not done yet. We are missing the message handler for the ``Master``
actor. This message handler needs to be able to react to two different messages:

- ``Calculate`` -- which should start the calculation
- ``Result`` -- which should aggregate the different results

The ``Calculate`` handler is sending out work to all the ``Worker`` actors and
after doing that it also sends a ``Broadcast(PoisonPill)`` message to the
router, which will send out the ``PoisonPill`` message to all the actors it is
representing (in our case all the ``Worker`` actors). ``PoisonPill`` is a
special kind of message that tells the receiver to shut itself down using the
normal shutdown method, ``self.stop``. We also send a ``PoisonPill`` to the
router itself (since it's also an actor that we want to shut down).

The ``Result`` handler is simpler. Here we get the value from the ``Result``
message and add it to our ``pi`` member variable. We also keep track of
how many results we have received back, and if that matches the number of tasks
sent out, the ``Master`` actor considers itself done and shuts down.

Let's capture this in code:

.. includecode:: code/tutorials/first/Pi.scala#master-receive
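
The bookkeeping done for each ``Result`` can also be looked at in isolation.
The class below is a hypothetical helper written for this explanation and is
not part of the tutorial source. It keeps a running sum and a counter, and
reports the total once the expected number of results has arrived::

    final class ResultAggregator(nrOfMessages: Int) {
      private var pi = 0.0
      private var nrOfResults = 0

      // Adds one partial result. Returns Some(total) when all expected
      // results have been received, otherwise None.
      def add(value: Double): Option[Double] = {
        pi += value
        nrOfResults += 1
        if (nrOfResults == nrOfMessages) Some(pi) else None
      }
    }

Inside an actor, this mutable state is safe because an actor processes one
message at a time; used from several threads directly, the class would need
synchronization.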


Bootstrap the calculation
=========================

Now the only thing that is left to implement is the runner that should bootstrap
and run the calculation for us. We do that by creating an object that we call
``Pi``. Here we can extend the ``App`` trait in Scala, which means that we will
be able to run this as an application directly from the command line.

The ``Pi`` object is a perfect container module for our actors and messages, so
let's put them all there. We also create a method ``calculate`` in which we
start up the ``Master`` actor and wait for it to finish:

.. includecode:: code/tutorials/first/Pi.scala#app
   :exclude: actors-and-messages

That's it. Now we are done.

But before we package it up and run it, let's take a look at the full code now,
with package declaration, imports and all:

.. includecode:: code/tutorials/first/Pi.scala


Run it as a command line application
====================================

If you have not typed in (or copied) the code for the tutorial as
``$AKKA_HOME/tutorial/Pi.scala`` then now is the time. When that's done open up
a shell and change into the Akka distribution directory (``cd $AKKA_HOME``).

First we need to compile the source file. That is done with Scala's compiler
``scalac``. Our application depends on the ``akka-actor-2.0-SNAPSHOT.jar`` JAR
file, so let's add that to the compiler classpath when we compile the source::

    $ scalac -cp lib/akka/akka-actor-2.0-SNAPSHOT.jar tutorial/Pi.scala

When we have compiled the source file we are ready to run the application. This
is done with ``java`` but yet again we need to add the
``akka-actor-2.0-SNAPSHOT.jar`` JAR file to the classpath, and this time we also
need to add the Scala runtime library ``scala-library.jar`` and the classes we
compiled ourselves::

    $ java \
        -cp lib/scala-library.jar:lib/akka/akka-actor-2.0-SNAPSHOT.jar:. \
        akka.tutorial.first.scala.Pi
    AKKA_HOME is defined as [/Users/jboner/tools/akka-actors-2.0-SNAPSHOT]
    loading config from [/Users/jboner/tools/akka-actors-2.0-SNAPSHOT/config/akka.conf].

    Pi estimate:        3.1435501812459323
    Calculation time:   858 millis

Yippee! It is working.

If you have not defined the ``AKKA_HOME`` environment variable then Akka can't
find the ``akka.conf`` configuration file and will print out a ``Can't load
akka.conf`` warning. This is OK since it will then just use the defaults.


Run it inside SBT
=================

If you used SBT, then you can run the application directly inside SBT. First you
need to compile the project::

    $ sbt
    > compile
    ...

When this is done we can run our application directly inside SBT::

    > run
    ...
    Pi estimate:        3.1435501812459323
    Calculation time:   942 millis

Yippee! It is working.

If you have not defined the ``AKKA_HOME`` environment variable then Akka
can't find the ``akka.conf`` configuration file and will print out a ``Can't
load akka.conf`` warning. This is OK since it will then just use the defaults.


Conclusion
==========

We have learned how to create our first Akka project using Akka's actors to
speed up a computation-intensive problem by scaling out on multi-core processors
(also known as scaling up). We have also learned to compile and run an Akka
project using either the tools on the command line or the SBT build system.

If you have a multi-core machine then I encourage you to try out different
numbers of workers (number of working actors) by tweaking the ``nrOfWorkers``
variable to, for example, 2, 4, 6 or 8, to see the performance improvement from
scaling up.

Happy hakking.