Commit graph

279 commits

Author SHA1 Message Date
Konrad `ktoso` Malawski
dcd8cea32e #21475 moving compressions ownership to Decoder (#22047)
* WIP early preview of moving compressions ownership to Decoder

* Compression table created in transport, but owned by Decoder
Added test for restart of inbound stream

* =art snapshot not needed in HeavyHitters since owned by Decoder
2017-01-13 10:33:55 +01:00
Philippus Baalman
6c7085252a extended copyright into 2017 2017-01-04 17:37:15 +01:00
Johannes Rudolph
af377790b0
=rem #21365 less aggressive busy spinning in AeronSource
Benchmarks revealed that busy spinning directly in the graph stage can
lead to an excessive increase in latency when multiple inbound lanes are
active (i.e. the inbound flow has an asynchronous boundary driving the
multiple lanes).

The new strategy is therefore:

For inbound-lanes > 1 or idle-cpu-level < 5: no spinning in the graph stage
For inbound-lanes = 1 and idle-cpu-level >= 5: 50 * settings.Advanced.IdleCpuLevel - 240

which means in general much less or no spinning at all.

Fixes #21365.
2017-01-02 16:27:52 +01:00
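The spinning rule described in the commit above can be sketched as a small function. This is only an illustration in Python, not the Akka code: the name `spin_count` and its parameters are hypothetical, and it assumes that every idle-cpu-level not covered by the "no spinning" case uses the stated formula.

```python
def spin_count(inbound_lanes: int, idle_cpu_level: int) -> int:
    """Return how many spin iterations the graph stage would perform.

    Hypothetical helper mirroring the rule in the commit message:
    no spinning with multiple inbound lanes or a low idle-cpu-level,
    otherwise 50 * idle_cpu_level - 240.
    """
    if inbound_lanes > 1 or idle_cpu_level < 5:
        return 0  # spinning disabled entirely
    return 50 * idle_cpu_level - 240


if __name__ == "__main__":
    for lanes, level in [(1, 10), (1, 6), (1, 4), (4, 10)]:
        print(lanes, level, spin_count(lanes, level))
```

With a single lane the count grows linearly with the configured level, while any configuration with more than one lane never spins in the stage.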
Johannes Rudolph
e66cb028b0 =rem #21365 enable multiple lanes in MaxThroughputSpec
This needed the other change for each sender to send to all of the target
actors. Otherwise, large batches of messages to the same target actor would
limit the potential of actually doing work in parallel with multiple lanes due
to head-of-line blocking.
2016-12-30 12:52:42 +01:00
Johannes Rudolph
35feef8d01 =rem log results of MaxThroughputSpec and LatencySpec to result file 2016-12-30 12:50:24 +01:00
Johannes Rudolph
2f5f93daa2 =rem #21365 set default directory for shared media driver to /dev/shm
It was reported that shared media driver performance can depend on the
kind of file-system where the files are contained. /dev/shm is an in-memory
filesystem that was reported to work well with the shared aeron media driver.
2016-12-30 12:32:12 +01:00
Johan Andrén
2679be5ae4 Disable serialization warnings in akka test suites #21882 2016-11-23 12:02:36 +01:00
Johan Andrén
8ae0c9a888 Use long uid in artery remoting and cluster #20644 2016-09-26 15:34:59 +02:00
Endre Sándor Varga
9f7389448a Fix AFR file deletion on Windows 2016-09-20 12:38:58 +02:00
Johan Andrén
a939e30b49 Fix artery test file leak #21484
* Include actor system name in artery dir path to ease debugging leaks
* Base class name changed to make actor system autonaming work
* Add shutdown hook directly in transport start
* Wait for completion in shutdown hook (actual leak fix)
2016-09-19 13:22:54 +02:00
Patrik Nordwall
76c23a7880 fix many bugs in InboundCompressions, #21464
* comprehensive integration test that revealed many bugs
* confirmations of manifests were wrong, at two places
* using wrong tables when system is restarted, including
  originUid in the tables with checks when receiving advertisements
* close (stop scheduling) of advertisements when new incarnation,
  quarantine, or restart
* cleanup how deadLetters ref was treated, and made it more robust
* make Decoder tolerant to decompression failures, can happen in
  case of system restart before handshake completed
* give up resending advertisement after a few attempts without confirmation,
  to avoid keeping outbound association open to possible dead system
* don't advertise new table when no inbound messages,
  to avoid keeping outbound association open to possible dead system
* HeaderBuilder could use manifest field from previous message, added
  resetMessageFields
* No compression for ArteryMessage, e.g. handshake messages must go
  through without depending on compression tables being in sync
* improve debug logging, including originUid
2016-09-19 11:37:44 +02:00
Johan Andrén
392ca5ecce Enable flight recorder in tests #21205
* Setting to configure where the flight recorder puts its file
* Run ArteryMultiNodeSpecs with flight recorder enabled
* More cleanup in exit hook, wait for task runner to stop
* Enable flight recorder for the cluster multi node tests
* Enable flight recorder for multi node remoting tests
* Toggle always-dump flight recorder output when akka.remote.artery.always-dump-flight-recorder is set
2016-09-16 15:12:40 +02:00
Patrik Nordwall
d8bb0ef476 Merge pull request #21406 from akka/wip-21371-prio-patriknw
No ack delivery for prio messages, #21371
2016-09-09 15:41:54 +02:00
Patrik Nordwall
7513617070 Merge pull request #21417 from drewhk/wip-20623-cleanup-aeron-files-drewhk
#20623 Make sure external (mapped) resources are properly cleaned on shutdown
2016-09-09 15:23:13 +02:00
Patrik Nordwall
1584c52190 handle longer network partitions, #21399
* system messages in flight should not trigger premature quarantine
  in case of longer network partitions, therefore we keep the control
  stream alive
* add give-up-system-message-after property that is used by both
  SystemMessageDelivery and AeronSink in the control stream
* also unwrap SystemMessageEnvelope in RemoteDeadLetterActorRef
* skip sending control messages after shutdown, can be triggered
  by scheduled compression advertisement
2016-09-09 14:35:50 +02:00
Endre Sándor Varga
0d77034adc #20623 Make sure external (mapped) resources are properly cleaned on shutdown 2016-09-09 14:29:04 +02:00
Patrik Nordwall
ae11fb3b45 Merge pull request #21413 from akka/wip-21339-enable-misc-serial-patriknw
enable misc serializers by default for Artery, #21339
2016-09-09 14:29:02 +02:00
Martynas Mickevičius
1ce7d7d7e9 #20946 Add bind address (#21404) 2016-09-09 12:46:50 +02:00
Patrik Nordwall
97e0628173 enable misc serializers by default for Artery, #21339
* placed them in a new section additional-serialization-bindings,
  which is included by default when Artery is enabled
* can also be enabled with enable-additional-serialization-bindings
  flag to simplify usage with old remoting
* added a JavaSerializable marker trait that is bound to JavaSerializer
  in testkit, this can be used in tests so that we eventually can run
  tests without the java.io.Serializable binding
2016-09-09 09:01:15 +02:00
Patrik Nordwall
3b7a7dfa59 add reason param to quarantine method 2016-09-08 18:00:37 +02:00
Patrik Nordwall
faf941b4c8 support for parallel lanes, #21207
* for parallel serialization/deserialization
* MergeHub for the outbound lanes
* BroadcastHub + filter for the inbound lanes, until we
  have a PartitionHub
* simplify materialization of test stage
* add RemoteSendConsistencyWithThreeLanesSpec
2016-09-05 12:42:33 +02:00
Martynas Mickevičius
292face28a #20587 Clean artery configuration (#21279)
* Move artery settings from remoting settings to dedicated class.
* #20587 Move hardcoded settings to configuration file.
* Copy reused settings from remote to the artery
2016-09-01 08:07:39 +02:00
Patrik Nordwall
8ab02738b7 Merge branch 'master' into wip-sync-artery-dev-2.4.9-patriknw 2016-08-23 20:14:15 +02:00
Endre Sándor Varga
5e830323f6 Updating to ScalaTest 3.0.0 and ScalaCheck 1.13.2 2016-08-22 11:13:49 +02:00
Konrad Malawski
218f81196c =htp multinode latency spec for HTTP (#20964) 2016-07-15 18:53:13 +02:00
Patrik Nordwall
57ca273903 adjust the hit count sampling with the rate 2016-07-07 10:29:09 +02:00
Patrik Nordwall
95a81e41f9 enable compression by default 2016-07-06 23:07:59 +02:00
Patrik Nordwall
c376ac0c53 remove burstiness in latency tests
* throttle generates bursts but for fair latency tests
  we want the messages to be spread uniformly

* not much need for exploratory testing with AeronStreamsApp
  any longer, not worth maintaining

* make it possible to run MaxThroughputSpec with old remoting

* add metrics for the task runner, with flight recorder

* tune idle-cpu-level
2016-07-06 20:53:05 +02:00
Patrik Nordwall
d2657a5969 adaptive sampling of hit counting
* when rate exceeds 1000 msg/s adaptive sampling of the
  heavy hitters tracking is enabled by sampling every 256th message
* also fixed some bugs related to advertise in progress

* update InboundCompression state atomically

* enable compression in LatencySpec
2016-07-05 19:54:53 +02:00
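The adaptive sampling described in the commit above can be sketched as follows. This is a minimal Python illustration, not the Akka implementation: the class `AdaptiveSampler`, its method `should_count`, and the way the current rate is passed in are all hypothetical. Only the thresholds (1000 msg/s, every 256th message) come from the commit message.

```python
class AdaptiveSampler:
    """Decides which messages feed the heavy-hitter counting (hypothetical helper)."""

    def __init__(self, rate_threshold: int = 1000, sample_every: int = 256):
        self.rate_threshold = rate_threshold  # msg/s above which sampling starts
        self.sample_every = sample_every      # count one message out of this many
        self._seen = 0

    def should_count(self, current_rate: float) -> bool:
        """Return True if this message should be counted."""
        self._seen += 1
        if current_rate <= self.rate_threshold:
            return True  # low rate: count every message
        # high rate: count only every `sample_every`-th message
        return self._seen % self.sample_every == 0


if __name__ == "__main__":
    sampler = AdaptiveSampler()
    counted = sum(sampler.should_count(5000) for _ in range(1024))
    print(counted)
```

Sampling trades accuracy of the hit counts for less work per message on the hot path, which matters most when the message rate is high.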
Konrad Malawski
d1015c1dc6 Compression tables properly *used* for Outgoing Compression (#20874)
* =art now correctly compresses and 2 table mode working
* =art AGGRESSIVELY optimising hashing, not convinced about correctness yet
* fix HandshakeShouldDropCompressionTableSpec
2016-07-04 16:48:11 +02:00
Patrik Nordwall
b2089d06a7 new OutboundEnvelope
* instead of the old Send
* optional recipient, removal of dummy
* pool of OutboundEnvelope
2016-07-01 14:06:48 +02:00
Patrik Nordwall
a021eb5ff4 flush messages on shutdown, #20811
* StreamSupervisor as system actor so that it is
  stopped after ordinary actors
* when transport is shut down send flush message to all
  outbound associations (over control stream) and wait for ack
  or timeout
2016-07-01 12:29:05 +02:00
Konrad Malawski
d99274a51f =art #20455 compression tables advertised as a whole "2 tables mode" (#20863)
Squashed commit of the following:

commit 6dc45364eb285338885bc8a5f1c4f293a29a53fb
Author: Konrad Malawski <konrad.malawski@project13.pl>
Date:   Fri Jun 24 19:03:58 2016 +0200

    =art moved successfully to 2 table mode
    envelope format prepared, versioned tables

    2 table mode working

commit 517723c5d61969988a9a93b99666824bf5bccb52
Author: Konrad Malawski <konrad.malawski@project13.pl>
Date:   Fri Jun 24 10:28:02 2016 +0200

    WIP

commit 3e05a733e087e0d5bd8df9cc4fff0d4bc1314ec8
Author: Konrad Malawski <konrad.malawski@project13.pl>
Date:   Wed May 18 02:28:12 2016 +0200

commit b51f1766a94b202cd42fcc9d5402327ad0800d2d
Author: Konrad Malawski <konrad.malawski@project13.pl>
Date:   Thu Apr 28 10:56:58 2016 +0200

    +art #20455 HeavyHitters and CountMinSketch prepared for Compression
2016-07-01 11:54:57 +02:00
Konrad Malawski
7c79b40dea +tes introduce simple way to gather flamegraphs from multinode specs 2016-06-24 13:19:16 +02:00
Patrik Nordwall
b6a94e1758 fix bug in SystemMessageAcker, #20709 (#20792)
* sequence numbers must, of course, be tracked by
  origin system
* add unit test for SystemMessageAcker stage
* enable ArteryRemoteRoundRobinSpec
2016-06-23 16:36:55 +02:00
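The point of the fix above, that sequence numbers have to be tracked per origin system, can be sketched as follows. This is a simplified Python illustration and not the SystemMessageAcker stage itself: the class `PerOriginAcker`, the returned tuples, and the assumption that sequence numbers start at 1 are all hypothetical.

```python
class PerOriginAcker:
    """Tracks the next expected sequence number separately for each origin (hypothetical)."""

    def __init__(self):
        self._expected = {}  # origin uid -> next expected sequence number

    def on_message(self, origin_uid: int, seq_no: int):
        """Return (outcome, highest sequence number accepted so far for this origin)."""
        expected = self._expected.get(origin_uid, 1)  # assumed to start at 1
        if seq_no == expected:
            self._expected[origin_uid] = expected + 1
            return ("deliver", seq_no)
        if seq_no < expected:
            return ("duplicate", expected - 1)  # already seen, acknowledge again
        return ("gap", expected - 1)  # a message is missing, report last accepted


if __name__ == "__main__":
    acker = PerOriginAcker()
    print(acker.on_message(origin_uid=1, seq_no=1))
    print(acker.on_message(origin_uid=2, seq_no=1))
```

With a single shared counter, the first message from the second origin in the usage example would look like a duplicate; keying the state by origin avoids that.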
Konrad Malawski
e818887bb2 +art #20455 HeavyHitters, CountMinSketch => ActorRef Compression
* +art #20455 HeavyHitters and CountMinSketch prepared for Compression

* +art #20455 compression tables and integration specs
2016-06-23 11:58:54 +02:00
Patrik Nordwall
bdfbffcde5 port remaining remote multi-node tests to Artery 2016-06-12 17:17:18 +02:00
Patrik Nordwall
3eceb241e1 make cpu vs latency configurable, #20625
* the actual default values will be measured and tuned later
2016-06-10 16:08:10 +02:00
Patrik Nordwall
5c234940c6 make remote deployment work with Artery, #20715
There were two related problems with remote deployment when
using Artery.

* DaemonMsgCreate is not a SystemMessage, but must be sent over the control stream because
  remote deployment process depends on message ordering for DaemonMsgCreate and Watch messages.
  It must also be sent over the ordinary message stream so that it arrives (and creates the
  destination) before the first ordinary message arrives.
* The first point solves the creation of the remote deployed actor but it's not enough.
  Resolve of the recipient actor ref may still happen before the actor is created. This
  is solved by retrying the resolve for the first message of a remote deployed actor.
2016-06-10 15:15:57 +02:00
Johan Andrén
d6c048f59a A simpler ActorRefProvider config #20649 (#20767)
* Provide shorter aliases for the ActorRefProviders #20649
* Use the new ActorRefProvider aliases throughout code and docs
* Cleaner alias replacement logic
2016-06-10 15:04:13 +02:00
Patrik Nordwall
2e0986254c improve the test somewhat 2016-06-07 18:58:59 +02:00
Endre Sándor Varga
089dd86632 Initial AFR instrumentation 2016-06-07 11:55:24 +02:00
Patrik Nordwall
3465a221f0 format with new Scalariform version
* and fix mima issue
2016-06-03 12:56:49 +02:00
Patrik Nordwall
839ec5f167 Merge branch 'master' into wip-sync-artery-patriknw 2016-06-03 11:09:17 +02:00
Björn Antonsson
c66ce62d63 Update to a working version of Scalariform 2016-06-02 22:12:36 +02:00
Patrik Nordwall
aab46199fd port of some remote multi-node tests 2016-06-02 08:41:11 +02:00
Patrik Nordwall
e3afe6107d configuration of Artery materializer and dispatcher
* also increased the parallelism-max to 4 for default-remote-dispatcher
2016-06-01 11:59:13 +02:00
Patrik Nordwall
8fb7727526 make it possible to use external Aeron media driver, #20588 (#20653)
* make it possible to use external Aeron media driver, #20588

* on my machine the MaxThroughputSpec maxed out all 8 cores completely,
  and when using external media driver it is much better and easier to
  find the actual bottlenecks

* aeron.properties for external media driver
2016-06-01 11:56:18 +02:00
Patrik Nordwall
7505393c89 initiate new handshake after restart of receiving system, #20568
* we don't want to include the full origin address in each message,
  only the UID
* that means that the restarted receiving system can't initate a
  new handshake immediately when it sees message from unknown origin
* instead we inject HandshakeReq from the sending system once in a while
  (1 per second) which will trigger the new handshake
* any messages that arrive before the HandshakeReq are dropped, but
  that is fine since the system was just restarted anyway
* note that the injected handshake is only done for active connections,
  when a message is sent
* also changed the UID to a Long, but there are more places in old remoting
  that must be changed before we actually can use a Long value

fix lost first message, #20566

* the first message was sometimes dropped by the InboundHandshake stage
  because it came from unknown origin, i.e. the handshake had not completed
* that happened because the ordinary message arrived before the
  first HandshakeReq, which may happen since we sent the HandshakeReq
  over the control stream
* this changes so that HandshakeReq is sent over the same stream, not
  only on the control stream and thereby the HandshakeReq will arrive
  before any other message
* always send HandshakeReq as first message
  * also when the handshake on sender side has been completed at startup
  * moved code from preStart to onPull
2016-05-27 17:05:23 +02:00
Patrik Nordwall
c90121485f give up sending after a while, #20317 2016-05-20 13:51:39 +02:00