Commit graph

56 commits

Author SHA1 Message Date
Patrik Nordwall
1d3920d5db Merge pull request #21561 from akka/wip-sendTerminationHint-patriknw
harden shutdown exception in sendTerminationHint
2016-09-28 11:13:28 +02:00
Patrik Nordwall
a7b8f830d9 Merge pull request #21531 from akka/wip-21401-freeSessionBuffer-patriknw
freeSessionBuffer in AeronSource FragmentAssembler, #21401
2016-09-28 11:00:41 +02:00
Johan Andrén
8ae0c9a888 Use long uid in artery remoting and cluster #20644 2016-09-26 15:34:59 +02:00
Patrik Nordwall
ae860115ac harden shutdown exception in sendTerminationHint 2016-09-26 14:05:16 +02:00
Patrik Nordwall
1408a47e00 freeSessionBuffer in AeronSource FragmentAssembler, #21401 2016-09-23 13:08:02 +02:00
Endre Sándor Varga
1a6661f552 21400: Flush ordinary and control message streams 2016-09-23 11:19:43 +02:00
Patrik Nordwall
455d6a45cc fix shutdown race in sendControl, #21514 (#21517)
* fix shutdown race in sendControl, #21514

* the stack trace showed IllegalStateException: outboundControlIngress not initialized yet
  via the call to sendControl
* that could happen if there is a shutdown at the same time, which is exactly what the test does
* it was actually caused by a merge mistake, but now it got even better

* countDown latch on shutdown
2016-09-22 11:07:17 +02:00
Patrik Nordwall
1926560e41 stop outbound streams when quarantined, #21407
* they can't be stopped immediately because we want to send
  some final message and we reply to inbound messages with `Quarantined`
* and improve logging
2016-09-21 14:38:13 +02:00
Johan Andrén
0370acc121 Fix artery segfaults on termination (#21501) 2016-09-21 13:24:35 +02:00
Patrik Nordwall
76c23a7880 fix many bugs in InboundCompressions, #21464
* comprehensive integration test that revealed many bugs
* confirmations of manifests were wrong, at two places
* using wrong tables when system is restarted, including
  originUid in the tables with checks when receiving advertisements
* close (stop scheduling) of advertisements when new incarnation,
  quarantine, or restart
* cleanup how deadLetters ref was treated, and made it more robust
* make Decoder tolerant to decompression failures, can happen in
  case of system restart before handshake completed
* give up resending advertisement after a few attempts without confirmation,
  to avoid keeping outbound association open to possible dead system
* don't advertise new table when no inbound messages,
  to avoid keeping outbound association open to possible dead system
* HeaderBuilder could use manifest field from previous message, added
  resetMessageFields
* No compression for ArteryMessage, e.g. handshake messages must go
  through without depending on compression tables being in sync
* improve debug logging, including originUid
2016-09-19 11:37:44 +02:00
Patrik Nordwall
acafe80cf1 rewrite TestStage to use thread-safe shared state, #21431
* The previous approach was based on sending the
  test commands to the active stages themselves and let
  them keep track of the state.
* The problem with that is that Association/OutboundTestStage
  that is created afterwards will not have the right state.
  Similar problems can occur for restarts.
* Instead using thread-safe mutable state that is
  updated directly and used by all test stages.
2016-09-12 19:51:05 +02:00
Patrik Nordwall
1584c52190 handle longer network partitions, #21399
* system messages in flight should not trigger premature quarantine
  in case of longer network partitions, therefore we keep the control
  stream alive
* add give-up-system-message-after property that is used by both
  SystemMessageDelivery and AeronSink in the control stream
* also unwrap SystemMessageEnvelope in RemoteDeadLetterActorRef
* skip sending control messages after shutdown, can be triggered
  by scheduled compression advertisement
2016-09-09 14:35:50 +02:00
Patrik Nordwall
cd4a31e74d No ack delivery for prio messages, #21371
* and send prio messages enclosed in actor selection
  over the control stream
2016-09-09 14:35:50 +02:00
Patrik Nordwall
494ccc00dc add recover in front of MergeHub, to avoid logging, #21397 2016-09-08 19:34:18 +02:00
Patrik Nordwall
74a8bb3a00 flight recorder event for send queue overflow 2016-09-08 18:00:37 +02:00
Patrik Nordwall
8756ffd75c handle Aeron Publication.CLOSED 2016-09-08 18:00:37 +02:00
Patrik Nordwall
3c779cebd4 config of send queues 2016-09-08 18:00:37 +02:00
Patrik Nordwall
ebd1883df5 remove or reword obsolete fixme 2016-09-08 18:00:37 +02:00
Patrik Nordwall
85be571af7 Merge pull request #21376 from akka/wip-21347-restart-patriknw
fix glitch in lazy restart, #21347
2016-09-07 11:15:51 +02:00
Patrik Nordwall
f1e4e7a657 Merge pull request #21383 from akka/wip-21381-killSwitch-patriknw
add missing killSwitch for parallel outbound lanes, #21381
2016-09-07 11:15:01 +02:00
Patrik Nordwall
9fd359042a add missing killSwitch for parallel outbound lanes, #21381
* it caused the shutdown to stall, since the part after MergeHub
  was never stopped
* tear down parts upstream and downstream of the hub together
2016-09-07 09:10:30 +02:00
Patrik Nordwall
edf1c83839 abort streams on shutdown, #21388
* otherwise AeronSink will continue sending outstanding messages
  before completing
* this was noticed because RemoteDeathWatchSpec couldn't shut down,
  since it was trying to send to unknown
2016-09-07 08:27:33 +02:00
Patrik Nordwall
294947a9a2 fix glitch in lazy restart, #21347 2016-09-06 15:57:12 +02:00
Johan Andrén
9287a28702 Artery transport shutdown improvements (#21357)
* Make sure streams have stopped before shutting down aeron etc
* Log completion failures rather than failing shutdown
2016-09-06 11:50:10 +02:00
Patrik Nordwall
9d89810674 make restart materialization of outbound streams lazy, #21347
* Materialize on first message instead, otherwise handshake attempts
  to non-existing nodes will continue forever

* also fix HandshakeFailureSpec
2016-09-05 13:27:18 +02:00
Patrik Nordwall
432086b3f4 improve deadLetters and logging when send queue overflow (#21355) 2016-09-05 12:42:46 +02:00
Patrik Nordwall
faf941b4c8 support for parallel lanes, #21207
* for parallel serialization/deserialization
* MergeHub for the outbound lanes
* BroadcastHub + filter for the inbound lanes, until we
  have a PartitionHub
* simplify materialization of test stage
* add RemoteSendConsistencyWithThreeLanesSpec
2016-09-05 12:42:33 +02:00
Martynas Mickevičius
292face28a #20587 Clean artery configuration (#21279)
* Move artery settings from remoting settings to dedicated class.
* #20587 Move hardcoded settings to configuration file.
* Copy reused settings from remote to the artery
2016-09-01 08:07:39 +02:00
Patrik Nordwall
0c0e3c5efd Refactoring of outbound compression, #21210
* outbound compression is now immutable, by simply using
  CompressionTable[ActorRef] and CompressionTable[String]
* immutable outbound compression will make it possible to use
  them from multiple Encoder instances, when we add several lanes
  for parallel serialization
* outbound compression tables not shared via AssociationState
* the advertised tables are sent to the Encoder stage via async
  callback, no need to reference the tables in other places than
  the Encoder stage, no more races via shared mutable state
* when outbound stream is started or restarted it can start out
  without compression, until next advertisement is received
* ensure outbound compression is cleared before handshake is signaled complete
2016-08-26 15:21:03 +02:00
Johan Andrén
af5eb4c6bf WIP separate prio artery channel (#21278)
* First incorrect stab - separate prio channel

* Send prio messages over the control stream
2016-08-26 14:44:33 +02:00
Patrik Nordwall
21a4899054 use the new WildcardIndex 2016-08-23 20:38:39 +02:00
Patrik Nordwall
5e90d4db40 =art place OutboundTestStage after SystemMessageDelivery stage (#20899)
* failing test was akka.cluster.AttemptSysMsgRedelivery when
  running with Artery
* we rely on system messages not being dropped before
  the redelivery stage, i.e. blackhole must be after that
2016-07-08 01:00:41 +02:00
Konrad Malawski
d1015c1dc6 Compression tables properly *used* for Outgoing Compression (#20874)
* =art now correctly compresses and 2 table mode working
* =art AGGRESSIVELY optimising hashing, not convinced about correctness yet
* fix HandshakeShouldDropCompressionTableSpec
2016-07-04 16:48:11 +02:00
Patrik Nordwall
b2089d06a7 new OutboundEnvelope
* instead of the old Send
* optional recipient, removal of dummy
* pool of OutboundEnvelope
2016-07-01 14:06:48 +02:00
Patrik Nordwall
a021eb5ff4 flush messages on shutdown, #20811
* StreamSupervisor as system actor so that it is
  stopped after ordinary actors
* when transport is shutdown send flush message to all
  outbound associations (over control stream) and wait for ack
  or timeout
2016-07-01 12:29:05 +02:00
Konrad Malawski
d99274a51f =art #20455 compression tables advertised as a whole "2 tables mode" (#20863)
Squashed commit of the following:

commit 6dc45364eb285338885bc8a5f1c4f293a29a53fb
Author: Konrad Malawski <konrad.malawski@project13.pl>
Date:   Fri Jun 24 19:03:58 2016 +0200

    =art moved successfully to 2 table mode
    envelope format prepared, versioned tables

    2 table mode working

commit 517723c5d61969988a9a93b99666824bf5bccb52
Author: Konrad Malawski <konrad.malawski@project13.pl>
Date:   Fri Jun 24 10:28:02 2016 +0200

    WIP

commit 3e05a733e087e0d5bd8df9cc4fff0d4bc1314ec8
Author: Konrad Malawski <konrad.malawski@project13.pl>
Date:   Wed May 18 02:28:12 2016 +0200

commit b51f1766a94b202cd42fcc9d5402327ad0800d2d
Author: Konrad Malawski <konrad.malawski@project13.pl>
Date:   Thu Apr 28 10:56:58 2016 +0200

    +art #20455 HeavyHitters and CountMinSketch prepared for Compression
2016-07-01 11:54:57 +02:00
Konrad Malawski
e818887bb2 +art #20455 HeavyHitters, CountMinSketch => ActorRef Compression
* +art #20455 HeavyHitters and CountMinSketch prepared for Compression

* +art #20455 compression tables and integration specs
2016-06-23 11:58:54 +02:00
Patrik Nordwall
5c234940c6 make remote deployment work with Artery, #20715
There were two related problems with remote deployment when
using Artery.

* DaemonMsgCreate is not a SystemMessage, but must be sent over the control stream because
  remote deployment process depends on message ordering for DaemonMsgCreate and Watch messages.
  It must also be sent over the ordinary message stream so that it arrives (and creates the
  destination) before the first ordinary message arrives.
* The first point solves the creation of the remote deployed actor but it's not enough.
  Resolve of the recipient actor ref may still happen before the actor is created. This
  is solved by retrying the resolve for the first message of a remote deployed actor.
2016-06-10 15:15:57 +02:00
Patrik Nordwall
7ce6dffabf send dropped system messages to deadLetters
* publish remote lifecycle event for quarantined
2016-06-10 13:21:17 +02:00
Patrik Nordwall
7a1a316e8a reduce allocations with specialized ImmutableLongMap (#20750)
* reduce allocations with specialized ImmutableLongMap

* backed by arrays, allocation free lookups with binary search
* use it for UID -> Association Map
* pass Association in InboundEnvelope to reduce to only
  one lookup per incoming message
* use ImmutableLongMap instead of the QuarantinedUIDSet
2016-06-10 13:04:23 +02:00
Patrik Nordwall
a814034342 Option value class, to avoid allocations for optional sender 2016-06-07 18:58:59 +02:00
Patrik Nordwall
c808522f6d optimize access to association UniqueAddress 2016-06-07 18:58:58 +02:00
Patrik Nordwall
ea231b1cbc test support for blackhole in Artery, #20589 2016-06-07 15:47:12 +02:00
Patrik Nordwall
d236b8e152 new queue Source for remote sends
* new SendQueue Source based on agrona ManyToOneConcurrentArrayQueue
* JMH benchmark for send queue
* JMH benchmark for Source.queue, Source.actorRef and the new SendQueue
* inject the queue so that we can start sending to it before materialization
* Get rid of computeIfAbsent in the AssociationRegistry
  by making it possible to send (enqueue) messages to the
  Association instance immediately after construction.
2016-06-03 17:23:19 +02:00
Patrik Nordwall
3465a221f0 format with new Scalariform version
* and fix mima issue
2016-06-03 12:56:49 +02:00
Patrik Nordwall
7505393c89 initiate new handshake after restart of receiving system, #20568
* we don't want to include the full origin address in each message,
  only the UID
* that means that the restarted receiving system can't initiate a
  new handshake immediately when it sees message from unknown origin
* instead we inject HandshakeReq from the sending system once in a while
  (1 per second) which will trigger the new handshake
* any messages that arrive before the HandshakeReq are dropped, but
  that is fine since the system was just restarted anyway
* note that the injected handshake is only done for active connections,
  when a message is sent
* also changed the UID to a Long, but there are more places in old remoting
  that must be changed before we actually can use a Long value

fix lost first message, #20566

* the first message was sometimes dropped by the InboundHandshake stage
  because it came from unknown origin, i.e. the handshake had not completed
* that happened because the ordinary message arrived before the
  first HandshakeReq, which may happen since we sent the HandshakeReq
  over the control stream
* this changes so that HandshakeReq is sent over the same stream, not
  only on the control stream and thereby the HandshakeReq will arrive
  before any other message
* always send HandshakeReq as first message
  * also when the handshake on sender side has been completed at startup
  * moved code from preStart to onPull
2016-05-27 17:05:23 +02:00
Patrik Nordwall
5b7c978844 add JMH benchmark for encoder decoder stage
* CodecBenchmark that tests encode, decode and combined
  encode + decode
* refactoring of codec stages to make it possible to
  run them without real ArteryTransport
* also fixed a bug in inbound stream for large messages,
  it was using wrong envelope pool
2016-05-27 12:21:30 +02:00
Patrik Nordwall
e9e65c463f improve restart logging 2016-05-20 13:51:39 +02:00
Patrik Nordwall
c90121485f give up sending after a while, #20317 2016-05-20 13:51:39 +02:00
Johan Andrén
cd71643a91 [WIP] Large message stream for Artery (#20545)
* First stab at separate large message channel for Artery

* Full actor paths, no implicit "/user/" part

* Various small fixes after review

* Fixes to make it work after rebasing

* Use a separate EnvelopeBufferPool for the large message stream

* Docs for actorSelection not sending through large message stream
2016-05-20 12:40:56 +02:00