Commit graph

76 commits

Author SHA1 Message Date
Patrik Nordwall
d5b2aea176 Merge pull request #25035 from piotrromanski/wip-fix-math-abs-usage
Handle a negative value returned by Math.abs()
2018-08-27 16:29:32 +02:00
Kazuhiro Sera
482eaea122 Fix several minor typos detected by github.com/client9/misspell (#25448)
* Fix several minor typos detected by github.com/client9/misspell

* Revert s/erminater/erminator/ in /ActorSystemSpec
2018-08-21 11:02:37 +09:00
Patrik Nordwall
3439377816 Merge pull request #24983 from akka/wip-24972-tcp-restart-patriknw
Quarantine and cleanup idle associations, #24972
2018-05-22 13:11:36 +02:00
Patrik Nordwall
7fc7744049 Quarantine and cleanup idle associations, #24972
* fix NPE in shutdownTransport
  * perhaps because shutdown before started
  * system.dispatcher is used in other places of the shutdown
* improve logging of compression advertisement progress
* adjust RestartFlow.withBackoff parameters
* quarantine after ActorSystemTerminating signal
  (will cleanup compressions)
* Quarantine idle associations
  * liveness checks by sending extra HandshakeReq and update the
    lastUsed when reply received
  * conservative default value to survive network partition, in
    case no other messages are sent
* Adjust logging and QuarantinedEvent for harmless quarantine
  * Harmless if it was via the shutdown signal or cluster leaving
2018-05-22 13:10:30 +02:00
Patrik Nordwall
e6633f17fa Make sure Serialization.currentTransportInformation is always set, #25067
* The ThreadLocal Serialization.currentTransportInformation is used for serializing local
  actor refs, but it's also useful when a serializer library, e.g. a custom serializer/deserializer
  in Jackson, needs access to the current ActorSystem.
* We set this in a rather ad-hoc way from remoting and in some persistence plugins, but it's only
  set for serialization and not deserialization, and it's easy for Persistence plugins or other
  libraries to forget this when using Akka serialization directly.
* This change is automatically setting the info when using the ordinary serialize and deserialize
  methods.
* It's also set when using LocalActorRefProvider, which wasn't always the case previously.
* Keep a cached instance of Serialization.Information in the provider to avoid
  creating new instances all the time.
* Added optional Persistence TCK tests to verify that the plugin is setting this
  if it's using some custom calls to the serializer.
2018-05-21 16:59:04 +02:00
promanski
05282b59c9 Handle a negative value returned by Math.abs() #25034 2018-05-05 13:49:20 +02:00
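The commit message above does not say where the negative value showed up, so the following is only a minimal sketch of the general pitfall: `Math.abs(Integer.MIN_VALUE)` returns `Integer.MIN_VALUE`, which is still negative. The class `LaneIndex` and its methods are hypothetical names for illustration and are not taken from the Akka source; the assumption here is that the value is used to pick an index with a modulo.

```java
// Hypothetical helper, not from the Akka code base: shows why an index
// derived from Math.abs() can be negative and one way to avoid it.
final class LaneIndex {

    // Naive variant: Math.abs(Integer.MIN_VALUE) overflows and stays
    // negative, so the remainder can be negative as well.
    static int naive(int hash, int lanes) {
        return Math.abs(hash) % lanes;
    }

    // Safer variant: Math.floorMod always returns a value in [0, lanes)
    // for a positive number of lanes.
    static int safe(int hash, int lanes) {
        return Math.floorMod(hash, lanes);
    }

    public static void main(String[] args) {
        System.out.println(Math.abs(Integer.MIN_VALUE) < 0);     // true
        System.out.println(naive(Integer.MIN_VALUE, 3));          // a negative index
        System.out.println(safe(Integer.MIN_VALUE, 3));           // a valid index
    }
}
```

Whether the actual fix used `Math.floorMod`, a bit mask, or an explicit check is not stated in the log; the sketch only illustrates the failure mode.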
Patrik Nordwall
43dc381d59
Clear system messages sequence number for restarted node, #24847
* Notice that the incarnation has changed in SystemMessageDelivery
  and then reset the sequence number
* Take the incarnation number into account in the ClearSystemMessageDelivery
  message
* Trigger quarantine earlier in ClusterRemoteWatcher if node with
  same host:port joined
* Change quarantine-removed-node-after to 5s, shouldn't be necessary
  to delay it 30s
* test reproducer
2018-04-10 11:39:55 +02:00
Konrad `ktoso` Malawski
563c7fbcf0 Issue 24594: Integration with sbt-headers and initial header population 2018-03-13 15:45:55 +01:00
Patrik Nordwall
5e80bd97f2 Stop unused Artery outbound streams, #23967
* fix memory leak in SystemMessageDelivery
* initial set of tests for idle outbound associations, credit to mboogerd
* close inbound compression when quarantined, #23967
  * make sure compressions for quarantined are removed in case they are lingering around
  * also means that advertise will not be done for quarantined
  * remove tombstone in InboundCompressions
* simplify async callbacks by using invokeWithFeedback
* compression for old incarnation, #24400
  * it was fixed by the other previous changes
  * also confirmed by running the SimpleClusterApp with TCP
    as described in the ticket
* test with tcp and tls-tcp transport
  * handle the stop signals differently for tcp transport because they
    are converted to StreamTcpException
* cancel timers on shutdown
* share the top-level FR for all Association instances
* use linked queue for control and large streams, less memory usage
* remove quarantined idle Association completely after a configured delay
  * note that shallow Association instances may still be lingering in the
    heap because of cached references from RemoteActorRef, which may
    be cached by LruBoundedCache (used by resolve actor ref).
    Those are small, since the queues have been removed, and the cache
    is bounded.
2018-02-21 11:59:18 +01:00
Patrik Nordwall
0d222906f4 Prepare Artery for alternative TCP transport, #24390
* Refactoring to separate the Aeron specific things, ArteryAeronUdpTransport
* move Aeron specific classes to akka.remote.artery.aeron package
* move Version to ArterySettings, and describe strategy for envelope header changes
2018-02-20 16:02:57 +01:00
Christopher Batey
009214ae07 Update copyright to 2018 (#24241) 2018-01-04 17:26:29 +00:00
Björn Antonsson
cb6a660cf4 Don't create multiple outbound envelopes (#24045) 2017-11-23 09:05:58 +01:00
Patrik Nordwall
6b41c80f9b Simplify Artery remote deployment and make inbound-lanes=4 default, #21422
* DaemonMsgCreate is not a system message. We send it over the control
  stream because remote deployment process depends on message ordering
  for DaemonMsgCreate and Watch messages. That is all good.
* We also send DaemonMsgCreate over the ordinary message stream (all
  outbound lanes) so that the first ordinary message that is sent to
  the ref does not arrive before the actor is created. This is not needed,
  since the retried resolve in the Decoder will take care of that anyway.
* Inbound lanes were not covered, but not needed.
* Then the deduplication of DaemonMsgCreate messages in RemoteSystemDaemon
  is not needed.
* Added some more tests for these things.
* describe lanes in reference docs
2017-11-11 10:30:39 +01:00
Patrik Nordwall
fc75f78468 Harden restart of Artery stream with inbound-lanes > 1, #23561
* When the artery stream with PartitionHub is restarted it can result in that
  some lanes are removed while it is still processing messages, resulting in
  IndexOutOfBoundsException
* Added possibility to drop messages in PartitionHub, which is then used by Artery
* Some race conditions in SurviveInboundStreamRestartWithCompressionInFlightSpec
  when using inbound-lanes > 1
* The killSwitch in Artery was supposed to be triggered when one lane failed,
  but since it used Future.sequence that was never triggered unless it was the
  first lane that failed. Changed to firstCompletedOf.
2017-10-24 14:34:39 +02:00
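The last bullet of the commit above concerns Scala's `Future.sequence` versus `Future.firstCompletedOf`. A rough Java analog, offered only as a sketch, is `CompletableFuture.allOf` versus `CompletableFuture.anyOf`. The semantics are not identical: `allOf` waits for every future, whereas the commit describes `Future.sequence` as reacting only when the first lane failed. The shared point is that a combinator which waits on other lanes can hide a failure in one lane. The class `LaneCompletion` and the two lane futures are hypothetical stand-ins, not Artery code.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for two inbound lanes, each exposing a completion
// signal. Not taken from the Artery source.
final class LaneCompletion {

    // Returns { allOf already done?, anyOf completed exceptionally? }
    // after only the second lane has failed.
    static boolean[] observe() {
        CompletableFuture<Void> lane0 = new CompletableFuture<>();
        CompletableFuture<Void> lane1 = new CompletableFuture<>();

        // Completes only once every lane has completed.
        CompletableFuture<Void> all = CompletableFuture.allOf(lane0, lane1);
        // Completes as soon as any lane completes, normally or with a failure.
        CompletableFuture<Object> first = CompletableFuture.anyOf(lane0, lane1);

        // The second lane fails while the first one keeps running.
        lane1.completeExceptionally(new IllegalStateException("lane 1 failed"));

        return new boolean[] { all.isDone(), first.isCompletedExceptionally() };
    }

    public static void main(String[] args) {
        boolean[] seen = observe();
        System.out.println("allOf done: " + seen[0]);   // false
        System.out.println("anyOf failed: " + seen[1]); // true
    }
}
```

A shutdown trigger attached to the `allOf`-style signal would not fire here, while one attached to the `anyOf`-style signal would, which mirrors the reason given in the commit for switching to `firstCompletedOf`.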
Björn Antonsson
f8b4fb55ca Remove use of deprecated Scala features #22581 2017-03-27 19:05:54 +03:00
Johan Andrén
7a0e5b31f8 Avoid Array.ofDim where possible #22516 2017-03-13 17:49:45 +01:00
Konrad `ktoso` Malawski
dcd8cea32e #21475 moving compressions ownership to Decoder (#22047)
* WIP early preview of moving compressions ownership to Decoder

* Compression table created in transport, but owned by Decoder
Added test for restart of inbound stream

* =art snapshot not needed in HeavyHitters since owned by Decoder
2017-01-13 10:33:55 +01:00
Philippus Baalman
6c7085252a extended copyright into 2017 2017-01-04 17:37:15 +01:00
Patrik Nordwall
74df8226de add/change private visibility 2016-09-29 11:30:34 +02:00
Patrik Nordwall
00c5895e77 config of control stream dispatcher 2016-09-29 08:44:49 +02:00
Patrik Nordwall
1d3920d5db Merge pull request #21561 from akka/wip-sendTerminationHint-patriknw
harden shutdown exception in sendTerminationHint
2016-09-28 11:13:28 +02:00
Patrik Nordwall
a7b8f830d9 Merge pull request #21531 from akka/wip-21401-freeSessionBuffer-patriknw
freeSessionBuffer in AeronSource FragmentAssembler, #21401
2016-09-28 11:00:41 +02:00
Johan Andrén
8ae0c9a888 Use long uid in artery remoting and cluster #20644 2016-09-26 15:34:59 +02:00
Patrik Nordwall
ae860115ac harden shutdown exception in sendTerminationHint 2016-09-26 14:05:16 +02:00
Patrik Nordwall
1408a47e00 freeSessionBuffer in AeronSource FragmentAssembler, #21401 2016-09-23 13:08:02 +02:00
Endre Sándor Varga
1a6661f552 21400: Flush ordinary and control message streams 2016-09-23 11:19:43 +02:00
Patrik Nordwall
455d6a45cc fix shutdown race in sendControl, #21514 (#21517)
* fix shutdown race in sendControl, #21514

* the stack trace showed IllegalStateException: outboundControlIngress not initialized yet
  via the call to sendControl
* that could happen if there is a shutdown at the same time, which is exactly what the test does
* it was actually caused by a merge mistake, but now it got even better

* countDown latch on shutdown
2016-09-22 11:07:17 +02:00
Patrik Nordwall
1926560e41 stop outbound streams when quarantined, #21407
* they can't be stopped immediately because we want to send
  some final message and we reply to inbound messages with `Quarantined`
* and improve logging
2016-09-21 14:38:13 +02:00
Johan Andrén
0370acc121 Fix artery segfaults on termination (#21501) 2016-09-21 13:24:35 +02:00
Patrik Nordwall
76c23a7880 fix many bugs in InboundCompressions, #21464
* comprehensive integration test that revealed many bugs
* confirmations of manifests were wrong, at two places
* using wrong tables when system is restarted, including
  originUid in the tables with checks when receiving advertisements
* close (stop scheduling) of advertisements when new incarnation,
  quarantine, or restart
* cleanup how deadLetters ref was treated, and made it more robust
* make Decoder tolerant to decompression failures, can happen in
  case of system restart before handshake completed
* give up resending advertisement after a few attempts without confirmation,
  to avoid keeping outbound association open to possible dead system
* don't advertise new table when no inbound messages,
  to avoid keeping outbound association open to possible dead system
* HeaderBuilder could use manifest field from previous message, added
  resetMessageFields
* No compression for ArteryMessage, e.g. handshake messages must go
  through without depending on compression tables being in sync
* improve debug logging, including originUid
2016-09-19 11:37:44 +02:00
Patrik Nordwall
acafe80cf1 rewrite TestStage to use thread-safe shared state, #21431
* The previous approach was based on sending the
  test commands to the active stages themselves and let
  them keep track of the state.
* The problem with that is that Association/OutboundTestStage
  that is created afterwards will not have the right state.
  Similar problems can occur for restarts.
* Instead using thread-safe mutable state that is
  updated directly and used by all test stages.
2016-09-12 19:51:05 +02:00
Patrik Nordwall
1584c52190 handle longer network partitions, #21399
* system messages in flight should not trigger premature quarantine
  in case of longer network partitions, therefore we keep the control
  stream alive
* add give-up-system-message-after property that is used by both
  SystemMessageDelivery and AeronSink in the control stream
* also unwrap SystemMessageEnvelope in RemoteDeadLetterActorRef
* skip sending control messages after shutdown, can be triggered
  by scheduled compression advertisement
2016-09-09 14:35:50 +02:00
Patrik Nordwall
cd4a31e74d No ack delivery for prio messages, #21371
* and send prio messages enclosed in actor selection
  over the control stream
2016-09-09 14:35:50 +02:00
Patrik Nordwall
494ccc00dc add recover in front of MergeHub, to avoid logging, #21397 2016-09-08 19:34:18 +02:00
Patrik Nordwall
74a8bb3a00 flight recorder event for send queue overflow 2016-09-08 18:00:37 +02:00
Patrik Nordwall
8756ffd75c handle Aeron Publication.CLOSED 2016-09-08 18:00:37 +02:00
Patrik Nordwall
3c779cebd4 config of send queues 2016-09-08 18:00:37 +02:00
Patrik Nordwall
ebd1883df5 remove or reword obsolete fixme 2016-09-08 18:00:37 +02:00
Patrik Nordwall
85be571af7 Merge pull request #21376 from akka/wip-21347-restart-patriknw
fix glitch in lazy restart, #21347
2016-09-07 11:15:51 +02:00
Patrik Nordwall
f1e4e7a657 Merge pull request #21383 from akka/wip-21381-killSwitch-patriknw
add missing killSwitch for parallel outbound lanes, #21381
2016-09-07 11:15:01 +02:00
Patrik Nordwall
9fd359042a add missing killSwitch for parallel outbound lanes, #21381
* it caused the shutdown to stall, since the part after MergeHub
  was never stopped
* tear down parts upstream and downstream of the hub together
2016-09-07 09:10:30 +02:00
Patrik Nordwall
edf1c83839 abort streams on shutdown, #21388
* otherwise AeronSink will continue sending outstanding messages
  before completing
* this was noticed because RemoteDeathWatchSpec couldn't shut down,
  since it was trying to send to unknown
2016-09-07 08:27:33 +02:00
Patrik Nordwall
294947a9a2 fix glitch in lazy restart, #21347 2016-09-06 15:57:12 +02:00
Johan Andrén
9287a28702 Artery transport shutdown improvements (#21357)
* Make sure streams have stopped before shutting down aeron etc
* Log completion failures rather than failing shutdown
2016-09-06 11:50:10 +02:00
Patrik Nordwall
9d89810674 make restart materialization of outbound streams lazy, #21347
* Materialize on first message instead, otherwise handshake attempts
  to non-existing nodes will continue forever

* also fix HandshakeFailureSpec
2016-09-05 13:27:18 +02:00
Patrik Nordwall
432086b3f4 improve deadLetters and logging when send queue overflow (#21355) 2016-09-05 12:42:46 +02:00
Patrik Nordwall
faf941b4c8 support for parallel lanes, #21207
* for parallel serialization/deserialization
* MergeHub for the outbound lanes
* BroadcastHub + filter for the inbound lanes, until we
  have a PartitionHub
* simplify materialization of test stage
* add RemoteSendConsistencyWithThreeLanesSpec
2016-09-05 12:42:33 +02:00
Martynas Mickevičius
292face28a #20587 Clean artery configuration (#21279)
* Move artery settings from remoting settings to dedicated class.
* #20587 Move hardcoded settings to configuration file.
* Copy reused settings from remote to the artery
2016-09-01 08:07:39 +02:00
Patrik Nordwall
0c0e3c5efd Refactoring of outbound compression, #21210
* outbound compression is now immutable, by simply using
  CompressionTable[ActorRef] and CompressionTable[String]
* immutable outbound compression will make it possible to use
  them from multiple Encoder instances, when we add several lanes
  for parallel serialization
* outbound compression tables not shared via AssociationState
* the advertised tables are sent to the Encoder stage via async
  callback, no need to reference the tables in other places than
  the Encoder stage, no more races via shared mutable state
* when outbound stream is started or restarted it can start out
  without compression, until next advertisement is received
* ensure outbound compression is cleared before handshake is signaled complete
2016-08-26 15:21:03 +02:00
Johan Andrén
af5eb4c6bf WIP separate prio artery channel (#21278)
* First incorrect stab - separate prio channel

* Send prio messages over the control stream
2016-08-26 14:44:33 +02:00