* fix NPE in shutdownTransport
* perhaps because shutdown happened before start
* system.dispatcher is used in other places of the shutdown
* improve logging of compression advertisement progress
* adjust RestartFlow.withBackoff parameters
* quarantine after ActorSystemTerminating signal
(will cleanup compressions)
* Quarantine idle associations
* liveness checks by sending an extra HandshakeReq and updating
lastUsed when the reply is received
* conservative default value to survive network partition, in
case no other messages are sent
* Adjust logging and QuarantinedEvent for harmless quarantine
* Harmless if it was via the shutdown signal or cluster leaving
* The ThreadLocal Serialization.currentTransportInformation is used for serializing local
actor refs, but it's also useful when a serializer library, e.g. a custom serializer/deserializer
in Jackson, needs access to the current ActorSystem.
* We set this in a rather ad-hoc way from remoting and in some persistence plugins, but it's only
set for serialization and not deserialization, and it's easy for Persistence plugins or other
libraries to forget this when using Akka serialization directly.
* This change automatically sets the info when using the ordinary serialize and deserialize
methods.
* It's also set when using LocalActorRefProvider, which wasn't always the case previously.
* Keep a cached instance of Serialization.Information in the provider to avoid
creating new instances all the time.
* Added optional Persistence TCK tests to verify that the plugin is setting this
if it's using some custom calls to the serializer.
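
A minimal sketch of the set-and-restore pattern described above, assuming a plain Java ThreadLocal. TransportInfo, TransportInfoScope and withTransportInfo are hypothetical names for illustration and are not the Akka API:

```java
import java.util.function.Supplier;

// Hypothetical stand-in for the transport information; not Akka's Serialization.Information.
final class TransportInfo {
    final String systemName;
    TransportInfo(String systemName) { this.systemName = systemName; }
}

final class TransportInfoScope {
    // Thread-local holding the info for the call currently running on this thread.
    static final ThreadLocal<TransportInfo> CURRENT = new ThreadLocal<>();

    // Runs body with info installed, then restores whatever was there before,
    // so nested calls and pooled threads never observe a stale value.
    static <T> T withTransportInfo(TransportInfo info, Supplier<T> body) {
        TransportInfo previous = CURRENT.get();
        CURRENT.set(info);
        try {
            return body.get();
        } finally {
            if (previous == null) CURRENT.remove(); else CURRENT.set(previous);
        }
    }

    // A serializer-like function that reads the thread-local instead of taking a parameter.
    static String describeCurrentSystem() {
        TransportInfo info = CURRENT.get();
        return info == null ? "<unset>" : info.systemName;
    }
}
```

Wrapping both the serialize and the deserialize entry points in such a scope is what lets callers stop setting the value by hand.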
* Notice that the incarnation has changed in SystemMessageDelivery
and then reset the sequence number
* Take the incarnation number into account in the ClearSystemMessageDelivery
message
* Trigger quarantine earlier in ClusterRemoteWatcher if node with
same host:port joined
* Change quarantine-removed-node-after to 5s, shouldn't be necessary
to delay it 30s
* test reproducer
* fix memory leak in SystemMessageDelivery
* initial set of tests for idle outbound associations, credit to mboogerd
* close inbound compression when quarantined, #23967
* make sure compressions for quarantined are removed in case they are lingering around
* also means that advertise will not be done for quarantined
* remove tombstone in InboundCompressions
* simplify async callbacks by using invokeWithFeedback
* compression for old incarnation, #24400
* it was fixed by the other previous changes
* also confirmed by running the SimpleClusterApp with TCP
as described in the ticket
* test with tcp and tls-tcp transport
* handle the stop signals differently for tcp transport because they
are converted to StreamTcpException
* cancel timers on shutdown
* share the top-level FR for all Association instances
* use linked queue for control and large streams, less memory usage
* remove quarantined idle Association completely after a configured delay
* note that shallow Association instances may still linger in the
heap because of cached references from RemoteActorRef, which may
be cached by LruBoundedCache (used by resolve actor ref).
Those are small, since the queues have been removed, and the cache
is bounded.
* Refactoring to separate the Aeron specific things, ArteryAeronUdpTransport
* move Aeron specific classes to akka.remote.artery.aeron package
* move Version to ArterySettings, and describe strategy for envelope header changes
* DaemonMsgCreate is not a system message. We send it over the control
stream because remote deployment process depends on message ordering
for DaemonMsgCreate and Watch messages. That is all good.
* We also send DaemonMsgCreate over the ordinary message stream (all
outbound lanes) so that the first ordinary message that is sent to
the ref does not arrive before the actor is created. This is not needed,
since the retried resolve in the Decoder will take care of that anyway.
* Inbound lanes were not covered, but not needed.
* Then the deduplication of DaemonMsgCreate messages in RemoteSystemDaemon
is not needed.
* Added some more tests for these things.
* describe lanes in reference docs
* When the artery stream with PartitionHub is restarted, some lanes can be
removed while it is still processing messages, resulting in
IndexOutOfBoundsException
* Added possibility to drop messages in PartitionHub, which is then used by Artery
* Some race conditions in SurviveInboundStreamRestartWithCompressionInFlightSpec
when using inbound-lanes > 1
* The killSwitch in Artery was supposed to be triggered when one lane failed,
but since it used Future.sequence that was never triggered unless it was the
first lane that failed. Changed to firstCompletedOf.
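
The difference between waiting for all lanes and reacting to the first completed lane can be sketched with a rough Java analog, assuming CompletableFuture.allOf and anyOf in place of the Scala Future.sequence and firstCompletedOf. LaneCompletion is a hypothetical helper, and the Java combinators differ in detail from the Scala ones:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

final class LaneCompletion {
    // Completes only after every lane has completed, so a failure in one lane
    // stays invisible while any other lane is still running.
    static CompletableFuture<Void> whenAllLanesDone(List<CompletableFuture<Void>> lanes) {
        return CompletableFuture.allOf(lanes.toArray(new CompletableFuture[0]));
    }

    // Completes as soon as any lane completes, normally or with a failure.
    static CompletableFuture<Object> whenFirstLaneDone(List<CompletableFuture<Void>> lanes) {
        return CompletableFuture.anyOf(lanes.toArray(new CompletableFuture[0]));
    }
}
```

With two lanes where only the second one fails, the future from whenFirstLaneDone is completed exceptionally right away, while the one from whenAllLanesDone stays pending until the first lane also completes. A kill switch should hang off the former.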
* WIP early preview of moving compressions ownership to Decoder
* Compression table created in transport, but owned by Decoder
* Added test for restart of inbound stream
* =art snapshot not needed in HeavyHitters since owned by Decoder
* fix shutdown race in sendControl, #21514
* the stack trace showed IllegalStateException: outboundControlIngress not initialized yet
via the call to sendControl
* that could happen if there is a shutdown at the same time, which is exactly what the test does
* it was actually caused by a merge mistake, but now it got even better
* countDown latch on shutdown
* they can't be stopped immediately because we want to send
some final message and we reply to inbound messages with `Quarantined`
* and improve logging
* comprehensive integration test that revealed many bugs
* confirmations of manifests were wrong, at two places
* using wrong tables when system is restarted, including
originUid in the tables with checks when receiving advertisements
* close (stop scheduling) of advertisements when new incarnation,
quarantine, or restart
* cleanup how deadLetters ref was treated, and made it more robust
* make Decoder tolerant to decompression failures, can happen in
case of system restart before handshake completed
* give up resending advertisement after a few attempts without confirmation,
to avoid keeping outbound association open to possible dead system
* don't advertise new table when no inbound messages,
to avoid keeping outbound association open to possible dead system
* HeaderBuilder could use manifest field from previous message, added
resetMessageFields
* No compression for ArteryMessage, e.g. handshake messages must go
through without depending on compression tables being in sync
* improve debug logging, including originUid
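
The stale manifest problem noted above for HeaderBuilder is typical for a reused mutable builder. A minimal sketch, assuming a hypothetical ReusableHeader class with illustrative fields (not the real HeaderBuilder layout):

```java
// Hypothetical reusable builder; field names are illustrative only.
final class ReusableHeader {
    private String manifest = "";
    private int serializerId = 0;

    void setManifest(String manifest) { this.manifest = manifest; }
    void setSerializerId(int id) { this.serializerId = id; }

    // Clears per-message fields so a message without a manifest
    // cannot inherit the one left behind by the previous message.
    void resetMessageFields() {
        manifest = "";
        serializerId = 0;
    }

    String render() { return serializerId + ":" + manifest; }
}
```

Calling resetMessageFields before filling in each message keeps the header of a message that sets no manifest from carrying the previous one.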
* The previous approach was based on sending the
test commands to the active stages themselves and let
them keep track of the state.
* The problem with that is that Association/OutboundTestStage
that is created afterwards will not have the right state.
Similar problems can occur for restarts.
* Instead, use thread-safe mutable state that is
updated directly and used by all test stages.
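
A minimal sketch of the shared-state approach, assuming a concurrent set of blackholed addresses. SharedTestState and TestStageSketch are hypothetical names, not the classes used by the test stages:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical shared state; updated directly by the test, read by every stage.
final class SharedTestState {
    private final Set<String> blackholed = ConcurrentHashMap.newKeySet();

    void blackhole(String remoteAddress) { blackholed.add(remoteAddress); }
    void passThrough(String remoteAddress) { blackholed.remove(remoteAddress); }
    boolean isBlackholed(String remoteAddress) { return blackholed.contains(remoteAddress); }
}

// Stand-in for a test stage: it keeps no state of its own
// and consults the shared state for every message.
final class TestStageSketch {
    private final SharedTestState state;
    TestStageSketch(SharedTestState state) { this.state = state; }

    boolean shouldDeliver(String remoteAddress) { return !state.isBlackholed(remoteAddress); }
}
```

Because a stage only reads the shared object, a stage created or restarted after the test command was issued sees the same state as the existing ones.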
* system messages in flight should not trigger premature quarantine
in case of longer network partitions, therefore we keep the control
stream alive
* add give-up-system-message-after property that is used by both
SystemMessageDelivery and AeronSink in the control stream
* also unwrap SystemMessageEnvelope in RemoteDeadLetterActorRef
* skip sending control messages after shutdown, can be triggered
by scheduled compression advertisement
* otherwise AeronSink will continue sending outstanding messages
before completing
* this was noticed because RemoteDeathWatchSpec couldn't shut down,
since it was trying to send to unknown
* for parallel serialization/deserialization
* MergeHub for the outbound lanes
* BroadcastHub + filter for the inbound lanes, until we
have a PartitionHub
* simplify materialization of test stage
* add RemoteSendConsistencyWithThreeLanesSpec
* Move artery settings from remoting settings to dedicated class.
* #20587 Move hardcoded settings to configuration file.
* Copy reused settings from remote to the artery
* outbound compression is now immutable, by simply using
CompressionTable[ActorRef] and CompressionTable[String]
* immutable outbound compression will make it possible to use
them from multiple Encoder instances, when we add several lanes
for parallel serialization
* outbound compression tables not shared via AssociationState
* the advertised tables are sent to the Encoder stage via async
callback, no need to reference the tables in other places than
the Encoder stage, no more races via shared mutable state
* when outbound stream is started or restarted it can start out
without compression, until next advertisement is received
* ensure outbound compression is cleared before handshake is signaled complete
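
A minimal sketch of the immutable-table idea, assuming the encoder swaps in a whole new table when an advertisement arrives. ImmutableTable and EncoderSketch are hypothetical and much simpler than CompressionTable and the Encoder stage:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical immutable table; Akka's CompressionTable has a different shape.
final class ImmutableTable<T> {
    final int version;
    private final Map<T, Integer> ids;

    ImmutableTable(int version, Map<T, Integer> ids) {
        this.version = version;
        // Defensive copy, so later changes to the caller's map are not visible here.
        this.ids = Collections.unmodifiableMap(new HashMap<>(ids));
    }

    static <T> ImmutableTable<T> empty() {
        return new ImmutableTable<>(0, Collections.emptyMap());
    }

    // Returns the compressed id, or -1 when the value is not in the table.
    int compress(T value) {
        Integer id = ids.get(value);
        return id == null ? -1 : id;
    }
}

// Stand-in for an Encoder: holds the current table and replaces it wholesale.
final class EncoderSketch {
    // Starts out without compression until a table has been advertised.
    private ImmutableTable<String> table = ImmutableTable.empty();

    // In a stream stage this would run inside the stage's async callback,
    // i.e. on the stage's own thread, so no synchronization is needed here.
    void onAdvertisedTable(ImmutableTable<String> advertised) { table = advertised; }

    int encode(String manifest) { return table.compress(manifest); }
}
```

Since a table never changes after construction, the same instance could be handed to several encoder instances without locking.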