* DaemonMsgCreate is not a system message. We send it over the control
stream because the remote deployment process depends on the ordering of
DaemonMsgCreate and Watch messages. That is all good.
* We also send DaemonMsgCreate over the ordinary message stream (all
outbound lanes) so that the first ordinary message sent to the ref does
not arrive before the actor is created. This is not needed, though, since
the retried resolve in the Decoder will take care of that anyway.
* Inbound lanes were not covered, but that is not needed either.
* Therefore the deduplication of DaemonMsgCreate messages in RemoteSystemDaemon
is not needed.
* Added some more tests for these things.
* describe lanes in reference docs
* Pass HandshakeReq in all inbound lanes, #23527
The HandshakeReq message must be passed in each inbound lane to
ensure that it arrives before any application message. Otherwise there is a risk
that an application message arrives in the InboundHandshake stage before the
handshake is completed and is then dropped, as the sketch below illustrates.
* mima
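A minimal sketch of the invariant (not the actual InboundHandshake stage): each lane only delivers application messages after it has seen the handshake on that very lane, which is why HandshakeReq has to be emitted on every inbound lane rather than on a single one.

```scala
sealed trait InboundEnvelope
case object HandshakeReq extends InboundEnvelope
final case class ApplicationMessage(payload: String) extends InboundEnvelope

// one instance per inbound lane; only messages seen after the handshake are delivered
final class InboundLaneSketch {
  private var handshakeCompleted = false

  def handle(msg: InboundEnvelope): Option[String] = msg match {
    case HandshakeReq =>
      handshakeCompleted = true
      None
    case ApplicationMessage(payload) if handshakeCompleted =>
      Some(payload)  // delivered downstream
    case ApplicationMessage(_) =>
      None           // dropped: arrived on this lane before the handshake
  }
}
```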
When shutting down, we complement the addition of a shutdown hook during startup with its removal (sketched below). Doing so further ensures that no class loader is retained when unloading Akka in an OSGi-style scenario.
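A minimal sketch of that pattern, with illustrative names:

```scala
object ShutdownHookSketch {
  private def cleanup(): Unit = ()  // placeholder for the real shutdown work

  private val hook = new Thread(() => cleanup())

  def install(): Unit =
    Runtime.getRuntime.addShutdownHook(hook)

  // called on programmatic shutdown, e.g. when the bundle is unloaded,
  // so the hook (and with it its class loader) is not retained by the JVM
  def uninstall(): Unit =
    Runtime.getRuntime.removeShutdownHook(hook)
}
```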
* When the Artery stream with PartitionHub is restarted, some lanes can be
removed while it is still processing messages, resulting in an
IndexOutOfBoundsException
* Added the possibility to drop messages in PartitionHub, which is then used by Artery
* Fixed some race conditions in SurviveInboundStreamRestartWithCompressionInFlightSpec
when using inbound-lanes > 1
* The killSwitch in Artery was supposed to be triggered when one lane failed,
but since it used Future.sequence it was never triggered unless it was the
first lane that failed. Changed to Future.firstCompletedOf, as sketched below.
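A hedged sketch of the fix (illustrative names, not the actual Artery wiring): Future.firstCompletedOf completes as soon as any lane terminates, so a failure in any lane, not just the first, can abort the kill switch.

```scala
import akka.Done
import akka.stream.SharedKillSwitch
import scala.concurrent.{ ExecutionContext, Future }

object LaneFailureSketch {
  def abortOnFirstLaneFailure(
      laneCompletions: Seq[Future[Done]],
      killSwitch: SharedKillSwitch)(implicit ec: ExecutionContext): Unit =
    // completes as soon as any lane terminates; if that lane failed, tear down the rest
    Future.firstCompletedOf(laneCompletions).failed.foreach(cause => killSwitch.abort(cause))
}
```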
* Retry creation of ActorSystem in remoting tests #23481
Remoting multi-jvm tests rely on setting port = 0, which selects an open
port. This has a race where two of the JVMs open/close the same port and
then configure their ActorSystem with it, so one of them fails to start
because the port is already in use. This adds a simple retry so that
another port is selected (see the sketch below).
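A minimal sketch of such a retry, with illustrative names rather than the actual test utility; the config is taken by-name so that a fresh port can be probed on each attempt.

```scala
import akka.actor.ActorSystem
import com.typesafe.config.Config
import scala.util.control.NonFatal

object StartupRetrySketch {
  def startWithRetry(name: String, config: => Config, attempts: Int = 3): ActorSystem =
    try ActorSystem(name, config)
    catch {
      // if the probed port was taken in the meantime, try again with a newly built config
      case NonFatal(_) if attempts > 1 => startWithRetry(name, config, attempts - 1)
    }
}
```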
* The scenario described in the issue can cause the quarantine marker to
be lost when creating a new endpoint for that address. Then, when later
creating another endpoint from an inbound connection, the uid is considered
confirmed and the Ack message is accepted, triggering the unexpected-sequence-number
issue.
* The refuseUid was kept in the endpoint policy markers, but that is very
complicated and, as illustrated by this issue, not always safe.
* Instead, keep the refuseUid separately so that it is not lost when registering
a new endpoint (see the sketch below).
* The purpose of WasGated was only to try to keep the refuseUid (as far as I know),
and that is not needed any longer.
mima filter
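An illustrative sketch of that idea, not the actual EndpointManager bookkeeping: the refused uid lives in its own per-address map, so registering or removing endpoints for an address cannot drop the quarantine marker.

```scala
import akka.actor.Address

final class RefuseUidSketch {
  private var refuseUids = Map.empty[Address, Int]

  def markRefused(address: Address, uid: Int): Unit =
    refuseUids = refuseUids.updated(address, uid)

  // consulted before an inbound connection is allowed to confirm a uid for the address
  def refuseUid(address: Address): Option[Int] =
    refuseUids.get(address)
}
```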
* Hand out temporary loopback addresses from a larger pool
Might have prevented https://github.com/akka/akka/issues/23528
* Detect if binding on other loopback addresses works
Works on Linux, not on OSX
* Avoid hard-coding 'localhost' in more places
When the address is ignored and only a port is requested,
find a free port on 'localhost' (both checks are sketched below)
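A sketch of both checks, assuming a plain bind attempt is what decides them (illustrative, not Akka's SocketUtil): whether a non-default loopback address such as 127.20.0.1 can be bound, and probing a free port on 'localhost'.

```scala
import java.net.{ InetAddress, ServerSocket }
import scala.util.Try

object SocketSketch {
  // true if the OS lets us bind a server socket on the given loopback address
  def canBindOn(address: String): Boolean =
    Try(new ServerSocket(0, 50, InetAddress.getByName(address)).close()).isSuccess

  // probe a free port by binding on port 0 and reading back the assigned port
  def temporaryLocalPort(host: String = "localhost"): Int = {
    val socket = new ServerSocket(0, 50, InetAddress.getByName(host))
    try socket.getLocalPort
    finally socket.close()
  }
}
```

Note that probing a port and closing it again is inherently racy, which is exactly the race addressed by the retry for #23481 above.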
* Looked into the alternatives described in the ticket, but
they were complicated, so ended up simply including the
uid of the sending system in the hash for selecting the inbound
lane (see the sketch below). That should be good enough until there is
any real demand for something else.
* This means that different lanes can be used for ActorSelection messages
from different sending systems, which is good in a cluster, but the same lane
will be used for all messages originating from the same system.
* Added possibility to run the benchmarks with ActorSelection
* Added ActorSelection to the send consistency test
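A minimal sketch of the idea (illustrative, not Artery's actual hash function): mixing the sending system's uid into the lane selection spreads traffic from different systems across the inbound lanes, while everything from one system stays on one lane. Since ActorSelection messages from a given system all target the same resolving recipient, the uid is what provides the spread for them.

```scala
object LaneSelectionSketch {
  def selectInboundLane(recipientPath: String, sendingSystemUid: Long, inboundLanes: Int): Int = {
    val hash = 23 * recipientPath.hashCode + java.lang.Long.hashCode(sendingSystemUid)
    math.abs(hash % inboundLanes)  // % keeps the magnitude below inboundLanes, so abs is safe
  }
}
```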
Reproducer (TransportFailSpec):
* watch from first to second node, i.e. sys msg with seq number 1
* trigger transport failure detection to tear down the connection
* the bug was that on the second node the ReliableDeliverySupervisor
was stopped because the send buffer had not been used on that side,
but that removed the receive buffer entry
* later, after the gating period elapsed, another watch from the first to the
second node, i.e. sys msg with seq number 2
* when that watch msg was received on the second node the receive buffer
had been cleared, so it thought that seq number 1 was missing
and sent a nack to the first node
* when the first node received the nack it threw
IllegalStateException: Error encountered while processing system message
acknowledgement buffer: [2 {2}] ack: ACK[2, {1, 0}]
caused by: ResendUnfulfillableException: Unable to fulfill resend request since
negatively acknowledged payload is no longer in buffer
This was fixed by not stopping the ReliableDeliverySupervisor so that the
receive buffer was preserved.
Not necessary for fixing the issue, but the following config settings were adjusted
(illustrated below):
* increased the transport-failure-detector timeout to avoid tearing down the
connection too early
* reduced the quarantine-after-silence to clean up ReliableDeliverySupervisor
actors earlier
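A sketch of the kind of tuning, with illustrative values rather than the exact ones used in TransportFailSpec, and assuming acceptable-heartbeat-pause is the transport-failure-detector timeout referred to above (both are classic remoting settings):

```scala
import com.typesafe.config.ConfigFactory

object TuningSketch {
  // illustrative values only, not the ones used in TransportFailSpec
  val adjustedConfig = ConfigFactory.parseString("""
    # keep the transport failure detector from tearing down the connection too early
    akka.remote.transport-failure-detector.acceptable-heartbeat-pause = 60 s
    # clean up ReliableDeliverySupervisor actors for silent connections sooner
    akka.remote.quarantine-after-silence = 1 h
    """)
}
```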