In the large outbound flow, EnvelopeBuffers acquired by the Encoder must be
returned to the same buffer pool by the AeronSink. Otherwise one of
the following may happen:
* Full GC (System.gc())
* java.lang.OutOfMemoryError: Direct buffer memory
* kernel killing the process (OOM-killer)
see issue #22723
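To make the invariant concrete, here is a minimal sketch of the
acquire/release discipline, with a hypothetical EnvelopeBufferPool (the
real Artery internals differ):

    import java.nio.ByteBuffer
    import java.util.concurrent.ConcurrentLinkedQueue

    // Hypothetical pool illustrating the invariant: every buffer acquired on
    // the encoder side must be handed back to the *same* pool by the sink
    // side, otherwise direct memory keeps being allocated without bound.
    final class EnvelopeBufferPool(bufferSize: Int) {
      private val available = new ConcurrentLinkedQueue[ByteBuffer]()

      def acquire(): ByteBuffer = {
        val buf = available.poll()
        if (buf ne null) { buf.clear(); buf }
        else ByteBuffer.allocateDirect(bufferSize) // grows only while pool is empty
      }

      // AeronSink must call this once the Aeron write has completed
      def release(buf: ByteBuffer): Unit = available.offer(buf)
    }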
* properly shut down ArteryTransport using CoordinatedShutdown, #22671
* The shutdownHook set the hasBeenShutdown flag to true, so when
transport.shutdown was invoked afterwards the shutdown sequence was skipped
until it was too late and the ActorSystem had already terminated
(see the sketch after this list)
* Also improved the cluster shutdown tasks for the case when the cluster
node had not joined
* CoordinatedShutdownLeave explicit events
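The race is the classic one-shot shutdown guard; a minimal sketch of the
corrected pattern (hypothetical names, not the actual ArteryTransport code):

    import java.util.concurrent.atomic.AtomicBoolean
    import scala.concurrent.Future

    // Hypothetical transport: the JVM shutdown hook and an explicit
    // shutdown() call may race, but whichever caller wins runs the full
    // sequence; the flag is flipped atomically as part of running it,
    // never before the work is actually triggered.
    final class Transport {
      private val hasBeenShutdown = new AtomicBoolean(false)

      def shutdown(): Future[Unit] =
        if (hasBeenShutdown.compareAndSet(false, true)) runShutdownSequence()
        else Future.successful(()) // already in progress, nothing to skip

      private def runShutdownSequence(): Future[Unit] =
        Future.successful(()) // placeholder: stop streams, media driver, ...
    }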
* We used the Array based toBinary but the ByteBuffer based fromBinary,
and IntSerializer only uses the same format for both when the byte order
is LITTLE_ENDIAN, which we didn't get from protobuf's asReadOnlyByteBuffer
(see the illustration after this list)
* We can use the Array based methods in DaemonMsgCreateSerializer;
performance is not important there
* Added some more tests in PrimitivesSerializationSpec
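The pitfall is that a fresh ByteBuffer, including the one returned by
protobuf's asReadOnlyByteBuffer, defaults to big-endian, so the order must
be set explicitly before reading multi-byte values:

    import java.nio.{ ByteBuffer, ByteOrder }

    // The same four bytes decode to different ints depending on byte order.
    object ByteOrderDemo extends App {
      val bytes = Array[Byte](1, 0, 0, 0)

      val big = ByteBuffer.wrap(bytes) // default order is BIG_ENDIAN
      println(big.getInt()) // 16777216

      val little = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN)
      println(little.getInt()) // 1
    }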
An alternative way of reporting might be to make the error part of the
DisassociationInfo. That would require changing it or adding another
subclass, which is a non-compatible change, but it could still be
worthwhile to prevent double logging.
* e.g. the JVM shutdown hook should be installed immediately
* noticed that it was initialized from the Artery shutdown
* run-by-jvm-shutdown-hook=off in multi-jvm tests
* CoordinatedShutdown that can run tasks for configured phases in order (DAG)
* coordinate handover/shutdown of singleton with cluster exiting/shutdown
* phase config object with depends-on list (see the sketch after this list)
* integrate graceful leaving of sharding in coordinated shutdown
* add timeout and recover
* add some missing artery ports to tests
* leave via CoordinatedShutdown.run
* optionally exit-jvm in last phase
* run via jvm shutdown hook
* send ExitingConfirmed to the leader before shutdown of the Exiting node,
so we do not have to wait for the failure detector to mark it as
unreachable before removing it
* the unreachable signal is still kept as a safeguard in case the
message is lost or the leader dies
* PhaseClusterExiting vs MemberExited in ClusterSingletonManager
* terminate ActorSystem when cluster shutdown (via Down)
* add more predefined and custom phases
* reference documentation
* migration guide
* problem when the leader order was sys2, sys1, sys3:
then sys3 could not perform its duties and move Leaving sys1 to
Exiting because it was observing sys1 as unreachable
* exclude Leaving with exitingConfirmed from convergence condition
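To illustrate the phase DAG and task registration: depends-on, the per-phase
timeout, and addTask below follow the CoordinatedShutdown config layout and
API; the phase name my-cleanup is made up for the example:

    import akka.Done
    import akka.actor.{ ActorSystem, CoordinatedShutdown }
    import com.typesafe.config.ConfigFactory
    import scala.concurrent.Future

    // A custom phase ("my-cleanup" is a made-up name) wired into the
    // shutdown DAG via depends-on; before-service-unbind is predefined.
    val config = ConfigFactory.parseString(
      """
      akka.coordinated-shutdown.phases {
        my-cleanup {
          depends-on = [before-service-unbind]
          timeout = 5s
        }
      }
      """).withFallback(ConfigFactory.load())

    val system = ActorSystem("example", config)

    // Tasks registered for a phase run when the phase is reached, and the
    // phase completes when all task Futures complete (or the timeout hits).
    CoordinatedShutdown(system).addTask("my-cleanup", "flush-caches") { () =>
      Future.successful(Done)
    }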
* WIP early preview of moving compression ownership to the Decoder
* Compression table created in the transport, but owned by the Decoder
* Added test for restart of inbound stream
* restart snapshot not needed in HeavyHitters since owned by Decoder
Benchmarks revealed that busy spinning directly in the graph stage can
lead to an excessive increase in latency when multiple inbound lanes are
active (i.e. the inbound flow has an asynchronous boundary driving the
multiple lanes).
The new strategy is therefore:
* for inbound-lanes > 1 or idle-cpu-level <= 5: no spinning in the graph stage
* for inbound-lanes = 1 and idle-cpu-level >= 6: a spin count of
50 * settings.Advanced.IdleCpuLevel - 240
which in general means much less spinning, or none at all.
Fixes #21365.
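A compact sketch of that rule (hypothetical helper; IdleCpuLevel ranges
over 1-10 as in the Artery settings):

    // Hypothetical helper mirroring the rule above: no spinning with
    // multiple inbound lanes or a low idle-cpu-level, otherwise a spin
    // count derived from the level.
    def spinCount(inboundLanes: Int, idleCpuLevel: Int): Int =
      if (inboundLanes > 1 || idleCpuLevel <= 5) 0
      else 50 * idleCpuLevel - 240 // levels 6..10 => 60..260 spins

    // e.g. spinCount(1, 10) == 260, spinCount(4, 10) == 0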
It was reported that shared media driver performance can depend on the
kind of file system where the files are located. /dev/shm is an in-memory
file system that was reported to work well with the shared Aeron media driver.
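For example (the aeron-dir setting name follows the Artery reference
configuration of this era; verify it against your Akka version):

    import com.typesafe.config.ConfigFactory

    // Put the Aeron media driver files on an in-memory file system;
    // setting name as in the Artery reference.conf, check your version.
    val config = ConfigFactory.parseString(
      """
      akka.remote.artery.advanced.aeron-dir = "/dev/shm/aeron-mydriver"
      """)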
* to be able to introduce new messages and still support rolling upgrades,
i.e. a cluster of mixed versions
* note that it only catches NotSerializableException, which we already
use for unknown serializer ids and class manifests
* note that it does not catch it for system messages, since that could result
in infinite resending
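A minimal sketch of the guard, assuming a hypothetical deserialize function
(the real inbound pipeline differs):

    import java.io.NotSerializableException

    // Hypothetical inbound handling: a message class unknown to this
    // (older) node is dropped instead of failing the inbound stream, so a
    // cluster of mixed versions keeps working during a rolling upgrade.
    // System messages are not handled this way, since dropping them would
    // lead to infinite resending.
    def deserializeOrDrop(bytes: Array[Byte]): Option[AnyRef] =
      try Some(deserialize(bytes))
      catch {
        case e: NotSerializableException =>
          println(s"dropping undeserializable message: ${e.getMessage}")
          None
      }

    def deserialize(bytes: Array[Byte]): AnyRef = ??? // placeholder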
* speedup ActorCreationPerfSpec
* reduce iterations in ConsistencySpec
* tag SupervisorHierarchySpec as LongRunningTest
* various small speedups and tagging in actor-tests
* speedup expectNoMsg in stream-tests
* tag FramingSpec, and reduce iterations
* speedup QueueSourceSpec
* tag some stream-tests
* reduce iterations in persistence.PerformanceSpec
* reduce iterations in some cluster perf tests
* tag RemoteWatcherSpec
* tag InterpreterStressSpec
* remove LongRunning from ClusterConsistentHashingRouterSpec
* sys property to disable multi-jvm tests in test
* actually disable multi-node tests in validatePullRequest
* doc sbt flags in CONTRIBUTING