Commit graph

58 commits

Author SHA1 Message Date
Christopher Batey
009214ae07
Update copyright to 2018 (#24241) 2018-01-04 17:26:29 +00:00
Arnout Engelen
b1df13d4d4 Update scalariform (#23778) (#23783) 2017-10-06 10:30:28 +02:00
Konrad `ktoso` Malawski
b568975acc =clu #23229 multi-dc heartbeating, only N nodes perform monitoring 2017-07-07 12:17:41 +01:00
Philippus Baalman
6c7085252a extended copyright into 2017 2017-01-04 17:37:15 +01:00
Patrik Nordwall
1926560e41 stop outbound streams when quarantined, #21407
* they can't be stopped immediately because we want to send
  some final message and we reply to inbound messages with `Quarantined`
* and improve logging
2016-09-21 14:38:13 +02:00
Björn Antonsson
c66ce62d63 Update to a working version of Scalariform 2016-06-02 22:12:36 +02:00
Johannes Rudolph
b6cbc7f13a =all remove unused imports 2016-02-23 20:29:22 +01:00
Johan Andrén
62e30b3c08 Update copyrights and links to the new company name #19851 2016-02-23 12:58:39 +01:00
Prayag Verma
b7783968a0 =pro #19068 All copyrights ranges and single years updated to a range ending in 2016 2016-01-25 10:20:30 +01:00
Konrad Malawski
35108384c9 =clt #19381 silence heartbeat logging in cluster client 2016-01-08 12:11:56 +01:00
Patrik Nordwall
4d64901228 =clu #19274 failure detection of joining/down member status
* Failure detection heartbeating was not performed to joining
  nodes, since it was expected that they will become Up first.
* If a joining node is downed before it is changed to Up failure
  detection will not be performed for that node. That resulted in
  the downed node will not be removed from membership, since the
  unreachability signal is used as confirmation that the node is
  actually stopped before removing it.
2015-12-26 11:30:18 +01:00
Patrik Nordwall
c7c187f6b7 =clu replace Set -- with diff and ++ with union
* better performance according to
  https://docs.google.com/presentation/d/1Qjryxoe-fYEM8ZPhM-98LKfbhnRcn5eAEMNlVVnixsA/pub
2015-11-06 14:48:17 +01:00
Veiga Ortiz, Héctor
c08bc317e2 +clu #13584 Accept joining to be WeaklyUp during network split
* experimental feature, disabled by default
* Adding documentation to mention weakly up members.
  plus adding new diagram.
2015-09-04 12:44:47 +02:00
Patrik Nordwall
737a50ebf3 =clu #17253 Improve cluster startup thread usage
When using a dispatcher (default or separate cluster dispatcher)
with less than 5 threads the Cluster extension initialization
could deadlock.

It was reproducable by adding a sleep before the Await of GetClusterCoreRef
in the Cluster extension constructor. The reason was that other cluster actors were
started too early and they also tried to get the Cluster extension and thereby blocking
dispatcher threads.

Note that the Cluster extension is started via ClusterActorRefProvider before
ActorSystem.apply returns.

The improvement is to start the cluster child actors lazily when the
GetClusterCoreRef is received.
2015-09-03 18:09:31 +02:00
Patrik Nordwall
c1d14f1887 = #17240 Use some more DeadLetterSuppression
* sysmsg.Terminate, sysmsg.DeathWatchNotification, io.Tcp.Closed
  were needed to silence normal usage of http client/server
* other things based on jenkins logs, but not a complete audit

(cherry picked from commit 270e3b2f49af3c34fd5ea4c3bcfd8257402b5cbe)
2015-04-22 21:40:23 +02:00
Julian Tescher
00f6a58e7c Changes all occurances of Typesafe copyright to extend to 2015 2015-03-10 14:12:19 -07:00
Patrik Nordwall
bc87dcad78 =clu #16624 Expand the monitored nodes when unreachable
* Otherwise the leader might stall (cannot remove downed nodes)
  if many nodes are shutdown at the same time and nobody in the
  remaining cluster is monitoring some of the shutdown nodes.

(cherry picked from commit 1354524c4fde6f40499833bdd4c0edd479e6f906)

Conflicts:
	akka-cluster/src/main/scala/akka/cluster/ClusterHeartbeat.scala
	project/AkkaBuild.scala
2015-01-19 13:22:46 +01:00
Patrik Nordwall
a97de7db90 =rem #13960 #13989 #13742 #13985 Optimize EndpointWriter
* Replace stash with internal bufferi, j.u.LinkedList
* Replace FSM with become
* Adaptive backoff, important to backoff, but not for too long,
  depends on environment and use case
* Prioritize heartbeat messages from remote watcher and cluster
  failure detector
* Use payload messages as heartbeats for transport failure detector,
  change transport failure detector to be based on absolute timeout,
  see ticket #13989 and #13742
* Log remote disassociate from transport failure detector,
  see ticket #13985
* Add benchmark sample in akka-sample-remote-scala
2014-04-24 13:39:58 +02:00
dario.rexin
2cbad298d6 =all #3858 Make case classes final 2014-03-07 13:20:01 +01:00
Adam Voss
cce29dfa51 Changes all occurances of Typesafe copyright to extend to 2014. 2014-02-04 21:20:09 -06:00
Patrik Nordwall
a11fb1dafc =act #3572 Add parens to sender
* because it is not referentially transparent; normally we reserved parens for
  side-effecting code but given how people thoughtlessly close over it we revised
  that that decision for sender
* caller can still omit parens
2014-01-17 18:21:14 +01:00
Patrik Nordwall
eaad7ecf7e !clu #3683 Change cluster heartbeat to req/rsp protocol
* The previous one-way hearbeat was elegant, but comlicated to
  understand and without giving much extra value compared to this approach.
* The previous one-way heartbeat have some kind of bug when joining
  several (10-20) nodes at approximately the same time (but not exactly
  the same time) with a false failure detection triggered by the extra heartbeat,
  which would not heal.
* This ping-pong approach will increase network traffic slightly, but heartbeat
  messages are small and each node is limited to monitor (default) 5 peers.
2013-11-15 08:18:52 +01:00
Patrik Nordwall
3bdac872ff =clu #3683 Don't trigger extra heartbeat when not expected sender 2013-10-22 14:48:37 +02:00
Patrik Nordwall
16e0388fe0 #3655 clu Stop sending EndHeartbeat after ack 2013-10-10 11:14:21 +02:00
Patrik Nordwall
2a7d12701b Merge pull request #1726 from akka/wip-3498-optimize-heartbeat-sender-patriknw
=clu #3498 Optimize ClusterHeartbeatSender
2013-09-17 01:35:53 -07:00
Patrik Nordwall
cc66daad48 =clu #3600 Disable Cluster assertInvariants checks by default
* asserts can be enabled by system property 'akka.cluster.assert=on'
2013-09-11 15:11:01 +02:00
Patrik Nordwall
b6c63a8fb3 =clu #3498 Optimize ClusterHeartbeatSender 2013-09-11 14:46:08 +02:00
Patrik Nordwall
dc9fe4f19c !clu #2307 Allow transition from unreachable to reachable
* Replace unreachable Set with Reachability table
* Unreachable members stay in member Set
* Downing a live member was moved it to the unreachable Set,
  and then removed from there by the leader. That will not
  work when flipping back to reachable, so a Down member must
  be detected as unreachable before beeing removed. Similar
  to Exiting. Member shuts down itself if it sees itself as
  Down.
* Flip back to reachable when failure detector monitors it as
  available again
* ReachableMember event
* Can't ignore gossip from aggregated unreachable (see SurviveNetworkInstabilitySpec)
* Make use of ReachableMember event in cluster router
* End heartbeat when acknowledged, EndHeartbeatAck
* Remove nr-of-end-heartbeats from conf
* Full reachability info in JMX cluster status
* Don't use interval after unreachable for AccrualFailureDetector history
* Add QuarantinedEvent to remoting, used for Reachability.Terminated
* Prune reachability table when all reachable
* Update documentation
* Performance testing and optimizations
2013-09-11 13:10:29 +02:00
Patrik Nordwall
196a141976 FIXME in cluster, see #3192 2013-05-28 09:02:03 +02:00
Patrik Nordwall
3736efb79a Merge pull request #1472 from akka/wip-3225-cluster-infolog-patriknw
Config of cluster info logging, see #3225
2013-05-27 00:13:08 -07:00
Patrik Nordwall
18a3b3facf Config of cluster info logging, see #3225 2013-05-23 13:36:35 +02:00
Patrik Nordwall
ee6e80d31a Add previousStatus in MemberRemoved, see #3252 2013-05-23 11:09:32 +02:00
Patrik Nordwall
a0a0f39613 Hardening of cluster member leaving path, see #3309
* Removed leader commands for Shutdown and Exit
* Member shutdown itself  when it sees itself as Exiting
* Singleton cluster with status Exiting will shutdown itself,
  in case the Exiting gossip never arrives
* Exiting member not part convergence check
* Exiting member is removed by leader (on convergence) when the
  exiting member is in the unreachable set, i.e. sucessfully shutdown
* Reverted the change made for #3266, i.e. Exiting is
  detected as unreachable again.
* Adjust ClusterSingletonManager to new Exiting behaviour
* Fix bug in HeartbeatSender, which caused it to continue to
  send heartbeats to removed nodes, instead of rebalancing
* Refactoring of leaderActions method
* Leaving section in docs
2013-05-17 11:39:49 +02:00
Patrik Nordwall
887af975ae Deprecate actorFor in favor of ActorSelection, see #3074
* Deprecate all actorFor methods
* resolveActorRef in provider
* Identify auto receive message
* Support ActorPath in actorSelection
* Support remote actor selections
* Additional tests of actor selection
* Update tests (keep most actorFor tests)
* Update samples to use actorSelection
* Updates to documentation
* Migration guide, including motivation
2013-04-08 18:11:52 +02:00
Viktor Klang
c883705242 #3018 - Enabling -Xlint and dealing with the situation that occurs 2013-03-29 01:43:17 +01:00
Patrik Nordwall
5b844ec1e6 Publish member events when state change first seen, see #3075
* Remove InstantMemberEvent
2013-03-07 14:07:17 +01:00
Patrik Nordwall
9dc124dacd Remove work-around for sending to broken connections, see #2909
* Previous work-around was introduced because Netty blocks when sending
to broken connections. This is supposed to be solved by the non-blocking
new remoting.
* Removed HeartbeatSender and CoreSender in cluster
* Added tests to verify that broken connections don't disturb live connection
2013-01-31 13:41:02 +01:00
Patrik Nordwall
5dc108567d Style change of def starting with if
* When a def starts with if and is not a oneliner the if
  should be on a new line.
* The reason is that it might be easy to miss the if when
  reading the code.
2013-01-18 13:28:49 +01:00
Patrik Nordwall
8b4e903e7d Detect failure when no heartbeats sent, see #2907
* Subscribe to InstantMemberEvent and start heartbeating when
  InstantMemberUp. Same for metrics.
* HeartbeatNodeRing data structure for bidirectional mapping of
  heartbeat sender and receiver. Not using ConsistentHash anymore.
  Node addresses are hashed to ensure that neighbors are spread out.
* HeartbeatRequest when receiver detects that it has not received
  expected heartbeats.
* New test InitialHeartbeatSpec that simulates the problem
* Add/remove some related conf properties
* Add some more logging to be able to diagnose eventual problems
* Explicit config of nr-of-end-heartbeats
2013-01-18 12:54:09 +01:00
Viktor Klang
adfeb2c1f0 #2879 - updating copyright info 2013-01-09 11:38:00 +01:00
Patrik Nordwall
f147f4d3d2 Stress / long running test of cluster, see #2786
* akka.cluster.StressSpec
* Configurable number of nodes and duration for each step
* Report metrics and phi periodically to see progress
* Configurable payload size
* Test of various join and remove scenarios
* Test of watch
* Exercise supervision
* Report cluster stats
* Test with many actors in tree structure

Apart from the test this commit also solves some issues:

* Avoid adding back members when downed in ClusterHeartbeatSender
* Avoid duplicate close of ClusterReadView
* Add back the publish of AddressTerminated when MemberDowned/Removed
  it was lost in merge of "publish on convergence", see #2779
2013-01-07 14:44:36 +01:00
Björn Antonsson
a03460329d Change cluster MemberEvents to only be published on convergence. See #2692
Conflicts:
	akka-cluster/src/main/scala/akka/cluster/ClusterEvent.scala
	akka-cluster/src/main/scala/akka/cluster/ClusterJmx.scala
	akka-cluster/src/main/scala/akka/cluster/ClusterMetricsCollector.scala
	akka-cluster/src/main/scala/akka/cluster/ClusterReadView.scala
	akka-cluster/src/multi-jvm/scala/akka/cluster/MultiNodeClusterSpec.scala
	akka-docs/rst/cluster/cluster-usage-java.rst
	akka-docs/rst/cluster/cluster-usage-scala.rst
	akka-kernel/src/main/dist/bin/akka-cluster
2012-12-14 12:46:13 +01:00
Patrik Nordwall
1914be7069 Merge branch 'master' into wip-2547-metrics-router-patriknw
Conflicts:
	akka-actor/src/main/scala/akka/actor/Deployer.scala
	akka-cluster/src/main/scala/akka/cluster/ClusterMetricsCollector.scala
	akka-cluster/src/test/scala/akka/cluster/MetricsCollectorSpec.scala
2012-11-15 12:33:11 +01:00
Roland
b96d77c15a actually build the samples
and fix the resulting breakage (of not compiling them for some time,
that is)

also remove the last casts to FiniteDuration
2012-10-15 17:17:54 +02:00
Roland
bff79c2f94 Merge remote-tracking branch 'origin/master' into wip-2.10.0-RC1-∂π
- currently cheating: uses zeroMQ artifacts for scala 2.10M7
- fixed a bunch of more wrong references to scala.concurrent.util
2012-10-15 16:18:52 +02:00
Roland
0f04239f67 move Duration classes according to scala 2.10 nightly and remove casts to FiniteDuration, see #2504 2012-10-11 15:18:10 -07:00
Patrik Nordwall
279fd2b6ef Fix bug introduced in refactoring, see #2284 2012-10-10 18:13:08 +02:00
Patrik Nordwall
66c81e915e Move state of ClusterHeartbeatSender to separate immutable class, see #2284 2012-10-10 15:23:18 +02:00
Patrik Nordwall
668d5a5013 Merge branch 'master' into wip-2284-heartbeat-scalability-patriknw
Conflicts:
	akka-cluster/src/main/scala/akka/cluster/ClusterDaemon.scala
2012-10-09 18:11:36 +02:00
Patrik Nordwall
59f8210b85 Incorporate review comments, see #2284 2012-10-09 17:54:54 +02:00