Commit graph

67 commits

Author SHA1 Message Date
Patrik Nordwall
8b4e903e7d Detect failure when no heartbeats sent, see #2907
* Subscribe to InstantMemberEvent and start heartbeating when
  InstantMemberUp. Same for metrics.
* HeartbeatNodeRing data structure for bidirectional mapping of
  heartbeat sender and receiver. Not using ConsistentHash anymore.
  Node addresses are hashed to ensure that neighbors are spread out.
* HeartbeatRequest when receiver detects that it has not received
  expected heartbeats.
* New test InitialHeartbeatSpec that simulates the problem
* Add/remove some related conf properties
* Add some more logging to be able to diagnose eventual problems
* Explicit config of nr-of-end-heartbeats
2013-01-18 12:54:09 +01:00
Viktor Klang
adfeb2c1f0 #2879 - updating copyright info 2013-01-09 11:38:00 +01:00
Patrik Nordwall
f147f4d3d2 Stress / long running test of cluster, see #2786
* akka.cluster.StressSpec
* Configurable number of nodes and duration for each step
* Report metrics and phi periodically to see progress
* Configurable payload size
* Test of various join and remove scenarios
* Test of watch
* Exercise supervision
* Report cluster stats
* Test with many actors in tree structure

Apart from the test this commit also solves some issues:

* Avoid adding back members when downed in ClusterHeartbeatSender
* Avoid duplicate close of ClusterReadView
* Add back the publish of AddressTerminated when MemberDowned/Removed
  it was lost in merge of "publish on convergence", see #2779
2013-01-07 14:44:36 +01:00
Björn Antonsson
a03460329d Change cluster MemberEvents to only be published on convergence. See #2692
Conflicts:
	akka-cluster/src/main/scala/akka/cluster/ClusterEvent.scala
	akka-cluster/src/main/scala/akka/cluster/ClusterJmx.scala
	akka-cluster/src/main/scala/akka/cluster/ClusterMetricsCollector.scala
	akka-cluster/src/main/scala/akka/cluster/ClusterReadView.scala
	akka-cluster/src/multi-jvm/scala/akka/cluster/MultiNodeClusterSpec.scala
	akka-docs/rst/cluster/cluster-usage-java.rst
	akka-docs/rst/cluster/cluster-usage-scala.rst
	akka-kernel/src/main/dist/bin/akka-cluster
2012-12-14 12:46:13 +01:00
Patrik Nordwall
503e992d44 Await leader in awaitUpConvergence, see #2752
It's a glitch in how ClusterReadView (used in tests) is updated.
First we do awaitUpConvergence, which checks readView.members and
readView.convergence. Then we do assertLeader, which checks
readView.leader. The problem is that readView.leader might not
have been updated yet (if there already was convergence).
Solution is to await the expected leader in awaitUpConvergence,
so that readView is in a consistent state after awaitUpConvergence.

We will change the semantics in ticket #2692, but this change will
be needed and should work with that as well.
2012-12-04 17:07:08 +01:00
Patrik Nordwall
1914be7069 Merge branch 'master' into wip-2547-metrics-router-patriknw
Conflicts:
	akka-actor/src/main/scala/akka/actor/Deployer.scala
	akka-cluster/src/main/scala/akka/cluster/ClusterMetricsCollector.scala
	akka-cluster/src/test/scala/akka/cluster/MetricsCollectorSpec.scala
2012-11-15 12:33:11 +01:00
Viktor Klang
8f131c680f Switching to immutable.Seq instead of Seq 2012-11-12 14:17:47 +01:00
Patrik Nordwall
c9d206764a ClusterLoadBalancingRouter and refactoring of metrics, see #2547
* MetricsSelector, calculate capacity, weights and allocate weighted
  routee refs
* ClusterLoadBalancingRouterSpec
* Optional heap max
* Constants for the metric fields
* Refactoring of Metric and decay
* Rewrite of DataStreamSpec
* Correction of EWMA and removal of BigInt, BigDecimal
* Separation of MetricsCollector into trait and two classes,
  SigarMetricsCollector and JmxMetricsCollector
* This will reduce cost when sigar is not installed, such as
  avoiding throwing and catching exc for every call
* Improved error handling for loading sigar
* Made MetricsCollector implementation configurable
* Tested with sigar
2012-11-07 20:36:24 +01:00
Björn Antonsson
977194ff8e Improve MultiNodeSpec ifNode syntax. #2126 2012-11-01 15:16:02 +01:00
Patrik Nordwall
b52a082279 Even better assert message in case of failure of assertLeader, see #2641 2012-10-19 17:28:20 +02:00
Patrik Nordwall
2999e9a43b Better assert message in case of failure of assertLeader, see #2641 2012-10-19 17:03:45 +02:00
Roland
0f04239f67 move Duration classes according to scala 2.10 nightly and remove casts to FiniteDuration, see #2504 2012-10-11 15:18:10 -07:00
Patrik Nordwall
8476b2195c Mute expected log messages, see #2010 2012-10-01 20:12:36 +02:00
Patrik Nordwall
88a9123724 Move heartbeat-interval into failure-detector section 2012-09-20 15:23:18 +02:00
Patrik Nordwall
9423d37da9 Merge branch 'master' into wip-cluster-docs-patriknw
Conflicts:
	project/AkkaBuild.scala
2012-09-20 10:40:08 +02:00
Patrik Nordwall
068335789c Cluster config setting to disable jmx, see #2531 2012-09-20 08:09:01 +02:00
Roland
35b7a9e338 second round of FiniteDuration business, including cluster fixes
- make Scheduler only accept FiniteDuration, which has quite some
  knock-on effects
2012-09-18 09:58:30 +02:00
Björn Antonsson
afe30e9038 Removed all dependencies to ScalaTest in the published artifacts. See #1802 2012-09-12 15:12:13 +02:00
Patrik Nordwall
018a949678 Merge branch 'master' into wip-2103-cluster-routers-patriknw
Conflicts:
	akka-cluster/src/multi-jvm/scala/akka/cluster/ClientDowningNodeThatIsUnreachableSpec.scala
	akka-cluster/src/multi-jvm/scala/akka/cluster/ClientDowningNodeThatIsUpSpec.scala
	akka-cluster/src/multi-jvm/scala/akka/cluster/ConvergenceSpec.scala
	akka-cluster/src/multi-jvm/scala/akka/cluster/LeaderDowningNodeThatIsUnreachableSpec.scala
	akka-cluster/src/multi-jvm/scala/akka/cluster/LeaderElectionSpec.scala
	akka-cluster/src/multi-jvm/scala/akka/cluster/MultiNodeClusterSpec.scala
	akka-cluster/src/multi-jvm/scala/akka/cluster/SingletonClusterSpec.scala
	akka-cluster/src/multi-jvm/scala/akka/cluster/SplitBrainSpec.scala
	akka-cluster/src/multi-jvm/scala/akka/cluster/UnreachableNodeRejoinsClusterSpec.scala
	akka-cluster/src/test/scala/akka/cluster/ClusterSpec.scala
	akka-remote/src/main/scala/akka/routing/RemoteRouterConfig.scala
2012-09-11 10:55:47 +02:00
Patrik Nordwall
bd6c39178c Fix leaking this in constructor of Cluster, see #2473
* Major refactoring to remove the need to use special
  Cluster instance for testing. Use default Cluster
  extension instead. Most of it is trivial changes.
* Used failure-detector.implementation-class from config
  to swap to Puppet
* Removed FailureDetectorStrategy, since it doesn't add any value
* Added Cluster.joinSeedNodes to be able to test seedNodes when Addresses
  are unknown before startup time.
* Removed ClusterEnvironment that was passed around among the actors,
  instead they use the ordinary Cluster extension.
* Overall much cleaner design
2012-09-06 21:48:40 +02:00
Patrik Nordwall
f4cc8f8649 Make it more intuitive for tests using real Cluster(system) extension, see #2103
* We will write more tests that rely on real Cluster(system) extension,
  such as ClusterRoundRobinRoutedActorSpec
* When not using FailureDetectorStrategy or overriding seed nodes
  MultiNodeClusterSpec will use the real Cluster(system) extension
  instead of a new Cluster instance with additional test facilities
2012-08-28 16:38:05 +02:00
Patrik Nordwall
417bdc2dfb Prototype of cluster aware routers, see #2103
* Several FIXME that needs to be discussed
* ClusterRouterConfig created via ClusterActorRefProvider
2012-08-28 11:54:52 +02:00
Patrik Nordwall
1700edb863 Merge branch 'master' into wip-2202-cluster-domain-events-patriknw
Conflicts:
	akka-cluster/src/main/scala/akka/cluster/ClusterDaemon.scala
2012-08-16 18:54:10 +02:00
Patrik Nordwall
4e2d7b0495 Move Cluster query methods to ClusterReadView, see #2202
* Better separation of concerns
* Internal API (could be made public if requested)
2012-08-16 18:28:01 +02:00
Patrik Nordwall
06f81f4373 Improve publish of domain events, see #2202
* Gossip is not exposed in user api
* Better and more events
* Snapshot event sent to new subscriber
* Updated tests
* Periodic publish only for internal stats
2012-08-15 16:47:34 +02:00
Patrik Nordwall
d7b0089d7e Support concurrent startup of seed nodes, see #2270
* Implemented the startup sequence of seed nodes as
  described in #2305
* Test that verifies concurrent startup of seed nodes
2012-08-14 15:16:23 +02:00
Viktor Klang
1114da2198 Importing language features used by akka-cluster and akka-remote-tests 2012-07-26 14:47:21 +02:00
Viktor Klang
ac5b5de90a Merging in master, huge work trying to get things to compile, tests not green at this stage 2012-07-06 17:04:04 +02:00
Patrik Nordwall
c708d2ad8a First step in refactoring of cluster internals to actors, see #2311
* Move clustering code to ClusterCore actor
* More will be done, comitting this for early review
2012-07-04 13:52:14 +02:00
Viktor Klang
54a3a44bf8 #2292 - Removing akka.util.Duration etc and replace it with scala.concurrent.util.Duration 2012-06-29 13:33:20 +02:00
Björn Antonsson
db1175e1f3 Bringing UnreachableNodeRejoinsClusterSpec up to speed with master 2012-06-29 13:24:46 +02:00
Björn Antonsson
2ea0bba9e9 Cluster node that is UNREACHABLE and rejoins. see #2160 2012-06-29 13:24:46 +02:00
Patrik Nordwall
932ea6f98a Test split brain scenario, see #2265 2012-06-26 18:25:42 +02:00
Patrik Nordwall
25996bf284 Join seed nodes before becoming singleton cluster, see #2267
* self is initially not member (in gossip state)
* if the join to seed nodes timeout it joins itself, and becomes
  singleton cluster
* remove the special case handling of singelton cluster in gossip
  merge, since singleton cluster is not the normal state when joining
  any more
2012-06-25 21:34:14 +02:00
Patrik Nordwall
42078e7083 Reintroduce 'seed' nodes, see #2219
* Implement the join to seed nodes process
  When a new node is started started it sends a message to all
  seed nodes and then sends join command to the one that answers
  first.
* Configuration of seed-nodes and auto-join
* New JoinSeedNodeSpec that verifies the auto join to seed nodes
* In tests seed nodes are configured by overriding seedNodes
  function, since addresses are not known before start
* Deputy nodes are the live members of the seed nodes (not sure if
  that will be the final solution, see ticket 2252
* Updated cluster.rst with latest info about deputy and seed nodes
2012-06-21 11:05:02 +02:00
Patrik Nordwall
ec9abb12df Merge branch 'master' into wip-2201-cache-node-lookup-patriknw
Conflicts:
	akka-cluster/src/multi-jvm/scala/akka/cluster/ConvergenceSpec.scala
	akka-cluster/src/multi-jvm/scala/akka/cluster/LeaderElectionSpec.scala
	akka-cluster/src/multi-jvm/scala/akka/cluster/MembershipChangeListenerExitingSpec.scala
	akka-cluster/src/multi-jvm/scala/akka/cluster/MembershipChangeListenerJoinSpec.scala
	akka-cluster/src/multi-jvm/scala/akka/cluster/MembershipChangeListenerLeavingSpec.scala
	akka-cluster/src/multi-jvm/scala/akka/cluster/MembershipChangeListenerUpSpec.scala
	akka-cluster/src/multi-jvm/scala/akka/cluster/NodeLeavingAndExitingAndBeingRemovedSpec.scala
	akka-cluster/src/multi-jvm/scala/akka/cluster/TransitionSpec.scala
2012-06-20 09:40:16 +02:00
Björn Antonsson
4a56f195fc Merge branch 'master' into wip-2218-test-conductor-barrier-timeouts 2012-06-19 15:11:50 +02:00
Patrik Nordwall
12e90a98dc Remove address vals in tests, fix race in TransitionSpec, see #2201 2012-06-18 13:24:46 +02:00
Patrik Nordwall
fb04786072 Cleanup based on review comments, see #2201 2012-06-18 11:16:30 +02:00
Patrik Nordwall
aae8869390 Merge branch 'master' into wip-2201-cache-node-lookup-patriknw
Conflicts:
	akka-cluster/src/multi-jvm/scala/akka/cluster/MultiNodeClusterSpec.scala
2012-06-15 17:44:26 +02:00
Patrik Nordwall
f8b7189885 Place the address cache in MultiNodeClusterSpec, see #2201 2012-06-15 17:32:40 +02:00
Patrik Nordwall
404fa4dfa3 Implicit conversion from RoleName to Address
* Improved readability in tests if role name can be used
* Will change other tests also if you like it, otherwise revert
2012-06-15 14:45:41 +02:00
Björn Antonsson
fd42c3d49a Allow barrier timeouts to be shortened and other review fixes 2012-06-15 14:39:47 +02:00
Patrik Nordwall
11c85b84b9 Fail fast in cluster tests if prevous step failed 2012-06-15 13:35:52 +02:00
Björn Antonsson
5714d8327f Make multi node tests use the within() aware barrier 2012-06-13 14:55:33 +02:00
Patrik Nordwall
a7d2be10eb Merge branch 'master' into wip-2214-heartbeats-patriknw
Conflicts:
	akka-cluster/src/main/scala/akka/cluster/AccrualFailureDetector.scala
	akka-cluster/src/main/scala/akka/cluster/Cluster.scala
2012-06-11 22:27:08 +02:00
Patrik Nordwall
e2551494c4 Use Use separate heartbeats for FailureDetector, see #2214
* Send Heartbeat message to all members at regular interval
* Removed the need to gossip to myself
2012-06-11 15:00:44 +02:00
Jonas Bonér
b65cf5c2ec Created FailureDetectorStrategy with two implementations: FailureDetectorPuppetStrategy and AccrualFailureDetectorStrategy.
- Created FailureDetectorStrategy base trait.
- Created FailureDetectorPuppetStrategy.
- Created AccrualFailureDetectorStrategy.
- Created two versions of LeaderDowningNodeThatIsUnreachableMultiJvmSpec
    - LeaderDowningNodeThatIsUnreachableWithFailureDetectorPuppet
    - LeaderDowningNodeThatIsUnreachableWithAccrualFailureDetector
- Added AccrualFailureDetectorStrategy to all the remaining tests - will be split up into two versions shortly.

Signed-off-by: Jonas Bonér <jonas@jonasboner.com>
2012-06-11 14:32:17 +02:00
Patrik Nordwall
56735477b8 initialParticipants default as roles.size in cluster tests 2012-06-08 09:23:36 +02:00
Patrik Nordwall
bc289df018 Unit tests of Cluster, see 2163
* ClusterSpec
- Test gossiping rules for deputies and unreachable
- Fix strange/wrong probabilites for gossip to unreachable and deputy nodes
- Fix lost order of Members when using map (without .toSeq) on the members SortedSet

* MemberSpec
- Test equals, hashCode

* GossipSpec
- Test member merge by status prio
- Fix bug in member merge (groupBy was wrong)
2012-06-07 08:29:07 +02:00