361 commits

Author SHA1 Message Date
Patrik Nordwall
2476831705 Rename event-handlers to loggers, see #2979
* Rename config akka.event-handlers to akka.loggers
* Rename config akka.event-handler-startup-timeout to
  akka.logger-startup-timeout
* Rename JulEventHandler to JavaLogger
* Rename Slf4jEventHandler to Slf4jLogger
* Change all places in tests and docs
* Deprecation, old still works, but with warnings
* Migration guide
* Test for the deprecated event-handler config
2013-02-05 11:19:02 +01:00
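
As an illustration of the rename described in this commit, a configuration migration might look like the fragment below. This is a sketch, not text from the migration guide: the `akka.event.slf4j` package prefix and the `5s` timeout value are assumptions added for the example.

```
# Before: deprecated keys, still accepted but with warnings
akka {
  event-handlers = ["akka.event.slf4j.Slf4jEventHandler"]
  event-handler-startup-timeout = 5s   # illustrative value
}

# After: renamed keys and logger class
akka {
  loggers = ["akka.event.slf4j.Slf4jLogger"]
  logger-startup-timeout = 5s          # illustrative value
}
```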
Patrik Nordwall
157a25bcde Failure detector refactoring, see #2690
* Failure detector was previously copied with refactoring to
  akka-remote and this refactoring makes use of that and removes
  the failure detector in akka-cluster
* Adjustments to reference.conf
* Refactoring of FailureDetectorPuppet
2013-02-01 10:08:39 +01:00
Endre Sándor Varga
e0a9dd70ba Dead letters containing remote envelopes handled correctly #2959
- New DeadLetter class for handling remoting specific envelopes
 - Fixed error handling of name lookups
 - Name lookup is now handled via futures (future refactor opportunity)
2013-01-29 11:31:53 +01:00
Endre Sándor Varga
99adbdfab4 Changed and documented new remoting configuration #2593 2013-01-24 12:35:05 +01:00
Patrik Nordwall
5dc108567d Style change of def starting with if
* When a def starts with if and is not a oneliner the if
  should be on a new line.
* The reason is that it might be easy to miss the if when
  reading the code.
2013-01-18 13:28:49 +01:00
Patrik Nordwall
bdd69f7cdd Unique barriers for each round in StressSpec.exerciseSupervision, see #2905
* Looks like the issue was caused by a mixup of barriers and/or name
  of the result aggregator actor
2013-01-18 13:13:38 +01:00
Patrik Nordwall
8b4e903e7d Detect failure when no heartbeats sent, see #2907
* Subscribe to InstantMemberEvent and start heartbeating when
  InstantMemberUp. Same for metrics.
* HeartbeatNodeRing data structure for bidirectional mapping of
  heartbeat sender and receiver. Not using ConsistentHash anymore.
  Node addresses are hashed to ensure that neighbors are spread out.
* HeartbeatRequest when receiver detects that it has not received
  expected heartbeats.
* New test InitialHeartbeatSpec that simulates the problem
* Add/remove some related conf properties
* Add some more logging to be able to diagnose potential problems
* Explicit config of nr-of-end-heartbeats
2013-01-18 12:54:09 +01:00
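
The ring idea in this commit (order nodes by the hash of their address, then heartbeat to the next few nodes in the ring) can be sketched roughly as follows. This is not the HeartbeatNodeRing implementation: the class name `HeartbeatRingSketch`, the use of plain strings as addresses, and the `receivers` signature are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical sketch: pick heartbeat receivers as ring successors of a node. */
final class HeartbeatRingSketch {
    // Order by hash code first so that similar addresses are spread out,
    // then by the address itself to break hash collisions deterministically.
    private static final Comparator<String> RING_ORDER = (a, b) -> {
        int byHash = Integer.compare(a.hashCode(), b.hashCode());
        return byHash != 0 ? byHash : a.compareTo(b);
    };

    /** Returns up to {@code count} nodes that follow {@code self} in the ring. */
    static List<String> receivers(String self, Collection<String> nodes, int count) {
        TreeSet<String> ring = new TreeSet<>(RING_ORDER);
        ring.addAll(nodes);
        List<String> result = new ArrayList<>();
        // Nodes after self in ring order...
        for (String node : ring.tailSet(self, false)) {
            if (result.size() == count) return result;
            result.add(node);
        }
        // ...then wrap around to the start of the ring, never including self.
        for (String node : ring.headSet(self, false)) {
            if (result.size() == count) return result;
            result.add(node);
        }
        return result;
    }
}
```

Because every node applies the same ordering, each node is also monitored by its `count` ring predecessors, which gives the bidirectional sender/receiver mapping without any coordination.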
Viktor Klang
adfeb2c1f0 #2879 - updating copyright info 2013-01-09 11:38:00 +01:00
Patrik Nordwall
46d376b3e5 Remove LargeClusterSpec, superseded by StressSpec, see #2786 2013-01-08 15:09:51 +01:00
Patrik Nordwall
f147f4d3d2 Stress / long running test of cluster, see #2786
* akka.cluster.StressSpec
* Configurable number of nodes and duration for each step
* Report metrics and phi periodically to see progress
* Configurable payload size
* Test of various join and remove scenarios
* Test of watch
* Exercise supervision
* Report cluster stats
* Test with many actors in tree structure

Apart from the test this commit also solves some issues:

* Avoid adding back members when downed in ClusterHeartbeatSender
* Avoid duplicate close of ClusterReadView
* Add back the publish of AddressTerminated when MemberDowned/Removed
  it was lost in merge of "publish on convergence", see #2779
2013-01-07 14:44:36 +01:00
Patrik Nordwall
6fae695b3c Enable blackhole tests again, see #2832 2013-01-04 13:13:28 +01:00
Roland
6c31d5313e rename AkkaSpec.{atTermination => afterTermination} 2013-01-03 17:17:12 +01:00
Patrik Nordwall
0d185e297d Check routees in a better way in ClusterRoundRobinRoutedActorSpec, see #2801 2012-12-21 15:03:29 +01:00
Björn Antonsson
ff852984cb Marking blackhole tests as ignored. See #2731 2012-12-19 17:50:50 +01:00
Endre Sándor Varga
3bfef958d4 Reenabled ignored cluster tests 2012-12-18 16:08:22 +01:00
Endre Sándor Varga
55be17419e Merge branch 'master' into wip-2053d-actorbased-remote-drewhk
Conflicts:
	akka-docs/rst/java/code/docs/serialization/SerializationDocTestBase.java
	akka-docs/rst/scala/code/docs/serialization/SerializationDocSpec.scala
	akka-remote-tests/src/main/scala/akka/remote/testconductor/NetworkFailureInjector.scala
	akka-remote/src/main/scala/akka/remote/RemoteActorRefProvider.scala
2012-12-18 15:15:01 +01:00
Endre Sándor Varga
a7b78bf78b Integration with the TestConductor
- Removed old FailureInjector from TestConductor
- Fixed tests to work with the new remoting
- Plugged ThrottlerTransportAdapter into TestConductor
2012-12-18 14:26:53 +01:00
Björn Antonsson
a03460329d Change cluster MemberEvents to only be published on convergence. See #2692
Conflicts:
	akka-cluster/src/main/scala/akka/cluster/ClusterEvent.scala
	akka-cluster/src/main/scala/akka/cluster/ClusterJmx.scala
	akka-cluster/src/main/scala/akka/cluster/ClusterMetricsCollector.scala
	akka-cluster/src/main/scala/akka/cluster/ClusterReadView.scala
	akka-cluster/src/multi-jvm/scala/akka/cluster/MultiNodeClusterSpec.scala
	akka-docs/rst/cluster/cluster-usage-java.rst
	akka-docs/rst/cluster/cluster-usage-scala.rst
	akka-kernel/src/main/dist/bin/akka-cluster
2012-12-14 12:46:13 +01:00
Patrik Nordwall
0c9ad2f791 Merge pull request #938 from akka/wip-2779-down-terminated-patriknw
Publish AddressTerminated after a node is Down, not when unreachable, see #2779
2012-12-14 01:38:25 -08:00
Patrik Nordwall
44ab9f116f min-nr-of-members and registerOnMemberUp, see #2306
* Leader moves joining members to up when min-nr-of-members reached
* Tested by MinMembersBeforeUpSpec
* Used in factorial sample
* Docs
2012-12-12 14:00:06 +01:00
Patrik Nordwall
1cd3a05f41 Publish AddressTerminated after a member is Downed/Removed, see #2779
* Instead of when unreachable

* Note that ClusterRouterConfig is not changed, i.e. routees will be removed
  when unreachable
* Routers that are not wrapped by ClusterRouterConfig will watch as usual, i.e.
  remove routees when Terminated, i.e. node down
2012-12-12 12:55:22 +01:00
Patrik Nordwall
1df787d0c5 Incorporate review comments and cleanup isAvailable, see #2018
* Renamed isRunning to isTerminated (with negation of course)
* Removed Running from JMX API, since the mbean is deregistered anyway
* Cleanup isAvailable, isUnavailable
* Misc minor
2012-12-06 15:26:57 +01:00
Patrik Nordwall
a7b7ab040d Tests for the Cluster JMX API, see #2018
* MBeanSpec
* Added Members and Unreachable to JMX API
* Removed Convergence from JMX API, because it will
  not be exposed when ticket #2692 is merged
* Updated documentation and akka-cluster script
2012-12-06 10:59:09 +01:00
Patrik Nordwall
503e992d44 Await leader in awaitUpConvergence, see #2752
It's a glitch in how ClusterReadView (used in tests) is updated.
First we do awaitUpConvergence, which checks readView.members and
readView.convergence. Then we do assertLeader, which checks
readView.leader. The problem is that readView.leader might not
have been updated yet (if there already was convergence).
Solution is to await the expected leader in awaitUpConvergence,
so that readView is in a consistent state after awaitUpConvergence.

We will change the semantics in ticket #2692, but this change will
be needed and should work with that as well.
2012-12-04 17:07:08 +01:00
Patrik Nordwall
4761feb071 Merge pull request #858 from akka/wip-2547-metrics-router-patriknw
AdaptiveLoadBalancingRouter and refactoring of metrics, see #2547
2012-11-30 23:37:30 -08:00
Björn Antonsson
b2522ba02d Mark tests that use unstable experimental features as ignored. See #2654
These tests use the throttling in the experimental test conductor which relies
on the fact that the same connection is used for both inbound and outbound
traffic. This is not always the case when starting multiple cluster nodes
at the same time.
2012-11-26 15:29:24 +01:00
Patrik Nordwall
cba535c9b7 Hardening AdaptiveLoadBalancingRouterSpec, see #2547
* Problems with OOME for large heaps
* Create many small arrays instead of one large array
* Reduce half-life to get faster updates
2012-11-23 11:22:01 +01:00
Patrik Nordwall
5eec693fd0 Incorporate review feedback, see #2547
* case object and case class for MixMetricsSelector
* Rename decay-half-life-duration to moving-average-half-life
* Clarification of decay-half-life-duration and collect-interval
* Removed Fields, Java compatibility issue
* Adapt for-yield variables
* Comment metrics collector constructor that takes system param
* Don't copy EWMA if not needed
* LogOf2 constant 0.69315
* Don't use mapValues
* Remove RichInt conversion
* sigar version replace tag in docs
* createDeployer factory method to make it possible to override
  deployer in subclass
* Improve readability of MetricsListener (in sample)
* Better startup of factorial sample (no sleep)
* Many minor enhancements and cleanups
2012-11-16 11:03:20 +01:00
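
For readers unfamiliar with the half-life setting mentioned in this commit, an exponentially weighted moving average with a smoothing factor derived from a half-life can be sketched as below. This is a minimal illustration, not the project's EWMA class: the name `EwmaSketch` and both method signatures are assumptions. The constant 0.69315 is the rounded ln(2) value named in the commit message.

```java
import java.time.Duration;

/** Hypothetical sketch of an EWMA configured by half-life. */
final class EwmaSketch {
    // Rounded ln(2), as mentioned in the commit message.
    static final double LOG_OF_2 = 0.69315;

    /**
     * Smoothing factor chosen so that the weight of a sample drops to
     * roughly one half after {@code halfLife} has elapsed.
     */
    static double alpha(Duration halfLife, Duration collectInterval) {
        long halfLifeMillis = halfLife.toMillis();
        if (halfLifeMillis <= 0) {
            throw new IllegalArgumentException("halfLife must be > 0");
        }
        double decayRate = LOG_OF_2 / halfLifeMillis;
        return 1.0 - Math.exp(-decayRate * collectInterval.toMillis());
    }

    /** Folds one new sample into the current average. */
    static double update(double average, double sample, double alpha) {
        return alpha * sample + (1.0 - alpha) * average;
    }
}
```

With a collect interval equal to the half-life, alpha comes out close to 0.5, so each new sample contributes about half of the updated average.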
Patrik Nordwall
1914be7069 Merge branch 'master' into wip-2547-metrics-router-patriknw
Conflicts:
	akka-actor/src/main/scala/akka/actor/Deployer.scala
	akka-cluster/src/main/scala/akka/cluster/ClusterMetricsCollector.scala
	akka-cluster/src/test/scala/akka/cluster/MetricsCollectorSpec.scala
2012-11-15 12:33:11 +01:00
Patrik Nordwall
dcde7d3594 AdaptiveLoadBalancingRouter and more refactoring of metrics, see #2547
* Refactoring of standard metrics extractors and data structures
* Removed optional value in Metric, simplified a lot
* Configuration of EWMA by using half-life duration
* Renamed DataStream to EWMA
* Incorporate review feedback
* Use binarySearch for selecting weighted routees
* More metrics selectors for the router
* Removed network metrics, since not supported on linux
* Configuration of router
* Rename to AdaptiveLoadBalancingRouter
* Remove total cores metrics, since it's the same as jmx getAvailableProcessors,
  tested on intel 24 core server and amd 48 core server, and MBP
* API cleanup
* Java API additions
* Documentation of metrics and AdaptiveLoadBalancingRouter
* New cluster sample to illustrate metrics in the documentation,
  and play around with (factorial)
2012-11-14 15:08:30 +01:00
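
The "binarySearch for selecting weighted routees" item can be pictured with cumulative weight buckets, as in the sketch below. This is an assumption-laden illustration rather than the router's actual code: the class `WeightedPickSketch`, its methods, and the restriction to strictly positive weights are all hypothetical.

```java
import java.util.Arrays;

/** Hypothetical sketch: map a value in 1..total to a weighted index. */
final class WeightedPickSketch {
    // buckets[i] holds the sum of weights[0..i], so the array is strictly increasing.
    private final int[] buckets;

    WeightedPickSketch(int[] weights) {
        buckets = new int[weights.length];
        int sum = 0;
        for (int i = 0; i < weights.length; i++) {
            if (weights[i] <= 0) {
                throw new IllegalArgumentException("weights must be > 0");
            }
            sum += weights[i];
            buckets[i] = sum;
        }
    }

    /** Sum of all weights; valid values for indexFor are 1..total(). */
    int total() {
        return buckets.length == 0 ? 0 : buckets[buckets.length - 1];
    }

    /** Index of the entry whose bucket contains {@code value}. */
    int indexFor(int value) {
        if (value < 1 || value > total()) {
            throw new IllegalArgumentException("value out of range: " + value);
        }
        int i = Arrays.binarySearch(buckets, value);
        // Exact hit: that bucket. Otherwise binarySearch returns
        // -(insertionPoint) - 1, and the insertion point is the bucket we want.
        return i >= 0 ? i : -(i + 1);
    }
}
```

For weights 1, 3, 2 the buckets are 1, 4, 6, so values 2 to 4 select index 1. A caller would typically pass a random value in 1..total() to pick entries in proportion to their weights.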
Viktor Klang
8f131c680f Switching to immutable.Seq instead of Seq 2012-11-12 14:17:47 +01:00
Patrik Nordwall
c9d206764a ClusterLoadBalancingRouter and refactoring of metrics, see #2547
* MetricsSelector, calculate capacity, weights and allocate weighted
  routee refs
* ClusterLoadBalancingRouterSpec
* Optional heap max
* Constants for the metric fields
* Refactoring of Metric and decay
* Rewrite of DataStreamSpec
* Correction of EWMA and removal of BigInt, BigDecimal
* Separation of MetricsCollector into trait and two classes,
  SigarMetricsCollector and JmxMetricsCollector
* This will reduce cost when sigar is not installed, such as
  avoiding throwing and catching exc for every call
* Improved error handling for loading sigar
* Made MetricsCollector implementation configurable
* Tested with sigar
2012-11-07 20:36:24 +01:00
Helena Edelson
f306964fca Initial work of adaptive metrics aware routers, see #2547 2012-11-07 18:18:38 +01:00
Roland Kuhn
4acd483ed3 Merge pull request #814 from drewhk/wip-2282-closedchannelexception-drewhk
Disabled read timeout and suppressed ClosedChannelException Fixes #2632, ...
2012-11-06 04:17:07 -08:00
Björn Antonsson
977194ff8e Improve MultiNodeSpec ifNode syntax. #2126 2012-11-01 15:16:02 +01:00
Endre Sándor Varga
b2ba6d4702 Added remoting lifecycle event classes and event publisher 2012-10-26 09:04:43 +02:00
Patrik Nordwall
b52a082279 Even better assert message in case of failure of assertLeader, see #2641 2012-10-19 17:28:20 +02:00
Patrik Nordwall
2999e9a43b Better assert message in case of failure of assertLeader, see #2641 2012-10-19 17:03:45 +02:00
Patrik Nordwall
5e83df74e9 Solve wrong barrier problem, see #2583
* The problem was that we didn't wait for the testconductor.shutdown Future
  to complete and therefore barriers could be triggered in unexpected order.
  The reason why we didn't await, was that during shutdown the Future was
  completed with client disconnected failure. I have fixed that and added
  await to all shutdowns.
2012-10-16 17:02:13 +02:00
Roland
bff79c2f94 Merge remote-tracking branch 'origin/master' into wip-2.10.0-RC1-∂π
- currently cheating: uses zeroMQ artifacts for scala 2.10M7
- fixed a bunch of more wrong references to scala.concurrent.util
2012-10-15 16:18:52 +02:00
Patrik Nordwall
8dfb9434fa Merge pull request #787 from akka/wip-2284-heartbeat-scalability-patriknw
Use consistent hash to heartbeat to a few nodes instead of all, see #2284
2012-10-15 02:52:28 -07:00
Patrik Nordwall
91f6c5a94d Adjust barriers/checks in LeaderElectionSpec, see #2583
* Previously it didn't check for unreachable, before down
2012-10-12 13:27:59 +02:00
Roland
0f04239f67 move Duration classes according to scala 2.10 nightly and remove casts to FiniteDuration, see #2504 2012-10-11 15:18:10 -07:00
Patrik Nordwall
6f70624ddd Diagnostics for failing test, see #2583 2012-10-10 16:45:58 +02:00
Patrik Nordwall
668d5a5013 Merge branch 'master' into wip-2284-heartbeat-scalability-patriknw
Conflicts:
	akka-cluster/src/main/scala/akka/cluster/ClusterDaemon.scala
2012-10-09 18:11:36 +02:00
Patrik Nordwall
39bb478b3f Adjust the failing assert in SingletonClusterSpec, see #2582 2012-10-08 16:45:42 +02:00
Patrik Nordwall
3f73705abc Use consistent hash to heartbeat to a few nodes instead of all, see #2284
* Previously heartbeat messages were sent to all other members, i.e.
  each member was monitored by all other members in the cluster.
* This was the number one known scalability bottleneck, due to the
  number of interconnections.
* Limit sending of heartbeats to a few (5) members. Select and
  re-balance with consistent hashing algorithm when new members
  are added or removed.
* Send a few EndHeartbeat when ending send of Heartbeat messages.
2012-10-08 08:41:28 +02:00
Patrik Nordwall
495ace37f4 Avoid TestConductorTransport unless needed, see #2586
* Due to the shutdown issues the TestConductorTransport is by
  default not active, but it's easy to activate it and an exception
  will be thrown if trying to use the features that require it, i.e.
  blackhole, passThrow and throttle
* Documented
2012-10-05 14:52:18 +02:00
Patrik Nordwall
de420ec38a Merge branch 'master' into wip-2010-mute-log-patriknw 2012-10-02 10:40:30 +02:00
Patrik Nordwall
040d494119 Skip exception msg check of UnknownHost, see #2010 2012-10-02 10:38:56 +02:00