* Config of node roles via cluster.role
* Cluster router configurable with use-role
* RoleLeaderChanged event
* Cluster singleton per role
* Cluster only starts once all required per-role node
  counts are reached, configured with
  role.<role-name>.min-nr-of-members (see the config sketch below)
* Update documentation and make use of the roles in the examples
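
A rough sketch of how the role settings described above could be combined in
configuration; the exact key paths (cluster.role vs. akka.cluster.roles, the
deployment path used for the router) are assumptions for illustration, not
taken verbatim from the commits.

    import com.typesafe.config.ConfigFactory

    // Illustrative only; key paths are assumed, not verbatim from the commits.
    val rolesConfig = ConfigFactory.parseString("""
      akka.cluster.roles = ["backend"]
      # cluster only starts once at least 2 nodes with role "backend" have joined
      akka.cluster.role.backend.min-nr-of-members = 2
      # cluster-aware router that only uses nodes with role "backend"
      akka.actor.deployment."/service/workerRouter".cluster.use-role = "backend"
    """)
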
* The scenario was that the previous leader left.
* The problem was that the new leader got MemberRemoved
  before it got the HandOverDone and therefore missed the
  hand-over data.
* Solved by not switching the singleton to the new leader when
  receiving MemberRemoved; instead that is done on a normal
  HandOverDone, or in failure cases after the retry timeout
  (see the sketch below).
* The reason for this bug was the new transition from Down to
  Removed and that there is no longer a MemberDowned event.
  Previously this was only triggered by MemberDowned (not
  MemberRemoved), which was safe because it was "always"
  preceded by unreachable.
* The new solution means that it will take longer for the new
  singleton to start up when the previous leader is unreachable,
  but I don't want to trigger it on MemberUnreachable because it
  might become possible in the future to switch a node back to
  reachable.
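
A hypothetical, heavily simplified sketch of the fix described above. This is
not the real ClusterSingletonManager; the message names, retry scheme and the
startSingleton placeholder are made up. The point it illustrates is that the
new leader starts the singleton only on HandOverDone, or after the retries are
exhausted, never on MemberRemoved alone.

    import akka.actor.Actor
    import scala.concurrent.duration._

    // Hypothetical sketch, not the real ClusterSingletonManager.
    case object HandOverDone
    final case class TakeOverRetry(count: Int)

    class BecomingSingletonSketch(maxRetries: Int, retryInterval: FiniteDuration) extends Actor {
      import context.dispatcher
      context.system.scheduler.scheduleOnce(retryInterval, self, TakeOverRetry(1))

      def receive = {
        case HandOverDone =>
          startSingleton()                  // normal path: hand-over data received
        case TakeOverRetry(n) if n > maxRetries =>
          startSingleton()                  // failure path: previous leader is gone for good
        case TakeOverRetry(n) =>
          context.system.scheduler.scheduleOnce(retryInterval, self, TakeOverRetry(n + 1))
        // note: MemberRemoved of the previous leader is deliberately not a trigger
      }

      def startSingleton(): Unit = ()       // placeholder for creating the singleton
    }
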
* A problem may occur when a member joins again with the same
  hostname:port after being downed.
* Reproduced with StressSpec exerciseJoinRemove with a fixed port
  that joins and shuts down several times.
* The real solution will be covered by ticket #2788, adding a uid
  to the member identifier, but as a first step we need to support
  this scenario with the current design.
* Use a unique node identifier for the vector clock to avoid mixing
  up old and new member instances (see the sketch below).
* Support transition from Down to Joining in Gossip merge
* Don't gossip to unknown or unreachable members.
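
A minimal sketch of the unique-node-identifier idea, with a stand-in
VectorClock and a made-up VectorClockNode(address, uid) key; this is not
Akka's internal vector clock, only an illustration of why the key must
include more than hostname:port.

    // Stand-ins, not Akka's internal classes.
    final case class VectorClockNode(address: String, uid: Long) // uid is unique per incarnation

    final case class VectorClock(versions: Map[VectorClockNode, Long] = Map.empty) {
      def +(node: VectorClockNode): VectorClock =
        copy(versions = versions.updated(node, versions.getOrElse(node, 0L) + 1L))
    }

    // A node that is downed and rejoins with the same hostname:port gets a new uid,
    // so its counter is not mixed up with the previous incarnation's:
    //   val first  = VectorClockNode("akka://sys@host:2552", uid = 1L)
    //   val second = VectorClockNode("akka://sys@host:2552", uid = 2L) // after rejoin
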
* ClusterCoreDaemon and ClusterDomainEventPublisher can't be restarted
because the state would be obsolete.
* Add an extra supervisor level for ClusterCoreDaemon and
  ClusterDomainEventPublisher, which will shut down the member
  on failure in the children (see the sketch below).
* Publish the final removed state on postStop in
ClusterDomainEventPublisher. This also simplifies the removing
process.
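
An illustrative sketch of the extra supervisor level; it is not the actual
Akka source, and the child Props are injected here just to keep the example
self-contained. The shape is: stop instead of restart, and take the member
down on any failure.

    import akka.actor.{ Actor, ActorRef, OneForOneStrategy, Props, SupervisorStrategy }

    // Illustrative sketch, not the actual Akka source.
    class CoreSupervisorSketch(publisherProps: Props, daemonProps: Props) extends Actor {
      val publisher: ActorRef = context.actorOf(publisherProps, "publisher")
      val daemon: ActorRef = context.actorOf(daemonProps, "daemon")

      override val supervisorStrategy: SupervisorStrategy =
        OneForOneStrategy() {
          case _: Exception =>
            context.system.terminate() // shut down the member (its actor system)
            SupervisorStrategy.Stop    // never restart children with obsolete state
        }

      def receive = { case msg => daemon forward msg }
    }
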
* The previous work-around was introduced because Netty blocks when
  sending to broken connections. This is supposed to be solved by
  the new non-blocking remoting.
* Removed HeartbeatSender and CoreSender in cluster
* Added tests to verify that broken connections don't disturb live connections
* When a def starts with an if and is not a one-liner, the if
  should be on a new line.
* The reason is that it is easy to miss the if when reading
  the code.
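
A made-up example of the rule:

    // Discouraged (easy to miss the if):
    //   def sign(n: Int): String = if (n < 0) "negative"
    //     else "non-negative"
    // Preferred: the if starts on its own line.
    object IfStyleExample {
      def sign(n: Int): String =
        if (n < 0) "negative"
        else "non-negative"
    }
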
* Subscribe to InstantMemberEvent and start heartbeating when
InstantMemberUp. Same for metrics.
* HeartbeatNodeRing data structure for bidirectional mapping of
  heartbeat senders and receivers. Not using ConsistentHash anymore.
  Node addresses are hashed to ensure that ring neighbors are
  spread out (see the sketch below).
* HeartbeatRequest is sent when a receiver detects that it has not
  received the expected heartbeats.
* New test InitialHeartbeatSpec that simulates the problem
* Add/remove some related conf properties
* Add some more logging to be able to diagnose possible problems
* Explicit config of nr-of-end-heartbeats
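
A rough sketch of the HeartbeatNodeRing idea, with addresses reduced to
strings and a naive hash; it is not the actual implementation, only the
selection scheme: order the nodes by a hash of their address and heartbeat
to the next few nodes in the ring, and answer "who monitors me" from the
same structure.

    import scala.collection.immutable.SortedSet

    // Rough sketch, not the actual HeartbeatNodeRing implementation.
    final case class HeartbeatRingSketch(
        selfAddress: String,
        allAddresses: Set[String],
        monitoredByNrOfMembers: Int = 5) {

      // hash first, full address as tie-breaker, so neighbors are spread out
      private val ring: Vector[String] = {
        implicit val ord: Ordering[String] = Ordering.by(a => (a.hashCode, a))
        (SortedSet.empty[String] ++ allAddresses + selfAddress).toVector
      }

      private def nextAfter(address: String): Seq[String] = {
        val i = ring.indexOf(address)
        (1 to monitoredByNrOfMembers)
          .map(k => ring((i + k) % ring.size)) // walk the ring, wrapping around
          .filterNot(_ == address)
          .distinct
      }

      /** Nodes this node sends heartbeats to. */
      def myReceivers: Set[String] = nextAfter(selfAddress).toSet

      /** Nodes expected to send heartbeats to this node (bidirectional mapping). */
      def mySenders: Set[String] =
        ring.filter(a => a != selfAddress && nextAfter(a).contains(selfAddress)).toSet
    }
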
* The failure in JoinTwoClustersSpec was due to missing publishing
  of cluster events when clearing the current state on join
* This fix is a step in the right direction, but joining clusters
  like this will need some design thought; creating ticket 2873
  for that
* Renamed isRunning to isTerminated (with negation of course)
* Removed Running from JMX API, since the mbean is deregistered anyway
* Cleanup of isAvailable, isUnavailable
* Misc minor
* Previously heartbeat messages were sent to all other members, i.e.
  each member was monitored by all other members in the cluster.
* This was the number one known scalability bottleneck, due to the
  number of interconnections.
* Limit the sending of heartbeats to a few (5) members. Select and
  re-balance with a consistent hashing algorithm when members
  are added or removed.
* Send a few EndHeartbeat messages when ending the sending of
  Heartbeat messages (see the sketch below).
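
A rough sketch of the EndHeartbeat bookkeeping, with made-up names
(HeartbeatSendState, endHeartbeatTick); the counter corresponds to the
nr-of-end-heartbeats setting mentioned earlier. The idea: a removed receiver
still gets a small fixed number of EndHeartbeat messages so it can deregister
the sender from its failure detector even if one message is lost.

    // Made-up names; only the bookkeeping idea is taken from the notes above.
    final case class HeartbeatSendState(
        current: Set[String],        // receivers that get Heartbeat now
        ending: Map[String, Int]) {  // removed receivers -> EndHeartbeats left to send

      def addReceiver(node: String): HeartbeatSendState =
        copy(current = current + node, ending = ending - node)

      def removeReceiver(node: String, nrOfEndHeartbeats: Int): HeartbeatSendState =
        copy(current = current - node, ending = ending.updated(node, nrOfEndHeartbeats))

      /** On each heartbeat tick: who gets an EndHeartbeat, and the updated state. */
      def endHeartbeatTick: (Set[String], HeartbeatSendState) = {
        val stillEnding = ending.collect { case (node, left) if left > 1 => node -> (left - 1) }
        (ending.keySet, copy(ending = stillEnding))
      }
    }
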
* To avoid ordering surprises, metrics should be published via
  the same actor that handles the subscriptions and publishes
  the other cluster domain events (see the sketch below).
* Added missing publish in case of removal of member
(had a test failure for that)
* Added publishCurrentClusterState and sendCurrentClusterState
* Removed the Ping/Pong that was used for some tests; awaitCond is
  now needed anyway, since publishing to the eventStream is done
  afterwards
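
An illustrative sketch of that ordering argument, with simplified
Subscribe/Unsubscribe messages: a single publisher actor is the one place
where metrics and the other domain events are fanned out, so subscribers can
never observe them in a surprising relative order.

    import akka.actor.{ Actor, ActorRef }

    // Simplified stand-ins, not the real cluster event publisher.
    final case class Subscribe(subscriber: ActorRef)
    final case class Unsubscribe(subscriber: ActorRef)

    class DomainEventPublisherSketch extends Actor {
      private var subscribers = Set.empty[ActorRef]

      def receive = {
        case Subscribe(s)   => subscribers += s
        case Unsubscribe(s) => subscribers -= s
        case event          => subscribers.foreach(_ ! event) // single ordering point
      }
    }
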
* Major refactoring to remove the need for a special
  Cluster instance for testing. Use the default Cluster
  extension instead. Most of the changes are trivial.
* Used failure-detector.implementation-class from config
to swap to Puppet
* Removed FailureDetectorStrategy, since it doesn't add any value
* Added Cluster.joinSeedNodes to be able to test seed nodes when
  Addresses are unknown before startup (see the usage sketch below).
* Removed ClusterEnvironment that was passed around among the
  actors; instead they use the ordinary Cluster extension.
* Overall much cleaner design
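
A small usage sketch of joinSeedNodes; the system name, host names, ports and
protocol string are made up for the example.

    import akka.actor.{ ActorSystem, Address }
    import akka.cluster.Cluster

    val system = ActorSystem("ClusterSystem")
    // seed node addresses built at runtime, e.g. in a test
    val seedNodes = List(
      Address("akka", "ClusterSystem", "host1", 2552),
      Address("akka", "ClusterSystem", "host2", 2552))
    Cluster(system).joinSeedNodes(seedNodes)
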
* Defined the domain events in the ClusterEvent.scala file
* Produce events from a diff of the cluster state and publish them
  to the event bus from a separate actor, ClusterDomainEventPublisher
  (see the sketch below)
* Adjustments of tests
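
A heavily simplified sketch of "produce events from diff", with member state
reduced to an address string and made-up event names; the real publisher
diffs full membership state, but the shape is the same.

    // Made-up event names; only the diffing idea is taken from the notes above.
    sealed trait ClusterDomainEventSketch
    final case class MemberUpSketch(address: String) extends ClusterDomainEventSketch
    final case class MemberRemovedSketch(address: String) extends ClusterDomainEventSketch

    def diffEvents(oldMembers: Set[String], newMembers: Set[String]): Seq[ClusterDomainEventSketch] =
      (newMembers -- oldMembers).toSeq.map(MemberUpSketch) ++
        (oldMembers -- newMembers).toSeq.map(MemberRemovedSketch)

    // diffEvents(Set("a", "b"), Set("b", "c"))
    //   == Seq(MemberUpSketch("c"), MemberRemovedSketch("a"))
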
* Gossip is not exposed in the user API
* Better and more events
* Snapshot event sent to new subscriber
* Updated tests
* Periodic publish only for internal stats
* Implemented without ScatterGatherFirstCompletedRouter, since a
  direct implementation is more straightforward and might cause
  less confusion
* Added more description of what it does