* Adjust cross DC gossip probability for small number of nodes in a DC
When a DC is being bootstrapped the initial node has no local peers and
cannot gossip if it selects a local gossip round. Start at a
probability of 1.0 for a single-node cluster and decrease by 0.25 per node
until a 5-node DC is reached, then use the cross-data-center-gossip-probability setting.
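A minimal sketch of that ramp-down, assuming a hypothetical helper that takes the local DC size and the configured cross-data-center-gossip-probability (names are illustrative, not the actual internal API):

```scala
// Illustrative sketch only; not the real internal implementation.
// 1 node -> 1.0, 2 -> 0.75, 3 -> 0.5, 4 -> 0.25, 5 or more -> configured value
def crossDcGossipProbability(localDcSize: Int, configured: Double): Double =
  if (localDcSize >= 5) configured
  else math.max(configured, 1.0 - 0.25 * (localDcSize - 1))
```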
* Fix cross DC gossip selection of oldest members
This used to select the members based on the sort order of members in
Gossip (by address) rather than by upNumber
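A small sketch of the intended ordering, using a simplified stand-in for akka.cluster.Member just to show the sort key (upNumber is assigned when a member becomes Up, so lower means older):

```scala
// Simplified stand-in for the real Member type; illustration only.
final case class MemberInfo(address: String, upNumber: Int)

// Oldest members first: sort by upNumber, not by the address-based ordering
// that Gossip uses for its member set.
def oldestFirst(members: Set[MemberInfo]): Vector[MemberInfo] =
  members.toVector.sortBy(_.upNumber)
```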
* Guarantee no sneaky type puts more teams in the role list
* Leader per team and initial tests
* MiMa filters
* Second iteration (not working though)
* Verbose gossip logging etc.
* Gossip to team-nodes even if there is inter-team unreachability
* More work ...
* Marking removed nodes with tombstones in Gossip
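A rough sketch of how such removal tombstones could be modelled and pruned (names, key type and the pruning rule are assumptions for illustration; the real Gossip internals differ):

```scala
// Hypothetical, simplified model: keep the removed node's address together
// with the removal time so late gossip about it can be ignored, and drop
// (prune) entries once they are old enough. As noted further down, pruning
// is only driven from leader ticks.
final case class Tombstones(entries: Map[String, Long]) {
  def add(address: String, removedAt: Long): Tombstones =
    copy(entries + (address -> removedAt))

  def contains(address: String): Boolean = entries.contains(address)

  def prune(removedBefore: Long): Tombstones =
    copy(entries.filter { case (_, removedAt) => removedAt >= removedBefore })
}
```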
* More test coverage for Gossip.remove
* Bug failing other multi-node tests squashed
* Multi-node test for team-split
* Review fixes - only prune tombstones on leader ticks
* Clean code is happy code.
* All I want is for MiMa to be my friend
* These constants are internal
* Making the formatting gods happy
* I used the wrong reachability for ignoring gossip :/
* Still hadn't quite gotten how reachability was supposed to work
* Review feedback applied
* Cross-team downing should still work
* Actually prune tombstones in the prune tombstones method ...
* Another round against reachability. Reachability leading with 15 - 2 so far.
(cherry picked from commit a06badaa03fa9f3c9a942b1468090f758c74a869)
* Introduce cluster 'team' setting and add to Member
Introduced cluster-team.md so we can grow the documentation with each
PR, but did not add it to the ToC yet.
* Fewer abbreviations, more reliable test
* CoordinatedShutdown that can run tasks for configured phases in order (DAG)
* coordinate handover/shutdown of singleton with cluster exiting/shutdown
* phase config obj with depends-on list
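A hedged sketch of what such a phase configuration could look like, here as HOCON parsed from Scala (phase names and timeouts are illustrative; the shipped reference configuration is authoritative):

```scala
import com.typesafe.config.ConfigFactory

// Each phase declares the phases it depends on; together they form a DAG
// and CoordinatedShutdown runs the phases in topological order.
val phaseConfig = ConfigFactory.parseString("""
  akka.coordinated-shutdown.phases {
    before-cluster-shutdown {
      depends-on = []
      timeout = 5 s
    }
    cluster-exiting {
      depends-on = [before-cluster-shutdown]
      timeout = 10 s
    }
    cluster-shutdown {
      depends-on = [cluster-exiting]
    }
  }
""")
```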
* integrate graceful leaving of sharding in coordinated shutdown
* add timeout and recover
* add some missing artery ports to tests
* leave via CoordinatedShutdown.run
* optionally exit-jvm in last phase
* run via jvm shutdown hook
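A small usage sketch, assuming the CoordinatedShutdown extension API roughly as introduced by this change (check the current API docs for exact signatures):

```scala
import akka.Done
import akka.actor.{ ActorSystem, CoordinatedShutdown }
import scala.concurrent.Future

object ShutdownExample extends App {
  val system = ActorSystem("example")

  // Register a task in a phase; all tasks of a phase must complete
  // (or time out) before phases that depend on it are started.
  CoordinatedShutdown(system).addTask(
    CoordinatedShutdown.PhaseBeforeServiceUnbind, "release-resources") { () =>
    Future.successful(Done)
  }

  // Trigger the full shutdown sequence programmatically; the same sequence
  // can also be run from a JVM shutdown hook when enabled in configuration.
  CoordinatedShutdown(system).run()
}
```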
* send ExitingConfirmed to leader before shutdown of Exiting
to not have to wait for the failure detector to mark it as
unreachable before removing
* the unreachable signal is still kept as a safeguard in case
the message is lost or the leader dies
* PhaseClusterExiting vs MemberExited in ClusterSingletonManager
* terminate ActorSystem when the cluster is shut down (via Down)
* add more predefined and custom phases
* reference documentation
* migration guide
* problem when the leader order was sys2, sys1, sys3,
then sys3 could not perform its duties and move Leaving sys1 to
Exiting because it was observing sys1 as unreachable
* exclude Leaving with exitingConfirmed from convergence condition
For manual downing it is not needed. For auto-down it doesn't add any extra safety, since
auto-down is not handling network partitions anyway.
The setting is still useful if you implement downing strategies that handle network partitions,
e.g. by keeping the larger side of the partition and shutting down the smaller side.
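To make the last sentence concrete, a hypothetical and grossly simplified "keep the majority" decision (not part of Akka; a real strategy must also wait for a stable period before acting):

```scala
final case class Node(address: String)

// Hypothetical sketch: decide which side of a partition to down.
// Ties are broken by keeping the side that contains the lowest address.
def membersToDown(reachable: Set[Node], unreachable: Set[Node]): Set[Node] =
  if (reachable.size > unreachable.size) unreachable      // we are the majority
  else if (reachable.size < unreachable.size) reachable   // we are the minority
  else {
    val lowest = (reachable ++ unreachable).minBy(_.address)
    if (reachable.contains(lowest)) unreachable else reachable
  }
```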
* avoid Down and Exiting members being used for joining
* delay shutdown of a Down member until the information has spread
to all reachable members, e.g. when downing several nodes via one node
* akka.cluster.down-removal-margin setting
Margin until shards or singletons that belonged to a
downed/removed partition are created in the surviving partition.
Used by singleton and sharding.
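For example (the value is illustrative; a suitable margin depends on failure detection and downing settings):

```scala
import com.typesafe.config.ConfigFactory

// Illustrative value only: how long surviving nodes wait before re-creating
// singletons and shards that lived on a downed/removed node.
val removalMargin = ConfigFactory.parseString(
  "akka.cluster.down-removal-margin = 20 s")
```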
* remove the retry count parameters/settings for singleton in
favor of deriving those from the removal-margin
* The previous one-way heartbeat was elegant, but complicated to
understand and did not give much extra value compared to this approach.
* The previous one-way heartbeat had some kind of bug when joining
several (10-20) nodes at approximately the same time (but not exactly
the same time), with a false failure detection triggered by the extra heartbeat,
which would not heal.
* This ping-pong approach will increase network traffic slightly, but heartbeat
messages are small and each node is limited to monitoring (by default) 5 peers.
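A toy sketch of the ping-pong idea (the message and actor names are made up; the real cluster heartbeating has its own internal messages and feeds a failure detector per monitored peer):

```scala
import akka.actor.{ Actor, ActorRef }

// Made-up message names, only to illustrate request/reply heartbeating.
final case class Heartbeat(from: ActorRef)
final case class HeartbeatRsp(from: ActorRef)

class HeartbeatReceiver extends Actor {
  def receive = {
    case Heartbeat(from) => from ! HeartbeatRsp(self)
  }
}

class HeartbeatSender(peer: ActorRef) extends Actor {
  def receive = {
    case "tick"          => peer ! Heartbeat(self) // sent on a timer
    case HeartbeatRsp(_) => ()                     // record the peer as alive
  }
}
```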
* Separate routing logic, to be usable stand alone, e.g. in actors
* Simplify RouterConfig, only a factory
* Move reading of config from Deployer to the RouterConfig
* Distinction between Pool and Group router types
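In usage terms the distinction looks roughly like this, assuming a plain Worker actor (standard router classes; treat the snippet as a sketch rather than recommended configuration):

```scala
import akka.actor.{ Actor, ActorSystem, Props }
import akka.routing.{ RoundRobinGroup, RoundRobinPool }

class Worker extends Actor {
  def receive = { case _ => () } // real worker logic elided
}

object RouterKinds extends App {
  val system = ActorSystem("example")

  // Pool: the router creates and supervises its own routees.
  val pool = system.actorOf(RoundRobinPool(5).props(Props[Worker]), "pool")

  // Group: the router sends to already existing actors, looked up by path.
  system.actorOf(Props[Worker], "w1")
  system.actorOf(Props[Worker], "w2")
  val group = system.actorOf(RoundRobinGroup(List("/user/w1", "/user/w2")).props(), "group")
}
```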
* Remove usage of actorFor, use ActorSelection
* Management messages to add and remove routees
* Simplify the internals of RoutedActorCell & co
* Move resize specific code to separate RoutedActorCell subclass
* Change resizer api to only return capacity change
* Resizer only allowed together with Pool
* Re-implement all routers, and keep old api during deprecation phase
* Replace ClusterRouterConfig, deprecation
* Rewrite documentation
* Migration guide
* Also includes related ticket:
#3087 Create nicer Props factories for RouterConfig
* Replace (deprecate) akka.cluster.auto-down config setting with
akka.cluster.auto-down-unreachable-after
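For example (the duration is illustrative, and as noted above auto-down does not handle network partitions):

```scala
import com.typesafe.config.ConfigFactory

// Illustrative only: automatically down a member after it has been
// unreachable for 10 seconds; the default is "off".
val autoDown = ConfigFactory.parseString(
  "akka.cluster.auto-down-unreachable-after = 10 s")
```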
* AutoDown actor that keeps track of unreachable members
and downs them from the leader node when they have been
unreachable for the specified duration
* Migration guide
* Replace unreachable Set with Reachability table
* Unreachable members stay in member Set
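A much-simplified model of that reachability table (the real Reachability keeps versioned per-observer records; this only captures the observer-to-subject status idea and the aggregated view):

```scala
// Simplified sketch: each observer records the status it observes for a
// subject; the aggregated status of a subject is Unreachable if any
// observer currently marks it so.
sealed trait Status
case object Reachable extends Status
case object Unreachable extends Status

final case class ReachabilityTable(records: Map[(String, String), Status]) {
  def unreachable(observer: String, subject: String): ReachabilityTable =
    copy(records + ((observer, subject) -> Unreachable))

  def reachable(observer: String, subject: String): ReachabilityTable =
    copy(records + ((observer, subject) -> Reachable))

  def isReachable(subject: String): Boolean =
    !records.exists { case ((_, s), status) => s == subject && status == Unreachable }
}
```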
* Downing a live member used to move it to the unreachable Set,
and it was then removed from there by the leader. That will not
work when flipping back to reachable, so a Down member must
be detected as unreachable before being removed, similar
to Exiting. A member shuts itself down if it sees itself as
Down.
* Flip back to reachable when failure detector monitors it as
available again
* ReachableMember event
* Can't ignore gossip from aggregated unreachable (see SurviveNetworkInstabilitySpec)
* Make use of ReachableMember event in cluster router
* End heartbeat when acknowledged, EndHeartbeatAck
* Remove nr-of-end-heartbeats from conf
* Full reachability info in JMX cluster status
* Don't use interval after unreachable for AccrualFailureDetector history
* Add QuarantinedEvent to remoting, used for Reachability.Terminated
* Prune reachability table when all reachable
* Update documentation
* Performance testing and optimizations