pekko

Author	SHA1	Message	Date
Konrad `ktoso` Malawski	b568975acc	=clu #23229 multi-dc heartbeating, only N nodes perform monitoring	2017-07-07 12:17:41 +01:00
Patrik Nordwall	867cc97bdd	Refactoring of Gossip class, #23290 * move methods that depends on selfUniqueAddress and selfDc to a separate MembershipState class, which also holds the latest gossip * this removes the need to pass in the parameters from everywhere and makes it easier to cache some results * makes it clear that those parameters are always selfUniqueAddress and selfDc, instead of some arbitary node/dc	2017-07-05 08:47:32 +02:00
Patrik Nordwall	bb9549263e	Rename team to data center, #23275	2017-07-04 17:11:21 +02:00
Johan Andrén	164387a89e	[WIP] one leader per cluster team (#23239) * Guarantee no sneaky type puts more teams in the role list * Leader per team and initial tests * MiMa filters * Second iteration (not working though) * Verbose gossip logging etc. * Gossip to team-nodes even if there is inter-team unreachability * More work ... * Marking removed nodes with tombstones in Gossip * More test coverage for Gossip.remove * Bug failing other multi-node tests squashed * Multi-node test for team-split * Review fixes - only prune tombstones on leader ticks * Clean code is happy code. * All I want is for MiMa to be my friend * These constants are internal * Making the formatting gods happy * I used the wrong reachability for ignoring gossip :/ * Still hadn't quite gotten how reachability was supposed to work * Review feedback applied * Cross-team downing should still work * Actually prune tombstones in the prune tombstones method ... * Another round against reachability. Reachability leading with 15 - 2 so far.	2017-07-04 10:09:40 +02:00
Nafer Sanabria	ef76af7add	=cls add logging info on seed node joining (#22724) * =cls add logging info on seed node joining * adjust message	2017-05-19 14:20:29 +02:00
Patrik Nordwall	41c756f169	properly shutdown ArteryTransport using CoordinatedShutdown, #22671 (#22698) * properly shutdown ArteryTransport using CoordinatedShutdown, #22671 * The shutdownHook changed hasBeenShutdown flag to true, and then when the transport.shutdown was invoked the shutdown sequence was ignored until it was too late, ActorSystem already terminated. * Also improved the cluster shutdown tasks when the cluster node had not joined * CoordinatedShutdownLeave explicit events	2017-04-11 21:48:51 +02:00
Devis Lucato	b89008bdaf	Fix "attmpts" typo	2017-03-01 12:44:32 +01:00
Patrik Nordwall	452b3f1406	remove old deprecated cluster metrics, #21423 * corresponding was moved to akka-cluster-metrics, see http://doc.akka.io/docs/akka/2.4/project/migration-guide-2.3.x-2.4.x.html#New_Cluster_Metrics_Extension	2017-01-20 13:48:36 +01:00
Patrik Nordwall	84ade6fdc3	add CoordinatedShutdown, #21537 * CoordinatedShutdown that can run tasks for configured phases in order (DAG) * coordinate handover/shutdown of singleton with cluster exiting/shutdown * phase config obj with depends-on list * integrate graceful leaving of sharding in coordinated shutdown * add timeout and recover * add some missing artery ports to tests * leave via CoordinatedShutdown.run * optionally exit-jvm in last phase * run via jvm shutdown hook * send ExitingConfirmed to leader before shutdown of Exiting to not have to wait for failure detector to mark it as unreachable before removing * the unreachable signal is still kept as a safe guard if message is lost or leader dies * PhaseClusterExiting vs MemberExited in ClusterSingletonManager * terminate ActorSystem when cluster shutdown (via Down) * add more predefined and custom phases * reference documentation * migration guide * problem when the leader order was sys2, sys1, sys3, then sys3 could not perform it's duties and move Leving sys1 to Exiting because it was observing sys1 as unreachable * exclude Leaving with exitingConfirmed from convergence condidtion	2017-01-16 09:01:57 +01:00
Patrik Nordwall	180361868c	Merge pull request #22054 from akka/wip-22053-log-join-retry-patriknw log join retries, #22053	2017-01-09 14:18:40 +01:00
Philippus Baalman	6c7085252a	extended copyright into 2017	2017-01-04 17:37:15 +01:00
Patrik Nordwall	645ae4cb31	log join retries, #22053	2016-12-21 16:15:56 +01:00
Patrik Nordwall	68383b5001	harden cluster leaving, #21847 As documented in the code: // Leader is moving itself from Leaving to Exiting. Let others know (best effort) // before shutdown. Otherwise they will not see the Exiting state change // and there will not be convergence until they have detected this node as // unreachable and the required downing has finished. They will still need to detect // unreachable, but Exiting unreachable will be removed without downing, i.e. // normally the leaving of a leader will be graceful without the need // for downing. However, if those final gossip messages never arrive it is // alright to require the downing, because that is probably caused by a // network failure anyway. That is fine, but this change improves the selection of the nodes to send the final gossip messages to. I could reproduce the failure in ClusterSingletonManagerLeaveSpec and with additional logging I verified that in the failure case it picked the "first" node 3 times (it's random) and that node had already been shutdown (left earlier in the test) but was not removed yet.	2016-11-18 12:33:42 +01:00
Johan Andrén	8ae0c9a888	Use long uid in artery remoting and cluster #20644	2016-09-26 15:34:59 +02:00
Endre Sándor Varga	5e830323f6	Updating to ScalaTest 3.0.0 and ScalaCheck 1.13.2	2016-08-22 11:13:49 +02:00
Patrik Nordwall	0c4d4c37ba	cluster singleton improvements, #20942 * track nodes by UniqueAddress in Cluster Singleton, #20942 * reply with HandOverDone from new incarnation, #20942 * confirm as terminated immediately when new incarnation joins, #20942 instead of waiting for failure detector to mark it as unreachable this will speed-up removal when restarting cluster node with same hostname:port	2016-08-19 11:56:55 +02:00
Patrik Nordwall	d731f20bf1	suppress deadletter for the cluster joining messages	2016-08-09 17:22:31 +02:00
Björn Antonsson	c66ce62d63	Update to a working version of Scalariform	2016-06-02 22:12:36 +02:00
Yegor Andreenko	c66e3a9f02	=clu #20613 logging selfRoles during node unreachable and quarantined (#20542)	2016-05-24 14:35:50 +02:00
Johan Andrén	5671927cf1	clu #20309 API for pluggable cluster downing	2016-04-18 15:06:05 +02:00
adebski	472d404bbe	=clu #19859 Relaxed constraints on downing old incarnation of rejoining node. * Automatic downing of old node incarnation when new tries to rejoin the cluster is performed even if old incarnation was left in Leaving or Exiting state. * Added information to clustering docs about automatic downing of old incarnations when new tries to rejoin the cluster.	2016-02-26 20:35:19 +01:00
Johannes Rudolph	b6cbc7f13a	=all remove unused imports	2016-02-23 20:29:22 +01:00
Johan Andrén	62e30b3c08	Update copyrights and links to the new company name #19851	2016-02-23 12:58:39 +01:00
Prayag Verma	b7783968a0	=pro #19068 All copyrights ranges and single years updated to a range ending in 2016	2016-01-25 10:20:30 +01:00
Roland Kuhn	f1abaa1c5e	Merge pull request #18875 from ktoso/wip-akka.js-cherries-ktoso Akka.js cherries to master	2015-11-07 18:01:24 +01:00
Patrik Nordwall	c7c187f6b7	=clu replace Set -- with diff and ++ with union * better performance according to https://docs.google.com/presentation/d/1Qjryxoe-fYEM8ZPhM-98LKfbhnRcn5eAEMNlVVnixsA/pub	2015-11-06 14:48:17 +01:00
Andrea	cd3d68a77c	=act switch to java std lib ThreadLocalRandom	2015-11-06 14:04:33 +01:00
Patrik Nordwall	9380983d3c	=clu #18554 Make oldest assignment deterministic when joining * the reported issue is fixed by the immediate leaderActions (moving to Up) when joining the first node to itself * the other changes are precautions just in case	2015-10-21 07:53:14 +02:00
Veiga Ortiz, Héctor	c08bc317e2	+clu #13584 Accept joining to be WeaklyUp during network split * experimental feature, disabled by default * Adding documentation to mention weakly up members. plus adding new diagram.	2015-09-04 12:44:47 +02:00
Patrik Nordwall	737a50ebf3	=clu #17253 Improve cluster startup thread usage When using a dispatcher (default or separate cluster dispatcher) with less than 5 threads the Cluster extension initialization could deadlock. It was reproducable by adding a sleep before the Await of GetClusterCoreRef in the Cluster extension constructor. The reason was that other cluster actors were started too early and they also tried to get the Cluster extension and thereby blocking dispatcher threads. Note that the Cluster extension is started via ClusterActorRefProvider before ActorSystem.apply returns. The improvement is to start the cluster child actors lazily when the GetClusterCoreRef is received.	2015-09-03 18:09:31 +02:00
Patrik Nordwall	5cf35938d0	=clu #13226 Prune vector clocks from removed member	2015-08-11 15:40:42 +02:00
Roland Kuhn	0de9f0ff40	Merge pull request #17641 from kukido/kukido-spellings-normalization =doc #17329 Fixed and normalized spellings in ScalaDoc and comments	2015-06-19 12:06:53 +02:00
Patrik Nordwall	2a88f4fb29	=clu Improve cluster downing * avoid using Down and Exiting member from being used for joining * delay shut down of Down member until the information is spread to all reachable members, e.g. downing several nodes via one node * akka.cluster.down-removal-margin setting Margin until shards or singletons that belonged to a downed/removed partition are created in surviving partition. Used by singleton and sharding. * remove the retry count parameters/settings for singleton in favor of deriving those from the removal-margin	2015-06-18 12:55:54 +02:00
Andrey Myatlyuk	bc791eb86c	=doc #17329 Fixed and normalized spellings in ScalaDoc and comments	2015-06-02 21:06:25 -07:00
Patrik Nordwall	8a7d7715b5	clu #17565 Invoke OnMemberRemoved callback when cluster.shutdown * must also be done when the listener actor stops before the MemberRemoved event has been received * add test for this * clarify docs with example that shuts down actor system and exit jvm	2015-05-27 15:42:53 +02:00
Roland Kuhn	18688fc84b	= #17380 fix doc comments for java8 doclint * actor and cluster-metrics comments * agent/camel/cluster/osgi/persistence/remote comments * comments in contrib/persistence-tck/multi-node/typed	2015-05-18 12:51:36 +02:00
Patrik Nordwall	aaa620c35e	=clu #17362 Make cluster.joinSeedNodes equivalent to conf seed-nodes * the difference was in the retry of failed join attempt * also clarify the documentation	2015-05-13 10:48:18 +02:00
hepin	ccca503b4d	+clu #16736 add registerOnMemberRemoved to get notified when current member removed from the cluster	2015-05-08 12:58:12 +08:00
Patrik Nordwall	fe98dae650	=clu #13875 Fix regression in leader selection * The leader is selected by picking the first reachable member, but in #13875 we had to let the self member be unreachable in the Reachability table and that was not considered in the logic of the leader selection. * That means changed behavior that is unwanted, especially when there is only one node left the leader could be evaluated to None instead of Some(selfUniqueAddress). * Note that #13875 has not been released yet.	2015-03-14 11:41:28 -07:00
Julian Tescher	00f6a58e7c	Changes all occurances of Typesafe copyright to extend to 2015	2015-03-10 14:12:19 -07:00
Patrik Nordwall	617cd31046	Merge pull request #16792 from akka/wip-16726-down-restarted-patriknw =clu #16726 Down member automatically when restarted	2015-02-13 09:14:44 +01:00
Patrik Nordwall	37f6a6581c	=clu #16726 Down member automatically when restarted * When new uid is seen in join attempt we can down existing member and thereby new restarted node will be able to join in later retried join attempt without relying on auto-down.	2015-02-13 09:14:00 +01:00
Roland Kuhn	5e1fd1db6c	Merge pull request #16763 from akka/wip-cleanup-actor-∂π fix all non-deprecation warnings	2015-02-06 20:54:12 +01:00
Patrik Nordwall	71ccb4c21b	=clu #13875 Exclude unreachability observations from downed * Skip observations from downed node (quarantined is marked down immediately) in convergence check * Skip observations from downed node when picking "reachable" targets for gossip. * This also means that we must accept gossip with own node marked as unreachable, but that should not be spread to the external membership events.	2015-02-06 10:19:48 +01:00
Roland Kuhn	82b8238a9c	fix warnings in remote and cluster	2015-01-30 19:02:18 +01:00
Patrik Nordwall	cc7bcf7978	=clu #3973 Make JoinSeedNodeProcess actor name unique * These names are not used, but for debuggability I prefer real names (cherry picked from commit 1f2be54eebe5feb2f82c2659c8262f1db8343125)	2014-04-07 14:06:38 +02:00
Patrik Nordwall	b5be06e90c	!clu #3920 Remove deprecated akka.cluster.auto-down * replaced by akka.cluster.auto-down-unreachable-after	2014-03-14 14:11:28 +01:00
Patrik Nordwall	503c4ced8f	!clu #3920 Remove deprecated Cluster.publishCurrentClusterState	2014-03-14 14:11:28 +01:00
dario.rexin	2cbad298d6	=all #3858 Make case classes final	2014-03-07 13:20:01 +01:00
Patrik Nordwall	c1f320d621	=clu Remove debug log noise of gossip round * that log entry is not useful	2014-02-20 11:52:33 +01:00

152 commits