* Guarantee no sneaky type puts more teams in the role list
* Leader per team and initial tests
* MiMa filters
* Second iteration (not working though)
* Verbose gossip logging etc.
* Gossip to team-nodes even if there is inter-team unreachability
* More work ...
* Marking removed nodes with tombstones in Gossip
* More test coverage for Gossip.remove
* Bug failing other multi-node tests squashed
* Multi-node test for team-split
* Review fixes - only prune tombstones on leader ticks
* Clean code is happy code.
* All I want is for MiMa to be my friend
* These constants are internal
* Making the formatting gods happy
* I used the wrong reachability for ignoring gossip :/
* Still hadn't quite gotten how reachability was supposed to work
* Review feedback applied
* Cross-team downing should still work
* Actually prune tombstones in the prune tombstones method ...
* Another round against reachability. Reachability leading with 15 - 2 so far.
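A minimal sketch of the tombstone handling mentioned in the commits above, assuming a map from removed member address to removal timestamp; names and types are illustrative, not the actual akka.cluster internals:

```scala
// Illustrative only: removed members are remembered with a removal timestamp
// so that stale gossip about them can be ignored, and old entries are pruned
// only from the leader's periodic tick so that all nodes prune consistently.
final case class GossipTombstones(entries: Map[String, Long] = Map.empty) {

  // record that a member (identified here by its address) was removed
  def add(address: String, removalTimeMillis: Long): GossipTombstones =
    copy(entries = entries + (address -> removalTimeMillis))

  // gossip coming from (or about) a tombstoned node should not be merged back in
  def isTombstoned(address: String): Boolean = entries.contains(address)

  // called from leader actions only ("only prune tombstones on leader ticks")
  def prune(now: Long, keepForMillis: Long): GossipTombstones =
    copy(entries = entries.filter { case (_, removedAt) => now - removedAt <= keepForMillis })
}
```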
(cherry picked from commit a06badaa03fa9f3c9a942b1468090f758c74a869)
* Introduce cluster 'team' setting and add to Member
Introduced cluster-team.md so we can grow the documentation with each
PR, but did not add it to the ToC yet.
* Fewer abbreviations, more reliable test
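A rough sketch of how the cluster 'team' setting above could be carried on a member, assuming it is encoded in the existing role list with a reserved prefix; the prefix, helper names and the single-team guard are assumptions for illustration, not necessarily the setting actually introduced:

```scala
// Illustrative only; the real encoding and validation may differ.
object Team {
  val Prefix = "team-" // assumed reserved role prefix

  // derive the member's team from its role set, falling back to a default team
  def teamOf(roles: Set[String], default: String = "default"): String =
    roles.collectFirst { case r if r.startsWith(Prefix) => r.stripPrefix(Prefix) }
      .getOrElse(default)

  // guard so that a member cannot end up with more than one team role
  // (cf. the first commit above about the role list)
  def requireSingleTeam(roles: Set[String]): Unit = {
    val teams = roles.filter(_.startsWith(Prefix))
    require(teams.size <= 1, s"a member may belong to at most one team, found: $teams")
  }
}
```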
* properly shut down ArteryTransport using CoordinatedShutdown, #22671
* The shutdownHook set the hasBeenShutdown flag to true, so when
  transport.shutdown was invoked later the shutdown sequence was skipped
  until it was too late and the ActorSystem had already terminated.
* Also improved the cluster shutdown tasks when the cluster node had not
joined
* CoordinatedShutdownLeave explicit events
* re-implement javadsl testkit
* fix MiMa problem
* rebase master
* move ImplicitSender/DefaultTimeout to scaladsl
* undo the change of moving scala api
* fix return type and add doc
* resolve conflicts and add more comments
* CoordinatedShutdown that can run tasks for configured phases in order (DAG)
* coordinate handover/shutdown of singleton with cluster exiting/shutdown
* phase config obj with depends-on list
* integrate graceful leaving of sharding in coordinated shutdown
* add timeout and recover
* add some missing artery ports to tests
* leave via CoordinatedShutdown.run
* optionally exit-jvm in last phase
* run via jvm shutdown hook
* send ExitingConfirmed to leader before shutdown of Exiting
  so we do not have to wait for the failure detector to mark it as
  unreachable before removing it
* the unreachable signal is still kept as a safeguard in case the
  message is lost or the leader dies
* PhaseClusterExiting vs MemberExited in ClusterSingletonManager
* terminate ActorSystem when the cluster is shut down (via Down)
* add more predefined and custom phases
* reference documentation
* migration guide
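A sketch of how the task/phase model above is used, based on the CoordinatedShutdown API as it ships in Akka 2.5; the exact phase and setting names in this PR may have differed slightly:

```scala
import akka.Done
import akka.actor.{ ActorSystem, CoordinatedShutdown }
import scala.concurrent.Future

object CoordinatedShutdownExample extends App {
  val system = ActorSystem("example")

  // register a task in one of the predefined phases; phases form a DAG via
  // their depends-on lists in configuration, and each phase has a timeout
  // and an optional recover setting that lets the shutdown continue on failure
  CoordinatedShutdown(system).addTask(
    CoordinatedShutdown.PhaseBeforeServiceUnbind, "log-shutdown") { () =>
    system.log.info("shutting down")
    Future.successful(Done)
  }

  // leaving the cluster, optionally exiting the JVM in the last phase and
  // running via a JVM shutdown hook are all driven by the same run()
  CoordinatedShutdown(system).run()
}
```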
* problem when the leader order was sys2, sys1, sys3,
  then sys3 could not perform its duties and move Leaving sys1 to
  Exiting because it was observing sys1 as unreachable
* exclude Leaving with exitingConfirmed from convergence condition
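A simplified sketch of that convergence tweak, with illustrative types rather than the real Gossip/Membership classes: a member that is Leaving and has confirmed exiting does not need to have seen the latest gossip for convergence to be reached.

```scala
sealed trait Status
case object Up extends Status
case object Leaving extends Status

final case class MemberInfo(address: String, status: Status)

// `seen` is the set of addresses that have seen the current gossip version;
// members that are Leaving with a confirmed exit are excluded from the check.
def convergence(
    members: Set[MemberInfo],
    seen: Set[String],
    exitingConfirmed: Set[String]): Boolean =
  members.forall { m =>
    seen.contains(m.address) ||
      (m.status == Leaving && exitingConfirmed.contains(m.address))
  }
```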
* to be able to introduce new messages and still support rolling upgrades,
i.e. a cluster of mixed versions
* note that it's only catching NotSerializableException, which we already
use for unknown serializer ids and class manifests
* note that it does not catch it for system messages, since that could result
  in infinite resending
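Roughly, the behaviour described above amounts to something like the following sketch (not the actual remoting code):

```scala
import java.io.NotSerializableException

// If deserialization of an inbound user message fails with
// NotSerializableException (unknown serializer id, class manifest or message
// type sent by a newer node), the message is dropped and logged instead of
// failing the connection. System messages are excluded, since dropping them
// would lead to infinite resending.
def deserializeOrDrop[T](isSystemMessage: Boolean)(deserialize: () => T): Option[T] =
  try Some(deserialize())
  catch {
    case e: NotSerializableException if !isSystemMessage =>
      println(s"Dropping message that could not be deserialized: ${e.getMessage}")
      None
  }
```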
* in the failed test it was noticed that a Down member removed
itself in leaderActionsOnConvergence which resulted in
later "Failed to serialize Gossip, Unknown address"
* never use member with status Down as leader
* a node will shut itself down anyway when it is Down,
  but leader actions could happen before that
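A sketch of that rule, with illustrative types rather than the actual akka.cluster member ordering:

```scala
final case class NodeInfo(address: String, status: String)

// The leader is the first member in address order, but a member with status
// Down must never be picked, even though the downed node will eventually shut
// itself down, because leader actions may run before that happens.
def leader(members: Seq[NodeInfo]): Option[NodeInfo] =
  members.filterNot(_.status == "Down").sortBy(_.address).headOption
```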
* Verify that it actually fails with classic remoting
if vector clocks are not pruned
* Make it pass with Artery, but it is not verifying
the message sizes yet. We should implement that
with a custom RemoteInstrument, but that can be done
in a separate PR.
* Still pending with Artery because it still fails on Jenkins
* barrier after sys shutdown
(cherry picked from commit d5edcbea35ca5b43ca4cfb3018602dd555402f42)
* speedup ActorCreationPerfSpec
* reduce iterations in ConsistencySpec
* tag SupervisorHierarchySpec as LongRunningTest
* various small speedups and tagging in actor-tests
* speedup expectNoMsg in stream-tests
* tag FramingSpec, and reduce iterations
* speedup QueueSourceSpec
* tag some stream-tests
* reduce iterations in persistence.PerformanceSpec
* reduce iterations in some cluster perf tests
* tag RemoteWatcherSpec
* tag InterpreterStressSpec
* remove LongRunning from ClusterConsistentHashingRouterSpec
* sys property to disable multi-jvm tests in test
* actually disable multi-node tests in validatePullRequest
* doc sbt flags in CONTRIBUTING
As documented in the code:
// Leader is moving itself from Leaving to Exiting. Let others know (best effort)
// before shutdown. Otherwise they will not see the Exiting state change
// and there will not be convergence until they have detected this node as
// unreachable and the required downing has finished. They will still need to detect
// unreachable, but Exiting unreachable will be removed without downing, i.e.
// normally the leaving of a leader will be graceful without the need
// for downing. However, if those final gossip messages never arrive it is
// alright to require the downing, because that is probably caused by a
// network failure anyway.
That is fine, but this change improves the selection of the nodes to
send the final gossip messages to.
I could reproduce the failure in ClusterSingletonManagerLeaveSpec and with
additional logging I verified that in the failure case it picked the "first"
node 3 times (it's random) and that node had already been shut down (left earlier
in the test) but was not removed yet.
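The improved selection can be pictured roughly like this (illustrative only; the real implementation works on the Gossip membership and reachability data):

```scala
import scala.util.Random

final case class Node(address: String, status: String, reachable: Boolean)

// When the leader moves itself from Leaving to Exiting it sends a few final
// gossip messages; prefer nodes that are still reachable and not themselves
// Leaving/Exiting/Down, so a random pick does not land on a node that has
// already been shut down but is not yet removed.
def finalGossipTargets(members: Seq[Node], selfAddress: String, count: Int = 3): Seq[Node] = {
  val preferred = members.filter { n =>
    n.address != selfAddress && n.reachable &&
      !Set("Leaving", "Exiting", "Down").contains(n.status)
  }
  Random.shuffle(preferred).take(count)
}
```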