1248 commits

Author SHA1 Message Date
Patrik Nordwall
e101fe1232 Merge pull request #21869 from akka/wip-21810-pending-patriknw
mark StressSpec pending for Artery until we fix it, #21810
2016-11-18 15:44:49 +01:00
Patrik Nordwall
cc170df4d2 mark StressSpec pending for Artery until we fix it, #21810 2016-11-18 13:06:33 +01:00
Patrik Nordwall
68383b5001 harden cluster leaving, #21847
As documented in the code:

// Leader is moving itself from Leaving to Exiting. Let others know (best effort)
// before shutdown. Otherwise they will not see the Exiting state change
// and there will not be convergence until they have detected this node as
// unreachable and the required downing has finished. They will still need to detect
// unreachable, but Exiting unreachable will be removed without downing, i.e.
// normally the leaving of a leader will be graceful without the need
// for downing. However, if those final gossip messages never arrive it is
// alright to require the downing, because that is probably caused by a
// network failure anyway.

That is fine, but this change improves the selection of the nodes to
send the final gossip messages to.

I could reproduce the failure in ClusterSingletonManagerLeaveSpec and with
additional logging I verified that in the failure case it picked the "first"
node 3 times (it's random) and that node had already been shut down (left earlier
in the test) but was not removed yet.
2016-11-18 12:33:42 +01:00
Patrik Nordwall
136e64b253 use longUid in ClusterRemoteWatcher, #21594
* found by test failure in SurviveNetworkInstabilitySpec
2016-09-30 10:51:51 +02:00
Johan Andrén
0f376e751e Quarantine gracefully downed node after some time (#21534)
* New setting for quarantining after graceful leave
2016-09-28 14:04:58 +02:00
Patrik Nordwall
86d912a299 Merge pull request #21555 from akka/wip-21522-StressSpec-patriknw
increase acceptable-heartbeat-pause in StressSpec, #21522
2016-09-26 19:21:07 +02:00
Johan Andrén
8ae0c9a888 Use long uid in artery remoting and cluster #20644 2016-09-26 15:34:59 +02:00
Patrik Nordwall
d91ddb7891 increase acceptable-heartbeat-pause in StressSpec, #21522 2016-09-23 15:50:32 +02:00
Patrik Nordwall
63917c1947 Merge pull request #21513 from akka/wip-21512-quick-restart-patriknw
fix problem with quick restart, #21512
2016-09-22 18:33:22 +02:00
Patrik Nordwall
9f175f56de fix problem with quick restart, #21512
* image-liveness-timeout must be less than the handshake-timeout,
  otherwise the publication for the handshake will give up too early
  when previous image is still considered alive
2016-09-21 20:27:04 +02:00
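The constraint described in the commit above could be expressed in configuration roughly as follows. This is a sketch only: the `akka.remote.artery.advanced` path and the values shown are assumptions based on later Artery reference configuration, not taken from this commit.

```hocon
# Illustrative values (assumed); what matters is the ordering of the two timeouts.
akka.remote.artery.advanced {
  handshake-timeout = 20 s
  # Must be less than handshake-timeout, otherwise the publication for the
  # handshake gives up while the previous image is still considered alive.
  image-liveness-timeout = 10 s
}
```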
Patrik Nordwall
f1590a59b4 revert quarantine removed (leaving) cluster member, #21509 2016-09-21 17:27:34 +02:00
Patrik Nordwall
1926560e41 stop outbound streams when quarantined, #21407
* they can't be stopped immediately because we want to send
  some final message and we reply to inbound messages with `Quarantined`
* and improve logging
2016-09-21 14:38:13 +02:00
Endre Sándor Varga
8ecd7419ac #21419: Reenable ClusterDeathWatchSpec 2016-09-19 12:48:07 +02:00
Johan Andrén
392ca5ecce Enable flight recorder in tests #21205
* Setting to configure where the flight recorder puts its file
* Run ArteryMultiNodeSpecs with flight recorder enabled
* More cleanup in exit hook, wait for task runner to stop
* Enable flight recorder for the cluster multi node tests
* Enable flight recorder for multi node remoting tests
* Toggle always-dump flight recorder output when akka.remote.artery.always-dump-flight-recorder is set
2016-09-16 15:12:40 +02:00
Patrik Nordwall
835125de3d make cluster.StressSpec pass with Artery, #21458
* need to use a shared media driver to get the cpu usage
  at a reasonable level
* also changed to SleepingIdleStrategy(1 ms) when cpu-level=1;
  not needed for the test to pass, but can be good to make level 1
  more extreme
2016-09-16 12:58:41 +02:00
Patrik Nordwall
03eb20e5d2 Merge pull request #21461 from johanandren/wip-more-tests-working-with-artery-johanandren
More tests working on artery
2016-09-14 16:06:01 +02:00
Johan Andrén
848d56cc2f More tests working on artery
* non-multi-jvm tests from akka-cluster
* akka-cluster-metrics
* akka-cluster-tools
* akka-cluster-sharding
2016-09-14 11:40:42 +02:00
Patrik Nordwall
bf151e9793 don't quarantine back, #21450
* Don't quarantine the other system when receiving the Quarantined message,
  since that will result in cluster member removal and can result in
  forming two separate clusters (cluster split).
* Instead, the downing strategy should act on ThisActorSystemQuarantinedEvent, e.g.
  use it as a STONITH signal.
2016-09-13 08:01:58 +02:00
Johan Andrén
3502f0d72f One more missed canonical.port in cluster tests (#21428) 2016-09-09 18:12:35 +02:00
Johan Andrén
b0e03058b9 Port and hostname config path was changed, cluster tests didn't get the change (#21427) 2016-09-09 17:55:02 +02:00
Johan Andrén
fa1d6d6f19 Disable ClusterDeathWatchSpec for now (#21421) 2016-09-09 17:54:13 +02:00
Patrik Nordwall
e8ce261faf Merge branch 'master' into wip-sync-2.4.10-patriknw 2016-09-09 14:12:16 +02:00
Patrik Nordwall
3b7a7dfa59 add reason param to quarantine method 2016-09-08 18:00:37 +02:00
Johan Andrén
90193907fe Make cluster tests run with artery #21204 2016-09-07 16:41:03 +02:00
Patrik Nordwall
0a75f992e4 Update links to Lightbend RPv2, more warnings about auto-down 2016-09-02 10:26:47 +02:00
Patrik Nordwall
8ab02738b7 Merge branch 'master' into wip-sync-artery-dev-2.4.9-patriknw 2016-08-23 20:14:15 +02:00
Patrik Nordwall
0aca351d81 harden SurviveNetworkInstabilitySpec #18767 2016-08-23 17:51:57 +02:00
Endre Sándor Varga
5e830323f6 Updating to ScalaTest 3.0.0 and ScalaCheck 1.13.2 2016-08-22 11:13:49 +02:00
Patrik Nordwall
0c4d4c37ba cluster singleton improvements, #20942
* track nodes by UniqueAddress in Cluster Singleton, #20942
* reply with HandOverDone from new incarnation, #20942
* confirm as terminated immediately when new incarnation joins, #20942,
  instead of waiting for the failure detector to mark it as unreachable;
  this speeds up removal when restarting a cluster node with the same hostname:port
2016-08-19 11:56:55 +02:00
Patrik Nordwall
d5f84d4ad8 Merge pull request #21141 from akka/wip-21053-NodeChurnSpec-patriknw
harden NodeChurnSpec, #21053
2016-08-10 14:22:30 +02:00
Patrik Nordwall
d731f20bf1 suppress deadletter for the cluster joining messages 2016-08-09 17:22:31 +02:00
Patrik Nordwall
483d46ddd0 harden NodeChurnSpec, #21053 2016-08-08 17:50:24 +02:00
Johan Andrén
d6c048f59a A simpler ActorRefProvider config #20649 (#20767)
* Provide shorter aliases for the ActorRefProviders #20649
* Use the new ActorRefProvider aliases throughout code and docs
* Cleaner alias replacement logic
2016-06-10 15:04:13 +02:00
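As an illustration of the shorter aliases introduced by the commit above, a cluster node's provider setting might change along these lines. This is a sketch, assuming the aliases are `local`, `remote`, and `cluster`, as in later Akka documentation.

```hocon
# Before: fully qualified class name of the provider.
# akka.actor.provider = "akka.cluster.ClusterActorRefProvider"

# After: short alias (assumed to be one of local, remote, cluster).
akka.actor.provider = "cluster"
```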
Johan Andrén
896ea53dd3 recovery timeout for persistent actors #20698 2016-06-03 14:17:41 +02:00
Patrik Nordwall
3465a221f0 format with new Scalariform version
* and fix mima issue
2016-06-03 12:56:49 +02:00
Patrik Nordwall
839ec5f167 Merge branch 'master' into wip-sync-artery-patriknw 2016-06-03 11:09:17 +02:00
Patrik Nordwall
c15e04e051 Merge pull request #20700 from akka/wip-20639-restarting-node2-patriknw
test for restarting node, #20639
2016-06-03 09:27:15 +02:00
Björn Antonsson
c66ce62d63 Update to a working version of Scalariform 2016-06-02 22:12:36 +02:00
Patrik Nordwall
91c8e90f82 test for restarting node, #20639 2016-06-02 12:52:55 +02:00
Yegor Andreenko
c66e3a9f02 =clu #20613 logging selfRoles during node unreachable and quarantined (#20542) 2016-05-24 14:35:50 +02:00
Patrik Nordwall
c22d13d3a0 Merge pull request #20541 from akka/wip-merge-2.4.5-artery-patriknw
merge master (2.4.5) into artery-dev
2016-05-18 10:34:57 +02:00
Johan Andrén
5e3eb4bd8c Auto port selection and SunnyWeatherSpec for Artery (#20512)
* Automatic port selection when port 0 configured
* Combine remoting and artery SunnyWeatherSpec
* Default to port 0 for artery in MultiNodeSpec.nodeConfig
2016-05-17 14:17:21 +02:00
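A test configuration relying on the automatic port selection from the commit above might look roughly like this. This is a sketch under assumptions: the `canonical.hostname` and `canonical.port` paths follow the renamed settings mentioned in later commits of this log, and the config path at the time of this commit may have differed.

```hocon
akka.remote.artery {
  enabled = on
  canonical.hostname = "127.0.0.1"  # illustrative value
  # Port 0 asks for a free port to be selected automatically.
  canonical.port = 0
}
```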
2beaucoup
bc7cd17bee =htc Various minor cleanups (#20451)
* minor fixes
* remove now superfluous buffer from MultipartUnmarshaller
* remove unused TokenSourceActor
* remove FIXME: add tests, see #16437
* removed unused param remoteAddress (comment: TODO: remove after #16168 is cleared)
* convert FIXME to TODO (#18709)
* reenable tests in {Request|Response}RendererSpec due to fixed #15981
* remove logging workaround in StreamTestDefaultMailbox due to fixed #15947
2016-05-06 10:32:06 +02:00
Andrea Peruffo
088bf1b842 =act Locale unaware method in Helpers. (#20412) 2016-04-28 15:32:46 +02:00
Johan Andrén
5671927cf1 clu #20309 API for pluggable cluster downing 2016-04-18 15:06:05 +02:00
Patrik Nordwall
9f659cf9b1 remove JUnitRunner annotation, #16112
* it was used for running tests from inside Eclipse,
  but since it caused some trouble we remove it
2016-04-05 17:06:58 +02:00
Patrik Nordwall
52de0bcaa4 Clarify system name requirement for cluster members
* Clarify system name requirement for cluster members
* Recommend against auto-down more strongly
2016-04-04 12:37:12 +02:00
Patrik Nordwall
0ec6bd35da fix wrong setting in AdaptiveLoadBalancingRouterSpec, #18156 2016-03-22 15:31:27 +01:00
Patrik Nordwall
12db887ebb Merge pull request #20106 from akka/wip-19536-NodeChurnSpec-patriknw
harden cluster.NodeChurnSpec, #19536
2016-03-22 15:12:14 +01:00
Patrik Nordwall
3e7cd4d98c Merge pull request #20093 from akka/wip-19780-ack-takeover-patriknw
rem #19780: Skip acks during connection handoff
2016-03-22 14:01:29 +01:00