Detect failure when no heartbeats sent, see #2907

* Subscribe to InstantMemberEvent and start heartbeating when
  InstantMemberUp. Same for metrics.
* HeartbeatNodeRing data structure for bidirectional mapping of
  heartbeat sender and receiver. Not using ConsistentHash anymore.
  Node addresses are hashed to ensure that neighbors are spread out.
* HeartbeatRequest when receiver detects that it has not received
  expected heartbeats.
* New test InitialHeartbeatSpec that simulates the problem
* Add/remove some related conf properties
* Add some more logging to be able to diagnose potential problems
* Explicit config of nr-of-end-heartbeats
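
The bidirectional sender/receiver mapping described in the HeartbeatNodeRing bullet can be illustrated with a minimal sketch. This is not the code from this commit: `NodeRingSketch`, its use of plain strings for addresses, and the `monitoredByNrOfMembers` parameter are assumptions made for illustration. It only shows the general idea of ordering nodes by a hash of their address so that ring neighbors are spread out, and of being able to look up both who a node sends heartbeats to and who it expects heartbeats from.

```scala
// Hypothetical, simplified stand-in for the heartbeat node ring idea.
// Addresses are plain strings here, which is an assumption for brevity.
final case class NodeRingSketch(nodes: Set[String], monitoredByNrOfMembers: Int) {

  // Order by hash first so that ring neighbors are spread out;
  // fall back to the address itself to break hash collisions.
  private val ringOrdering: Ordering[String] =
    Ordering.by((a: String) => (a.hashCode, a))

  private val ring: Vector[String] = nodes.toVector.sorted(ringOrdering)

  // Never pick more neighbors than there are other nodes in the ring.
  private def n: Int = math.min(monitoredByNrOfMembers, ring.size - 1)

  // Nodes that `sender` sends heartbeats to: the next n nodes in the ring.
  def receivers(sender: String): Set[String] = {
    val i = ring.indexOf(sender)
    if (i < 0) Set.empty
    else (1 to n).map(k => ring((i + k) % ring.size)).toSet
  }

  // Reverse direction: nodes that `receiver` expects heartbeats from.
  def senders(receiver: String): Set[String] = {
    val i = ring.indexOf(receiver)
    if (i < 0) Set.empty
    else (1 to n).map(k => ring((i - k + ring.size) % ring.size)).toSet
  }
}
```

With such a mapping, a receiver that has not heard from one of its `senders` could ask that node to start heartbeating, which is roughly the role the HeartbeatRequest bullet describes.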
Patrik Nordwall 2013-01-15 09:35:07 +01:00
parent c5685a0855
commit 8b4e903e7d
25 changed files with 466 additions and 146 deletions

@@ -110,12 +110,12 @@ abstract class LeaderElectionSpec(multiNodeConfig: LeaderElectionMultiNodeConfig
       }
     }
-    "be able to 're-elect' a single leader after leader has left" taggedAs LongRunningTest in within(20 seconds) {
+    "be able to 're-elect' a single leader after leader has left" taggedAs LongRunningTest in within(30 seconds) {
       shutdownLeaderAndVerifyNewLeader(alreadyShutdown = 0)
       enterBarrier("after-2")
     }
-    "be able to 're-elect' a single leader after leader has left (again)" taggedAs LongRunningTest in within(20 seconds) {
+    "be able to 're-elect' a single leader after leader has left (again)" taggedAs LongRunningTest in within(30 seconds) {
       shutdownLeaderAndVerifyNewLeader(alreadyShutdown = 1)
       enterBarrier("after-3")
     }