=str #17200 Stop shard region when MemberRemoved
Two issues:
1) The ShardRegion actor must stop itself when the node is shutting down, i.e. when receiving MemberRemoved(selfAddress).
2) The ShardCoordinator must not persist anything when the node is shutting down. MemberRemoved of other shard regions will trigger Terminated, which must not be persisted, because the next coordinator would then replay those events and end up in the wrong state. This problem showed up when using leaving, as illustrated in the new test.

To solve the second issue I have added a new ClusterShuttingDown event that is published before the MemberRemoved events. Note that Terminated is triggered by MemberRemoved.

(cherry picked from commit 1b272c72597beece9d93f0054f4b58e3d25f9ae2)
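The two rules in the message can be sketched without Akka as a small dependency-free model. This is only an illustration, not the code of this commit: `Coordinator`, `Region`, the string addresses and the in-memory `journal` are hypothetical stand-ins, and only the event names are borrowed from the message.

```scala
// Dependency-free model of the two rules described in the commit message.
// Assumption: Coordinator, Region and the in-memory journal are hypothetical
// stand-ins; they are not Akka classes and not the code of this commit.
object ShutdownSketch {
  sealed trait Event
  case object ClusterShuttingDown extends Event
  final case class MemberRemoved(address: String) extends Event
  final case class Terminated(region: String) extends Event

  // Stand-in for the persistent coordinator: it records region terminations
  // in a journal that a successor coordinator would replay.
  final class Coordinator(initialRegions: Set[String]) {
    private var shuttingDown = false
    private var regions = initialRegions
    private var journal = Vector.empty[String]

    def persisted: Vector[String] = journal

    def receive(event: Event): Unit = event match {
      case ClusterShuttingDown =>
        // Published before the final MemberRemoved events, so this flag is
        // set before any shutdown-induced Terminated arrives.
        shuttingDown = true
      case Terminated(region) if regions.contains(region) =>
        if (!shuttingDown) {
          // Normal operation: the termination is a real state change.
          journal = journal :+ s"ShardRegionTerminated($region)"
          regions = regions - region
        }
        // While shutting down: ignore, so nothing wrong is replayed later.
      case _ => ()
    }
  }

  // Stand-in for a shard region that stops when its own node is removed.
  final class Region(selfAddress: String) {
    private var running = true
    def isRunning: Boolean = running

    def receive(event: Event): Unit = event match {
      case MemberRemoved(address) if address == selfAddress => running = false
      case _ => ()
    }
  }
}
```

In this model, a coordinator that receives ClusterShuttingDown first leaves its journal untouched by later Terminated messages, while one that receives Terminated during normal operation records it.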
parent db8d02ff06
commit c991d5f1d1
7 changed files with 244 additions and 5 deletions
@@ -171,6 +171,17 @@ object ClusterEvent {
     def getLeader: Address = leader orNull
   }
 
+  /**
+   * This event is published when the cluster node is shutting down,
+   * before the final [[MemberRemoved]] events are published.
+   */
+  final case object ClusterShuttingDown extends ClusterDomainEvent
+
+  /**
+   * Java API: get the singleton instance of [[ClusterShuttingDown]] event
+   */
+  def getClusterShuttingDownInstance = ClusterShuttingDown
+
   /**
    * Marker interface to facilitate subscription of
    * both [[UnreachableMember]] and [[ReachableMember]].
@@ -328,6 +339,7 @@ private[cluster] final class ClusterDomainEventPublisher extends Actor with Acto
 
   override def postStop(): Unit = {
     // publish the final removed state before shutting down
+    publish(ClusterShuttingDown)
     publishChanges(Gossip.empty)
   }
 