Commit graph

108 commits

Author SHA1 Message Date
Patrik Nordwall
bf6cd6d2c7 harden another case in ClusterShardingSpec, #23006
* due to the new StartEntity message the start is not
  as instant as it used to be and therefore the test must
  retry this check
2017-05-23 07:24:04 +02:00
Patrik Nordwall
99a044b472 fix remember entities tests, #22994 2017-05-22 14:58:41 +02:00
Johan Andrén
86aa42cf6c remember entities and changing shardIdExtractor (#22894)
* Test case covering changing shard id extractor with remember-entities

* This should do the trick

* Feedback addressed

* Docs and migration guide mention

* Correct logic to persist that entity has moved off shard
2017-05-22 10:08:18 +02:00
Patrik Nordwall
32e6a59363 Start shards after full cluster restart, #22868
* when using remember entities with ddata mode the set of
  shards was not saved in durable storage and therefore the
  remembered entities were not loaded until the first message
  was sent to the shard
* the coordinator stores the set of shards in a durable GSet
* loaded when the coordinator is started and added to the State,
  rest is already taken care of via the unallocatedShards Set in
  the State
* when new shards are allocated the durable GSet is updated if it
  doesn't already contain the shard identifier
2017-05-19 14:47:35 +02:00
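The durable GSet described in commit 32e6a59363 is a grow-only set: elements are only ever added, and two replicas merge by taking the union. The sketch below illustrates that idea in plain Java, along with the "update only if the shard identifier is not already present" check. It is an assumption-laden illustration, not Akka's `akka.cluster.ddata.GSet`; the class name `GrowOnlySet` and its methods are hypothetical.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical, library-free sketch of a grow-only set (GSet).
// This is NOT Akka's implementation; all names are illustrative.
final class GrowOnlySet {
    private final Set<String> elements;

    GrowOnlySet(Set<String> elements) {
        // Defensive copy so instances are effectively immutable.
        this.elements = Collections.unmodifiableSet(new HashSet<>(elements));
    }

    static GrowOnlySet empty() {
        return new GrowOnlySet(Collections.emptySet());
    }

    boolean contains(String shardId) {
        return elements.contains(shardId);
    }

    // Adding an already-known shard returns the same instance,
    // so a caller can skip the durable write in that case.
    GrowOnlySet add(String shardId) {
        if (elements.contains(shardId)) {
            return this;
        }
        Set<String> next = new HashSet<>(elements);
        next.add(shardId);
        return new GrowOnlySet(next);
    }

    // Merging two replicas is a set union: commutative, associative, idempotent.
    GrowOnlySet merge(GrowOnlySet other) {
        Set<String> union = new HashSet<>(elements);
        union.addAll(other.elements);
        return new GrowOnlySet(union);
    }

    Set<String> elements() {
        return elements;
    }
}
```

Because elements are never removed, replicas that merge in any order converge on the same set, which is what makes this structure suitable for remembering which shards have ever been allocated.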
Patrik Nordwall
3ab101039f Lazy init of LmdbDurableStore, #22759 (#22779)
* Lazy init of LmdbDurableStore, #22759

* to avoid creating files (and initializing db) when not needed,
  e.g. cluster sharding that is not using remember entities
* enable MiMa against 2.5.0

* use OptionVal instead
2017-04-28 15:12:14 +02:00
Patrik Nordwall
b45a254685 use minCap for majority write/read in sharding, #22141
* also added some docs about the feature since that was missing
2017-01-24 16:41:18 +01:00
Patrik Nordwall
37679d307e rememberingEntities with ddata mode, #22154
* one Replicator per configured role
* log LMDB directory at startup
* clarify the importance of the LMDB directory
* use more than one key to support many entities
2017-01-23 11:57:52 +01:00
Patrik Nordwall
452b3f1406 remove old deprecated cluster metrics, #21423
* corresponding was moved to akka-cluster-metrics, see
  http://doc.akka.io/docs/akka/2.4/project/migration-guide-2.3.x-2.4.x.html#New_Cluster_Metrics_Extension
2017-01-20 13:48:36 +01:00
Patrik Nordwall
84ade6fdc3 add CoordinatedShutdown, #21537
* CoordinatedShutdown that can run tasks for configured phases in order (DAG)
* coordinate handover/shutdown of singleton with cluster exiting/shutdown
* phase config obj with depends-on list
* integrate graceful leaving of sharding in coordinated shutdown
* add timeout and recover
* add some missing artery ports to tests
* leave via CoordinatedShutdown.run
* optionally exit-jvm in last phase
* run via jvm shutdown hook
* send ExitingConfirmed to leader before shutdown of Exiting
  to not have to wait for failure detector to mark it as
  unreachable before removing
* the unreachable signal is still kept as a safeguard if
  the message is lost or the leader dies
* PhaseClusterExiting vs MemberExited in ClusterSingletonManager
* terminate ActorSystem when cluster shutdown (via Down)
* add more predefined and custom phases
* reference documentation
* migration guide
* problem when the leader order was sys2, sys1, sys3,
  then sys3 could not perform its duties and move Leaving sys1 to
  Exiting because it was observing sys1 as unreachable
* exclude Leaving with exitingConfirmed from convergence condition
2017-01-16 09:01:57 +01:00
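Commit 84ade6fdc3 describes phases that each carry a depends-on list and are run in order as a DAG. A minimal sketch of how such an ordering could be derived is shown below, using a depth-first topological sort in plain Java. It does not reproduce Akka's `CoordinatedShutdown` code; the class `PhaseOrdering`, the method `order`, and the phase names used in the usage note are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: order phases so that every phase runs after
// all phases named in its depends-on list. Not Akka's implementation.
final class PhaseOrdering {

    // phases maps a phase name to the names of the phases it depends on.
    static List<String> order(Map<String, List<String>> phases) {
        List<String> result = new ArrayList<>();
        Set<String> done = new HashSet<>();
        Set<String> inProgress = new HashSet<>();
        for (String phase : phases.keySet()) {
            visit(phase, phases, done, inProgress, result);
        }
        return result;
    }

    private static void visit(String phase,
                              Map<String, List<String>> phases,
                              Set<String> done,
                              Set<String> inProgress,
                              List<String> result) {
        if (done.contains(phase)) {
            return;
        }
        // Seeing a phase that is still being visited means the graph has a cycle.
        if (!inProgress.add(phase)) {
            throw new IllegalArgumentException("Cycle detected at phase: " + phase);
        }
        // Dependencies are emitted first; undeclared ones are treated as having none.
        for (String dependency : phases.getOrDefault(phase, Collections.emptyList())) {
            visit(dependency, phases, done, inProgress, result);
        }
        inProgress.remove(phase);
        done.add(phase);
        result.add(phase);
    }
}
```

For example, with the made-up phases `terminate` depending on `leave-cluster`, and `leave-cluster` depending on `stop-http`, `PhaseOrdering.order` returns `stop-http`, `leave-cluster`, `terminate`. A cyclic depends-on configuration is rejected with an exception instead of producing a partial order.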
Philippus Baalman
6c7085252a extended copyright into 2017 2017-01-04 17:37:15 +01:00
Patrik Nordwall
d0053746df fix regression in remember entities, #21892
* regression was introduced by 141318e60a
  in 2.4.12
2016-11-28 14:30:34 +01:00
Patrik Nordwall
48e85953d9 harden ClusterShardingSpec, #21535 2016-11-08 14:01:23 +01:00
Patrik Nordwall
1bb8f1737f increase barrier-timeout in ClusterShardingSpec, #21718
* In the logs of the failing test we can see that the first node is removed
  as expected and then comes back into the membership, which is possible in
  case of conflicting membership state merge. It is supposed to be
  removed again by the auto-down. That doesn't happen within the barrier-timeout.
2016-11-08 13:37:37 +01:00
Patrik Nordwall
141318e60a shard coordinator should wait until min-members regions registered, #21194 2016-10-28 15:49:21 +02:00
Patrik Nordwall
8ab02738b7 Merge branch 'master' into wip-sync-artery-dev-2.4.9-patriknw 2016-08-23 20:14:15 +02:00
Peter Barron
1f9c374bd9 Cluster Sharding with remember-entity enabled fails to recover after restart #20744 2016-08-01 10:46:09 +02:00
Johan Andrén
d6c048f59a A simpler ActorRefProvider config #20649 (#20767)
* Provide shorter aliases for the ActorRefProviders #20649
* Use the new ActorRefProvider aliases throughout code and docs
* Cleaner alias replacement logic
2016-06-10 15:04:13 +02:00
Björn Antonsson
c66ce62d63 Update to a working version of Scalariform 2016-06-02 22:12:36 +02:00
Patrik Nordwall
9f32b77bde increase timeout for setting up SharedLeveldbStore, #20056
* SharedLeveldbStore is opening leveldb in preStart so that might
  sometimes take more than 3 seconds, I guess
* the test looks correct
2016-05-10 15:02:52 +02:00
Roland Kuhn
7cf99134dc catch ActorCell creation failures for top-level actors #15947
Previously a failure during e.g. MailboxType.create() would make the
user guardian fail, tearing down the whole system as a result. The cause
is a deep bug in handling ActorCell creation that we cannot really fix
anymore due to resulting changes in semantics, hence this fix only
targets top-level actors (where the observable difference is an
unambiguous improvement).

fixes #15947
2016-03-17 11:04:52 +01:00
Johan Andrén
854d5b0c09 Stabilization of ClusterShardingGetStatsSpec, fix for #19863 2016-03-09 16:24:50 +01:00
James Mulcahy
48ecd9d7d5 Fix for sharding GetClusterStats #19601 2016-02-23 17:12:17 +01:00
Johan Andrén
62e30b3c08 Update copyrights and links to the new company name #19851 2016-02-23 12:58:39 +01:00
Tal Pressman
4c12b5ea50 =clu #19622 Use full address in ClusterShardingStats 2016-02-02 16:51:19 +02:00
Prayag Verma
b7783968a0 =pro #19068 All copyrights ranges and single years updated to a range ending in 2016 2016-01-25 10:20:30 +01:00
Patrik Nordwall
5ebdd79bee =cls increase the delay in the graceful shutdown example 2015-12-21 09:54:14 +01:00
Patrik Nordwall
f5ed085179 =cls improve the graceful shutdown example 2015-12-18 11:39:52 +01:00
Patrik Nordwall
a6fd7b448f =cls #18978 Lazy startup of shards when rememberEntities=false
* and don't populate the unallocatedShards Set in the State
  when rememberEntities=false
2015-11-27 10:09:44 +01:00
Patrik Nordwall
27995af79f =cls #18722 fix DDataShardCoordinator init
* the become logic was wrong when watchStateActors triggers an immediate
  state update
2015-11-18 16:13:58 +01:00
Krzysztof Bochenek
5c418efef2 =cls #18762 fix graceful shutdown of empty region 2015-11-11 11:58:43 +01:00
Johan Andrén
4abbc8db50 +clu #17695 add a way to inspect the current sharding state
Two new message pairs:
`GetShardRegionState`/`CurrentShardRegionState` allows for querying a region for its current shards and their current `EntityIds`
`GetClusterShardingStats`/`ClusterShardingStats` allows for querying the entire cluster for a summary of
the number of entities alive in each region and shard.
2015-11-02 08:56:09 +01:00
Konrad Malawski
c57b4e24c8 Merge pull request #18445 from akka/wip-18370-sharding-supervision-patriknw
=cls #18370 Document supervision for Cluster Sharding
2015-09-16 12:59:37 +02:00
Patrik Nordwall
4e2b8190a3 =cls #18370 Document supervision for Cluster Sharding 2015-09-10 15:35:26 +02:00
Patrik Nordwall
e5159eb764 =cls #18176 Harden ClusterShardingLeavingSpec
In logs it is clear that the fourth node is moved to Up,
but it takes more than 5 sec to disseminate that info
2015-09-09 14:36:08 +02:00
Patrik Nordwall
c9662d8083 Merge pull request #18324 from akka/wip-15646-sharding-initial-watch-patriknw
=cls #15646 Optimize the initial watch in shard coordinator
2015-09-04 12:02:17 +02:00
Patrik Nordwall
bfde1eff19 =clu #18337 Disable down-removal-margin by default
For manual downing it is not needed. For auto-down it doesn't add any extra safety, since that
is not handling network partitions anyway.

The setting is still useful if you implement downing strategies that handle network partitions,
e.g. by keeping the larger side of the partition and shutting down the smaller side.
2015-09-04 11:28:33 +02:00
Patrik Nordwall
bc48872873 =cls #15646 Optimize the initial watch in shard coordinator
Two improvements to the coordinator startup (state recovery) that
should make it operational faster and reduce the amount of lost messages
during startup.

* Let the quick (those not involving failure detection) Terminated messages
  be processed before starting to reply to GetShardHome.
* Consider regions that don't belong to the current cluster
  to be terminated.
2015-08-27 18:45:32 +02:00
Ostapenko Evgeniy
6814d08ef1 =cls #17846 Use CRDTs instead of PersistentActor to remember the state of the ShardCoordinator #17871 2015-08-20 13:36:37 +03:00
Konrad Malawski
86c00d4716 !per +act #17842 move BackoffSupervisor to akka.pattern 2015-07-08 16:45:23 +02:00
Patrik Nordwall
89f17ddfd0 =cls #17447 Split Cluster Sharding docs into java/scala 2015-06-30 16:39:31 +02:00
Patrik Nordwall
2832dd55c5 !clt, cls #17866 Use systemActorOf for extension actors
* ClusterSharding
* ClusterClientReceptionist
* dispatcher config, since deployment config can't be used
  for system actors
2015-06-30 16:37:34 +02:00
Roland Kuhn
0de9f0ff40 Merge pull request #17641 from kukido/kukido-spellings-normalization
=doc #17329 Fixed and normalized spellings in ScalaDoc and comments
2015-06-19 12:06:53 +02:00
Patrik Nordwall
2a88f4fb29 =clu Improve cluster downing
* avoid Down and Exiting members being used for joining
* delay shut down of Down member until the information is spread
  to all reachable members, e.g. downing several nodes via one node
* akka.cluster.down-removal-margin setting
  Margin until shards or singletons that belonged to a
  downed/removed partition are created in surviving partition.
  Used by singleton and sharding.
* remove the retry count parameters/settings for singleton in
  favor of deriving those from the removal-margin
2015-06-18 12:55:54 +02:00
Patrik Nordwall
6d26b3e591 !per Make persistent failures fatal
* remove PersistentFailure and RecoveryFailure messages
* use stop instead of ActorKilledException
* adjust PersistentView
* adjust AtLeastOnceDeliveryFailureSpec
* adjust sharding
* add BackoffSupervisor
2015-06-17 15:49:47 +02:00
Patrik Nordwall
5fab2b4521 !cls #16422 Rename shardResolver and idExtractor 2015-06-16 13:38:57 +02:00
Patrik Nordwall
70024298ac !cls #16422 Rename Entry to Entity in sharding 2015-06-11 10:00:43 +02:00
Patrik Nordwall
25ba89a98b =cls #15614 Change persistenceId for sharding coordinator 2015-06-11 10:00:43 +02:00
Patrik Nordwall
8276420a89 =cls #15619 Use event counting instead of time based snapshot in sharding 2015-06-11 10:00:42 +02:00
Patrik Nordwall
c9a2447867 +cls #15330 Add GetCurrentRegions, for testing 2015-06-11 10:00:42 +02:00
Patrik Nordwall
294659e2fe =cls #15330 Enable configuration of coordinator singleton 2015-06-11 10:00:42 +02:00