Commit graph

86 commits

kerr
fafc59b19d update headers to regular comment (#25807) 2018-10-29 05:19:37 -04:00
Saleh Khazaei
176b718b2a Adding maximum restart attempts to BackoffSupervisor #24769 2018-09-14 14:22:52 +02:00
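For context, a hedged sketch of what the new restart cap looks like with the classic BackoffSupervisor (2.5-era API; the child props, names, and values are illustrative):

```scala
import scala.concurrent.duration._
import akka.actor.Props
import akka.pattern.{ Backoff, BackoffSupervisor }

val childProps: Props = Props.empty // placeholder for the supervised actor

val supervisorProps = BackoffSupervisor.props(
  Backoff.onFailure(
    childProps,
    childName = "worker",
    minBackoff = 3.seconds,
    maxBackoff = 30.seconds,
    randomFactor = 0.2
  ).withMaxNrOfRetries(10) // give up after 10 restarts instead of retrying forever
)
```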
kenji yoshida
5b3b191bac Remove procedure syntax (#25362) 2018-07-25 13:38:27 +02:00
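For reference, the mechanical change applied throughout: Scala's deprecated procedure syntax is replaced with an explicit Unit result type.

```scala
// before (procedure syntax, deprecated):
// def shutdown() { println("stopping") }

// after (explicit result type):
def shutdown(): Unit = { println("stopping") }
```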
Christopher Batey
28b86379c8 Harden MultiDcClusterShardingSpec (#25201)
- Use global multi node cluster config
- Reduce retry interval for ShardRegion register
- Add a clue to an assert that fails unhelpfully
2018-06-15 15:28:04 +02:00
Christopher Batey
01f90ad95d Add common multi node cluster config to all cluster sharding tests (#25202) 2018-06-05 06:58:17 +01:00
Christopher Batey
5bde26dca6 Fix ClusterShardingIncorrectSetup barrier (#25187)
Barrier needs to be one line down, otherwise it can fail:
http://jenkins.akka.io:8498/job/akka-artery-cluster-tests/1364/consoleFull
ClusterShardingIncorrectSetupMultiJvmNode1
2018-06-04 15:04:18 +02:00
Christopher Batey
d03f21a35a Suggest ClusterSharding hasn't been started in log message (#25177) 2018-05-31 14:43:06 +01:00
Christopher Batey
23373565db Fix typed cluster singleton cross dc proxies (#24936)
* Fix typed cluster singleton cross dc proxies
* Adds first multi-jvm test for typed cluster
2018-04-27 12:44:44 +01:00
Christopher Batey
a3e52078df Enable header plugin for the MultiJVM configuration (#24974)
Seems that when the changes for 2018 were done, a space was introduced in
everything after, hence so many changes.
2018-04-25 00:03:55 +09:00
Christopher Batey
4d20b2a660 Reduce size of jenkins logs
Each build now produces over 40 MB of logs.

A lot of DEBUG logging was left on for test failures that have since been
fixed. Added an issue # for the ones that are still valid, or left it on
where the test verifies debug logging.
2018-04-24 08:49:41 +01:00
Patrik Nordwall
2cd1187e7b entityId => Behavior in ClusterSharding API, #24470
* spawn with String => Behavior since the entityId is often needed
* some type inference is lost, and completely breaks down with overloads
2018-02-02 08:43:11 +01:00
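A hedged sketch of the API shape the bullets above describe, written with the later stabilized Akka Typed DSL for readability (the typed API changed a lot in this period, so the names are illustrative): sharding takes a String => Behavior factory so the behavior can use its entityId.

```scala
import akka.actor.typed.Behavior
import akka.actor.typed.scaladsl.Behaviors

sealed trait Command
final case class Add(amount: Int) extends Command

// the factory receives the entityId, which the handler usually needs
def entityBehavior(entityId: String): Behavior[Command] =
  Behaviors.receiveMessage { case Add(amount) =>
    println(s"entity $entityId received $amount")
    Behaviors.same
  }
```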
Patrik Nordwall
d30464c452 Reply to GetShardHome requests after rebalance, #24191
* Some GetShardHome requests were ignored (by design) during
  rebalance and they would be retried later.
* This optimization keeps track of such requests and replies
  to them immediately after the rebalance has completed, so
  the buffered messages in the region don't have to wait for
  the next retry tick.
* use regionTerminationInProgress also during the update, since
  not all GetShardHome requests are stashed
2018-01-09 20:12:45 +01:00
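A minimal sketch of the bookkeeping this commit describes; the names are illustrative, not the coordinator's actual internals:

```scala
import akka.actor.ActorRef

final case class ShardHome(shard: String, region: ActorRef) // the coordinator's reply

// shard -> regions whose GetShardHome arrived while that shard was rebalancing
var pendingGetShardHome = Map.empty[String, Set[ActorRef]]

def rememberRequest(shard: String, requester: ActorRef): Unit =
  pendingGetShardHome = pendingGetShardHome.updated(
    shard, pendingGetShardHome.getOrElse(shard, Set.empty) + requester)

def rebalanceCompleted(shard: String, newRegion: ActorRef): Unit = {
  // reply right away instead of waiting for the requester's next retry tick
  pendingGetShardHome.getOrElse(shard, Set.empty).foreach(_ ! ShardHome(shard, newRegion))
  pendingGetShardHome -= shard
}
```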
Patrik Nordwall
0eedd714e8 reply to known shard locations immediately when waitingForUpdate, #24064 2018-01-09 13:51:35 +01:00
Christopher Batey
009214ae07 Update copyright to 2018 (#24241) 2018-01-04 17:26:29 +00:00
Johan Andrén
582f6a4836 Revert source incompatible sharding changes (#24126)
* Revert "fix entityPropsFactory id param, #21809"
This reverts commit cd7eae28f6.
* Revert "Merge pull request #24058 from talpr/talpr-24053-add-entity-id-to-sharding-props"
This reverts commit 8417e70460, reversing
changes made to 22e85f869d.
2017-12-07 17:49:29 +01:00
Patrik Nordwall
cd7eae28f6 fix entityPropsFactory id param, #21809 2017-12-07 13:17:04 +01:00
Patrik Nordwall
8417e70460 Merge pull request #24058 from talpr/talpr-24053-add-entity-id-to-sharding-props
Add entity id to sharding props (#24053)
2017-12-06 07:35:28 +01:00
Tal Pressman
a8e5f48f36 add entity id to sharding props (#24053) 2017-12-05 16:49:05 +02:00
Christopher Batey
76b2cfa676 Add common multi jvm config to cluster tests (#23974) 2017-12-04 15:23:55 +01:00
Christopher Batey
99511a0027 Fix race in ClusterShardingFailureSpec
AFAICT there was nothing ensuring the order of messages sent to the
shard and the region, so first check that the passivation has happened
before sending another add in the test

Refs #24013
2017-11-20 16:35:47 +00:00
Christopher Batey
1eb3abb27e Fix lookup of coordinator for sharding proxies (#23995) 2017-11-15 13:03:48 +00:00
Christopher Batey
83a97256cc Turn on gossip logging for flaky test + improve test error msg (#23868) 2017-10-30 14:52:58 +01:00
Patrik Nordwall
6bfb7c9262 increase timeout in MultiDcSplitBrainSpec
* due to handshake timeout
* reduce handshake timeout
* fourth might generate UnreachableDataCenter in unsplit
* MultiDcClusterSharding
2017-08-31 10:26:23 +02:00
Johan Andrén
9c7e8d027a Renamed/moved the self data center setting #23312 (#23344) 2017-07-12 11:47:32 +01:00
Patrik Nordwall
87d74f1510 Docs for multi-DC features 2017-07-07 16:55:22 +02:00
Patrik Nordwall
bb9549263e Rename team to data center, #23275 2017-07-04 17:11:21 +02:00
Patrik Nordwall
e0fe0bc49e Make cluster sharding DC aware, #23231
* Sharding only within own team (coordinator is singleton)
* the ddata Replicator used by Sharding must also be only within own team
* added support for Set of roles in ddata Replicator so that it can be
  used by sharding to specify role + team
* Sharding proxy can route to sharding in another team
2017-07-04 15:04:43 +02:00
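A hedged configuration sketch of the resulting topology, using the setting name from the later team → data center rename (the commits above); the values are illustrative:

```scala
import com.typesafe.config.ConfigFactory

// sharding (and its ddata Replicator) stays within the node's own DC;
// a sharding proxy can still route to sharding in another DC
val config = ConfigFactory.parseString("""
  akka.cluster.multi-data-center.self-data-center = "dc-east"
  akka.cluster.sharding.role = "sharded"
""")
```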
Patrik Nordwall
9835c08779 harden ClusterShardingRememberEntitiesSpecNewExtractorSpec 2017-05-23 13:44:24 +02:00
Patrik Nordwall
bf6cd6d2c7 harden another case in ClusterShardingSpec, #23006
* due to the new StartEntity message the start is not
  as instant as it used to be and therefore the test must
  retry this check
2017-05-23 07:24:04 +02:00
Patrik Nordwall
99a044b472 fix remember entities tests, #22994 2017-05-22 14:58:41 +02:00
Johan Andrén
86aa42cf6c remember entities and changing shardIdExtractor (#22894)
* Test case covering changing shard id extractor with remember-entities

* This should do the trick

* Feedback addressed

* Docs and migration guide mention

* Correct logic to persist that entity has moved off of shard
2017-05-22 10:08:18 +02:00
Patrik Nordwall
32e6a59363 Start shards after full cluster restart, #22868
* when using remember entities with ddata mode the set of
  shards was not saved in durable storage and therefore the
  remembered entities were not loaded until the first message
  was sent to the shard
* the coordinator stores the set of shards in a durable GSet
* loaded when the coordinator is started and added to the State,
  rest is already taken care of via the unallocatedShards Set in
  the State
* when new shards are allocated the durable GSet is updated if it
  doesn't already contain the shard identifier
2017-05-19 14:47:35 +02:00
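A minimal sketch of storing the shard set in a durable ddata GSet as described; the key name and write consistency are illustrative, and durability itself comes from listing the key under akka.cluster.distributed-data.durable.keys:

```scala
import scala.concurrent.duration._
import akka.actor.ActorRef
import akka.cluster.ddata.{ GSet, GSetKey }
import akka.cluster.ddata.Replicator.{ Update, WriteMajority }

val AllShardsKey = GSetKey[String]("all-shards") // must match a durable.keys entry

def rememberShard(replicator: ActorRef, shardId: String): Unit =
  replicator ! Update(AllShardsKey, GSet.empty[String], WriteMajority(5.seconds)) { shards =>
    // only grow the set when the shard identifier is new
    if (shards.contains(shardId)) shards else shards + shardId
  }
```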
Patrik Nordwall
3ab101039f Lazy init of LmdbDurableStore, #22759 (#22779)
* Lazy init of LmdbDurableStore, #22759

* to avoid creating files (and initializing db) when not needed,
  e.g. cluster sharding that is not using remember entities
* enable MiMa against 2.5.0

* use OptionVal instead
2017-04-28 15:12:14 +02:00
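The "use OptionVal instead" bullet refers to Akka's internal allocation-free Option. A sketch of the lazy-init pattern, assuming a placeholder Db type:

```scala
import akka.util.OptionVal

object LazyStoreSketch {
  trait Db { def close(): Unit }
  private def openDb(): Db = sys.error("placeholder: creates files / opens LMDB")

  private var db: OptionVal[Db] = OptionVal.None

  // nothing touches the filesystem until the store is first used
  def getDb(): Db = db match {
    case OptionVal.Some(d) => d
    case _ =>
      val d = openDb()
      db = OptionVal.Some(d)
      d
  }
}
```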
Patrik Nordwall
b45a254685 use minCap for majority write/read in sharding, #22141
* also added some docs about the feature since that was missing
2017-01-24 16:41:18 +01:00
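For context, minCap is a parameter of the ddata majority consistency levels: the required node count is the simple majority, but never less than minCap (capped at cluster size). The values below are illustrative:

```scala
import scala.concurrent.duration._
import akka.cluster.ddata.Replicator.{ ReadMajority, WriteMajority }

// with minCap = 5: a 3-node cluster involves all 3 nodes,
// a 7-node cluster 5 (instead of 4), a 20-node cluster the plain majority, 11
val writeMajority = WriteMajority(timeout = 5.seconds, minCap = 5)
val readMajority  = ReadMajority(timeout = 5.seconds, minCap = 5)
```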
Patrik Nordwall
37679d307e rememberingEntities with ddata mode, #22154
* one Replicator per configured role
* log LMDB directory at startup
* clarify the importance of the LMDB directory
* use more than one key to support many entities
2017-01-23 11:57:52 +01:00
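A hedged configuration sketch of remember entities in ddata mode as it landed in 2.5; the directory path is illustrative:

```scala
import com.typesafe.config.ConfigFactory

val config = ConfigFactory.parseString("""
  akka.cluster.sharding {
    remember-entities = on
    state-store-mode = ddata
  }
  # the LMDB directory whose importance the commit clarifies
  akka.cluster.distributed-data.durable.lmdb.dir = "target/ddata"
""")
```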
Patrik Nordwall
452b3f1406 remove old deprecated cluster metrics, #21423
* the corresponding functionality was moved to akka-cluster-metrics, see
  http://doc.akka.io/docs/akka/2.4/project/migration-guide-2.3.x-2.4.x.html#New_Cluster_Metrics_Extension
2017-01-20 13:48:36 +01:00
Patrik Nordwall
84ade6fdc3 add CoordinatedShutdown, #21537
* CoordinatedShutdown that can run tasks for configured phases in order (DAG)
* coordinate handover/shutdown of singleton with cluster exiting/shutdown
* phase config obj with depends-on list
* integrate graceful leaving of sharding in coordinated shutdown
* add timeout and recover
* add some missing artery ports to tests
* leave via CoordinatedShutdown.run
* optionally exit-jvm in last phase
* run via jvm shutdown hook
* send ExitingConfirmed to leader before shutdown of Exiting
  to not have to wait for failure detector to mark it as
  unreachable before removing
* the unreachable signal is still kept as a safe guard if
  message is lost or leader dies
* PhaseClusterExiting vs MemberExited in ClusterSingletonManager
* terminate the ActorSystem when the cluster is shut down (via Down)
* add more predefined and custom phases
* reference documentation
* migration guide
* problem when the leader order was sys2, sys1, sys3:
  sys3 could not perform its duties and move Leaving sys1 to
  Exiting because it was observing sys1 as unreachable
* exclude Leaving with exitingConfirmed from the convergence condition
2017-01-16 09:01:57 +01:00
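A minimal sketch of the resulting user-facing API (phase constant and task signature per the Akka 2.5 API; the task body is illustrative):

```scala
import scala.concurrent.Future
import akka.Done
import akka.actor.{ ActorSystem, CoordinatedShutdown }

val system = ActorSystem("example")

// tasks are grouped into configured phases that form a DAG (via depends-on)
CoordinatedShutdown(system).addTask(
  CoordinatedShutdown.PhaseBeforeServiceUnbind, "log-shutdown") { () =>
  system.log.info("shutting down")
  Future.successful(Done)
}

// run all phases in order, e.g. when leaving the cluster
CoordinatedShutdown(system).run()
```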
Philippus Baalman
6c7085252a extended copyright into 2017 2017-01-04 17:37:15 +01:00
Patrik Nordwall
d0053746df fix regression in remember entities, #21892
* regression was introduced by 141318e60a
  in 2.4.12
2016-11-28 14:30:34 +01:00
Patrik Nordwall
48e85953d9 harden ClusterShardingSpec, #21535 2016-11-08 14:01:23 +01:00
Patrik Nordwall
1bb8f1737f increase barrier-timeout in ClusterShardingSpec, #21718
* In the logs of the failing test we can see that the first node is removed
  as expected and then comes back into the membership, which is possible in
  case of a conflicting membership state merge. It is supposed to be
  removed again by auto-down. That doesn't happen within the barrier-timeout.
2016-11-08 13:37:37 +01:00
Patrik Nordwall
141318e60a shard coordinator should wait until min-members regions registered, #21194 2016-10-28 15:49:21 +02:00
Patrik Nordwall
8ab02738b7 Merge branch 'master' into wip-sync-artery-dev-2.4.9-patriknw 2016-08-23 20:14:15 +02:00
Peter Barron
1f9c374bd9 Cluster Sharding with remember-entity enabled fails to recover after restart #20744 2016-08-01 10:46:09 +02:00
Johan Andrén
d6c048f59a A simpler ActorRefProvider config #20649 (#20767)
* Provide shorter aliases for the ActorRefProviders #20649
* Use the new ActorRefProvider aliases throughout code and docs
* Cleaner alias replacement logic
2016-06-10 15:04:13 +02:00
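The aliases in question: local, remote, and cluster replace the fully qualified provider class names.

```scala
import com.typesafe.config.ConfigFactory

// before: akka.actor.provider = "akka.cluster.ClusterActorRefProvider"
val config = ConfigFactory.parseString("""
  akka.actor.provider = cluster  # or: local, remote
""")
```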
Björn Antonsson
c66ce62d63 Update to a working version of Scalariform 2016-06-02 22:12:36 +02:00
Patrik Nordwall
9f32b77bde increase timeout for setting up SharedLeveldbStore, #20056
* SharedLeveldbStore is opening leveldb in preStart so that might
  sometimes take more than 3 seconds, I guess
* the test looks correct
2016-05-10 15:02:52 +02:00
Roland Kuhn
7cf99134dc catch ActorCell creation failures for top-level actors #15947
Previously a failure during e.g. MailboxType.create() would make the
user guardian fail, tearing down the whole system as a result. The cause
is a deep bug in handling ActorCell creation that we cannot really fix
anymore due to resulting changes in semantics, hence this fix only
targets top-level actors (where the observable difference is an
unambiguous improvement).

fixes #15947
2016-03-17 11:04:52 +01:00
Johan Andrén
854d5b0c09 Stabilization of ClusterShardingGetStatsSpec, fix for #19863 2016-03-09 16:24:50 +01:00
James Mulcahy
48ecd9d7d5 Fix for sharding GetClusterStats #19601 2016-02-23 17:12:17 +01:00