config: persist default store limit for future stores#10900
Conversation
|
Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
📝 WalkthroughWalkthroughThis PR persists schedule-level default store limits, updates store-limit read/write paths to use them, and adds tests for reload, restart, and future-store initialization behavior. ChangesPersisted default store limit flow
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Hi @King-Dylan. Thanks for your PR. I'm waiting for a tikv member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
b2c0925 to
62c341f
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/schedule/config/config.go`:
- Around line 633-641: The GetDefaultStoreLimit method treats any zero value as
unset and replaces it with process defaults, which breaks persistence of
explicitly configured zero values across restart. Replace the zero-value checks
for AddPeer and RemovePeer fields with a proper mechanism to distinguish between
explicitly set zero values and truly unset values (such as using pointers to the
fields or a different storage representation), so that persisted zero values are
respected and not overwritten by process defaults.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: cfc073c5-d240-4769-8674-03ba34dcad72
📒 Files selected for processing (7)
pkg/mcs/scheduling/server/config/config.gopkg/mcs/scheduling/server/config/config_test.gopkg/schedule/config/config.goserver/cluster/cluster.goserver/cluster/cluster_test.goserver/config/config_test.goserver/config/persist_options.go
62c341f to
5e152da
Compare
Close tikv#10898 Signed-off-by: King-Dylan <702299521@qq.com>
5e152da to
6f224a3
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/schedule/config/config.go`:
- Around line 472-475: The migration logic in config.go currently lets
StoreBalanceRate overwrite an explicitly configured
DefaultStoreLimit/DefaultStoreLimitConfig, which clobbers user-specified
add-peer/remove-peer values. Update the backfill path around StoreBalanceRate
handling so it only populates DefaultStoreLimit when the corresponding
default-store-limit fields are unset, or add explicit validation to reject mixed
configuration; use the existing config migration logic in the affected blocks
near the StoreBalanceRate assignments and the
DefaultStoreLimit/DefaultStoreLimitConfig fields to preserve explicit values.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 2749e8c6-be72-4a19-b949-c6a6e8ff6353
📒 Files selected for processing (7)
pkg/mcs/scheduling/server/config/config.gopkg/mcs/scheduling/server/config/config_test.gopkg/schedule/config/config.goserver/cluster/cluster.goserver/cluster/cluster_test.goserver/config/config_test.goserver/config/persist_options.go
🚧 Files skipped from review as they are similar to previous changes (4)
- server/config/persist_options.go
- pkg/mcs/scheduling/server/config/config.go
- server/cluster/cluster.go
- server/cluster/cluster_test.go
| if c.StoreBalanceRate != 0 { | ||
| DefaultStoreLimit = StoreLimit{AddPeer: c.StoreBalanceRate, RemovePeer: c.StoreBalanceRate} | ||
| c.DefaultStoreLimit = StoreLimitConfig{AddPeer: c.StoreBalanceRate, RemovePeer: c.StoreBalanceRate} | ||
| c.StoreBalanceRate = 0 |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
Don't let store-balance-rate clobber explicit default-store-limit.
If both are present during migration, these assignments overwrite user-specified default-store-limit.add-peer/remove-peer with the deprecated symmetric rate. That makes the new persisted default impossible to roll out incrementally and loses the more specific config. Only backfill from StoreBalanceRate when the corresponding default-store-limit field was not set, or reject the mixed configuration explicitly.
Also applies to: 553-556
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/schedule/config/config.go` around lines 472 - 475, The migration logic in
config.go currently lets StoreBalanceRate overwrite an explicitly configured
DefaultStoreLimit/DefaultStoreLimitConfig, which clobbers user-specified
add-peer/remove-peer values. Update the backfill path around StoreBalanceRate
handling so it only populates DefaultStoreLimit when the corresponding
default-store-limit fields are unset, or add explicit validation to reject mixed
configuration; use the existing config migration logic in the affected blocks
near the StoreBalanceRate assignments and the
DefaultStoreLimit/DefaultStoreLimitConfig fields to preserve explicit values.
What problem does this PR solve?
Issue Number: close #10898
Persist the default store limit used for future stores. Today
/pd/api/v1/stores/limitupdates existing stores and mutates the process-local default store limit, but the default for future stores is not persisted. After PD/API/scheduling service restart or leader/primary switch, newly joined TiKV stores can fall back to the built-in default value.What is changed and how does it work?
schedule.default-store-limittoScheduleConfig.SetAllStoresLimitto persist the default limit as well as updating existing per-store entries.schedule.default-store-limitbefore falling back to the process default.schedule.store-limit[storeID]as the highest-priority override.PersistConfig.The resulting priority is:
schedule.store-limit[storeID]schedule.default-store-limitCheck List
Tests
Also attempted the existing MCS integration store-limit test:
It failed during build because dashboard embedded assets were not generated in this checkout:
Code changes
Side effects
Summary by CodeRabbit
Release Notes
store-balance-rateis now reflected into per-schedule default store limits to prevent unintended downstream changes.