Skip to content

test: stabilize flaky TestResourceManagerClientTestSuite/standalone-resource-manager-with-client-discovery/TestBasicResourceGroupCURD#10935

Open
flaky-claw wants to merge 1 commit into
tikv:masterfrom
flaky-claw:flakyfixer/case_cc8428ab3e74-a1
Open

test: stabilize flaky TestResourceManagerClientTestSuite/standalone-resource-manager-with-client-discovery/TestBasicResourceGroupCURD#10935
flaky-claw wants to merge 1 commit into
tikv:masterfrom
flaky-claw:flakyfixer/case_cc8428ab3e74-a1

Conversation

@flaky-claw

@flaky-claw flaky-claw commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

What problem does this PR solve?

Issue Number: close #10934

Problem Summary:
Flaky test TestResourceManagerClientTestSuite/standalone-resource-manager-with-client-discovery/TestBasicResourceGroupCURD in github.com/tikv/pd/tests/integrations/mcs/resourcemanager intermittently fails, so this PR stabilizes that path.

What changed and how does it work?

Root Cause

The test assumed synchronous standalone RM cache visibility after PD-handled resource group metadata writes/deletes.

Fix

Waiting for the cache to reflect accepted modify/delete operations matches the actual contract while preserving exact final assertions.

Verification

Spec:

  • target: github.com/tikv/pd/tests/integrations/mcs/resourcemanager :: TestResourceManagerClientTestSuite/standalone-resource-manager-with-client-discovery/TestBasicResourceGroupCURD
  • strategy: pd.issue_scoped.v1
  • plan mode: BASELINE_ONLY
  • requirements: required case must execute; no skip; repeat count = 1
  • baseline gates: required_flaky_gate, build_safety_gate, intent_guard_gate
  • feedback surface source: baseline_only

Observed result:

  • status: passed
  • required case executed: yes
  • submission decision: ALLOWED
  • note: Required flaky case executed during validation.
    Required flaky case was not skipped.
    target-flaky-corrected passed.
    build passed.

Gate checklist:

  • target-flaky-corrected: PASS
  • prompt-target-path: SKIPPED
  • build: PASS

Commands:

  • cd tests/integrations && go test -json -tags without_dashboard ./mcs/resourcemanager -run 'TestResourceManagerClientTestSuite/standalone-resource-manager-with-client-discovery/TestBasicResourceGroupCURD' -count=1
  • make build

Release note

None

Fixes #10934

Summary by CodeRabbit

  • Tests
    • Improved reliability of resource group tests with enhanced retry logic for asynchronous operations validation and error handling.

@ti-chi-bot ti-chi-bot Bot added the release-note-none Denotes a PR that doesn't merit a release note. label Jun 23, 2026
@ti-chi-bot

ti-chi-bot Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign husharp for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot Bot added contribution This PR is from a community contributor. dco-signoff: no Indicates the PR's author has not signed dco. needs-ok-to-test Indicates a PR created by contributors and need ORG member send '/ok-to-test' to start testing. labels Jun 23, 2026
@ti-chi-bot

ti-chi-bot Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Hi @flaky-claw. Thanks for your PR.

I'm waiting for a tikv member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot

ti-chi-bot Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Thanks for your pull request. Before we can look at it, you'll need to add a 'DCO signoff' to your commits.

📝 Please follow instructions in the contributing guide to update your commits with the DCO

Full details of the Developer Certificate of Origin can be found at developercertificate.org.

The list of commits missing DCO signoff:

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@ti-chi-bot ti-chi-bot Bot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Jun 23, 2026
@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3abc8f8f-11bf-4ac1-861e-3e25bcae89d3

📥 Commits

Reviewing files that changed from the base of the PR and between 17e9771 and 31f8e69.

📒 Files selected for processing (1)
  • tests/integrations/mcs/resourcemanager/resource_manager_test.go

📝 Walkthrough

Walkthrough

Two immediate assertions in the gRPC resource group CRUD test are replaced with Eventually-based retry loops. The post-modify read now uses reflect.DeepEqual to wait for the expected state, and the post-delete error check computes an expectedErr string and retries until the error matches. The reflect package import is added accordingly.

Changes

Fix flaky gRPC resource group CRUD test assertions

Layer / File(s) Summary
Convert immediate CRUD assertions to Eventually retries
tests/integrations/mcs/resourcemanager/resource_manager_test.go
Adds reflect import; replaces immediate GetResourceGroup + re.NoError assertion after modify with an Eventually loop using reflect.DeepEqual; replaces immediate post-delete error assertion with a computed expectedErr string and an Eventually loop that retries until the returned error matches.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Possibly related PRs

  • tikv/pd#10853: Both PRs fix flaky CRUD-delete verification by replacing immediate post-operation assertions with testutil.Eventually retry loops that wait for the expected not-found/error state.

Suggested labels

size/XS, lgtm, approved, dco-signoff: yes

Suggested reviewers

  • rleungx
  • okJiang
  • lhy1024

Poem

🐰 No more flaky hops through the test,
Eventually I'll wait for the best!
With reflect.DeepEqual in paw,
I retry 'til there's no flaw.
The CRUD groups line up just right — hooray! 🎉

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly describes the main change: stabilizing a specific flaky test case through implementation details.
Description check ✅ Passed The description covers problem statement, root cause analysis, fix implementation, and verification results, mostly following the template structure.
Linked Issues check ✅ Passed The PR directly addresses issue #10934 by adding explicit waits for cache synchronization to resolve timing/race conditions causing intermittent test failures.
Out of Scope Changes check ✅ Passed All changes are within scope: the test file modifications target the specific flaky test case mentioned in #10934 to address cache visibility and race condition issues.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@codecov

codecov Bot commented Jun 23, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 79.20%. Comparing base (17e9771) to head (31f8e69).

Additional details and impacted files
@@           Coverage Diff           @@
##           master   #10935   +/-   ##
=======================================
  Coverage   79.19%   79.20%           
=======================================
  Files         540      540           
  Lines       74847    74847           
=======================================
+ Hits        59276    59283    +7     
+ Misses      11394    11392    -2     
+ Partials     4177     4172    -5     
Flag Coverage Δ
unittests 79.20% <ø> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

contribution This PR is from a community contributor. dco-signoff: no Indicates the PR's author has not signed dco. needs-ok-to-test Indicates a PR created by contributors and need ORG member send '/ok-to-test' to start testing. release-note-none Denotes a PR that doesn't merit a release note. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

TestBasicResourceGroupCURD is flaky

1 participant