Skip to content

ci: parallelize merge-tree mocha tests (skip in perf mode)#26657

Open
frankmueller-msft wants to merge 2 commits intomicrosoft:mainfrom
frankmueller-msft:ci/parallel-merge-tree-tests-v2
Open

ci: parallelize merge-tree mocha tests (skip in perf mode)#26657
frankmueller-msft wants to merge 2 commits intomicrosoft:mainfrom
frankmueller-msft:ci/parallel-merge-tree-tests-v2

Conversation

@frankmueller-msft
Copy link
Contributor

@frankmueller-msft frankmueller-msft commented Mar 5, 2026

Summary

Re-applies the parallelization from #26624 (reverted in #26653) with a fix for the Performance Benchmarks pipeline failure.

  • Enables parallel: true with jobs: 4 for merge-tree mocha tests, except when running in perf mode (--perfMode)
  • Perf mode is excluded because Mocha's parallel worker serialization strips methods from Hook objects (like hook.error()), which the benchmark reporter depends on. Sequential execution is also needed for measurement quality.
  • Bumps timeouts in applyStashedOpFarm and reconnectFarm to match other farm tests

Changes

  • packages/dds/merge-tree/.mocharc.cjs — conditional parallel config
  • packages/dds/merge-tree/src/test/client.applyStashedOpFarm.spec.ts — timeout bump
  • packages/dds/merge-tree/src/test/client.reconnectFarm.spec.ts — timeout bump

Verification

Without fix (parallel enabled unconditionally) — perf tests fail with the reported error:

TypeError: hook.error is not a function
Test Uncaught error outside test suite failed with error:  TypeError: hook.error is not a function

With fix (--perfMode guard) — both pass:

  • npm run perf — all benchmarks pass (sequential mode, benchmark reporter works)
  • npm run test:mocha — 1555 passing, 2 pending (parallel mode, ~2 min)

Test plan

  • Reproduce hook.error is not a function failure without the fix
  • npm run perf passes with the fix (sequential mode)
  • npm run test:mocha passes with the fix (parallel mode)
  • Performance Benchmarks pipeline passes in CI

🤖 Generated with Claude Code

Re-applies the parallelization from microsoft#26624 which was reverted in microsoft#26653
because it broke the Performance Benchmarks pipeline.

The fix: make parallel execution conditional on not running in perf mode.
Mocha serializes Hook objects across worker processes in parallel mode,
which strips methods like hook.error() that the benchmark reporter relies
on. Perf tests also need sequential execution for measurement quality.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 5, 2026 20:47
@frankmueller-msft frankmueller-msft requested a review from a team as a code owner March 5, 2026 20:47
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Re-introduces Mocha parallelization for merge-tree tests in CI while disabling it for perf mode to avoid benchmark reporter issues, and adjusts farm test timeouts for parallel execution stability.

Changes:

  • Conditionally enables Mocha parallel mode with jobs: 4 when not running with --perfMode.
  • Increases timeouts for applyStashedOpFarm and reconnectFarm test suites.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
packages/dds/merge-tree/.mocharc.cjs Enables conditional Mocha parallelization and sets a fixed job count for CI.
packages/dds/merge-tree/src/test/client.applyStashedOpFarm.spec.ts Increases suite timeout to accommodate parallel execution overhead.
packages/dds/merge-tree/src/test/client.reconnectFarm.spec.ts Increases suite timeout to accommodate parallel execution overhead.

Comment on lines +123 to +125
// Matches the timeout convention used by other farm tests (conflictFarm, obliterateFarm,
// rollbackFarm). Tests complete in ~15-20s normally, but need headroom under parallel execution.
}).timeout(30 * 10000);
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

30 * 10000 sets the timeout to 300,000ms (5 minutes), which doesn’t align with the comment’s stated need for modest headroom (tests ~15–20s). This looks like a multiplier typo (likely meant 30 * 1000, or another explicit ms value). Consider setting an intentional timeout (e.g., 60 * 1000) and/or extracting a named constant to avoid unit mistakes.

Copilot uses AI. Check for mistakes.
Comment on lines +212 to +214
// Matches the timeout convention used by other farm tests (conflictFarm, obliterateFarm,
// rollbackFarm). Tests complete in ~15-20s normally, but need headroom under parallel execution.
}).timeout(30 * 10000);
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

30 * 10000 sets a 5 minute timeout, which is much larger than the described headroom and likely unintentional. Use an explicit intended value in milliseconds (and consider a shared constant with other farm tests) to prevent silent CI slowdowns and masking hung tests.

Copilot uses AI. Check for mistakes.
Comment on lines +11 to +18
// Parallelize tests to speed up CI, but not in perf mode where parallel execution
// breaks the benchmark reporter and degrades measurement quality.
if (!process.argv.includes("--perfMode")) {
config.parallel = true;
// Keep job count limited to avoid excessive memory use and thread overhead in CI.
// 4 jobs was measured to give most of the speed up while not slowing down CI.
config.jobs = 4;
}
Copy link

Copilot AI Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

process.argv.includes(\"--perfMode\") only matches the exact token; it won’t detect common variants like --perfMode=true / --perfMode=1. If those forms can occur in your pipelines/scripts, parallel mode would be enabled unexpectedly. Consider checking for any arg that equals --perfMode or starts with --perfMode= (or using a small arg parser) to make the gating robust.

Copilot uses AI. Check for mistakes.
@frankmueller-msft
Copy link
Contributor Author

Addressing Copilot's review:

  1. Timeout 30 * 10000 (300s) — Not a typo. This matches the existing convention used by all other farm tests in this directory (conflictFarm, obliterateFarm, rollbackFarm, replay). These are long-running fuzz tests that need generous timeouts.

  2. --perfMode variants--perfMode is a custom mocha flag used consistently as a bare flag across the entire repo (20+ scripts). It's never passed as --perfMode=true or --perfMode=1. No additional parsing needed.

CraigMacomber
CraigMacomber previously approved these changes Mar 5, 2026
config.parallel = true;
// Keep job count limited to avoid excessive memory use and thread overhead in CI.
// 4 jobs was measured to give most of the speed up while not slowing down CI.
config.jobs = 4;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the CI test run, there was a timeout of an unrelated test: https://dev.azure.com/fluidframework/public/_build/results?buildId=382483&view=ms.vss-test-web.build-test-results-tab&runId=7473433&resultId=100000&paneView=debug

Its possible you just got unlucky, but this suggests we might be increasing the risk of a timeout by overloading the CPU.

Since with this change the unit tests on CI are no longer the long poll, we are getting more speed up that we need at the moment, so it might be safer to go with a smaller thread count here and still get all or most of the benefit.

Maybe try 2 for a conservative starting point, then run the CI test pipeline a few times and see if it runs without any issues.

@CraigMacomber CraigMacomber dismissed their stale review March 5, 2026 21:57

Failing tests indicate this likely makes test run more flakey.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants