Skip to content

New flag --shard-by-suite#2655

Open
sigurdm wants to merge 20 commits into
masterfrom
shard-by-file-docs-and-tests
Open

New flag --shard-by-suite#2655
sigurdm wants to merge 20 commits into
masterfrom
shard-by-file-docs-and-tests

Conversation

@sigurdm

@sigurdm sigurdm commented May 26, 2026

Copy link
Copy Markdown
Contributor

Running all setupAll hooks is expensive when what we are trying to do is to minimize the time spent for each shard.

This allows choosing a less fine-grained strategy of sharding by individual test files.

Fixes #2614

@sigurdm sigurdm requested a review from a team as a code owner May 26, 2026 07:57
@github-actions

github-actions Bot commented May 26, 2026

Copy link
Copy Markdown

PR Health

Changelog Entry ✔️
Package Changed Files

Changes to files need to be accounted for in their respective changelogs.

This check can be disabled by tagging the PR with skip-changelog-check.

Comment thread pkgs/test/test/runner/runner_test.dart Outdated
Comment thread pkgs/test/README.md Outdated
Comment thread pkgs/test_core/lib/src/runner.dart Outdated
Comment thread pkgs/test_core/lib/src/runner.dart Outdated
@jakemac53

Copy link
Copy Markdown
Contributor

Meta comment, but we call test "files" test "suites". So it might be best to make this --shard-by=suite. Although it may not be as intuitive for users, but it would match our general terminology.

I am curious what @natebosch thinks though.

Comment thread pkgs/test/test/runner/shard_test.dart
Comment thread pkgs/test/test/runner/shard_test.dart
Comment thread pkgs/test/test/utils.dart Outdated
Comment thread pkgs/test_core/lib/src/runner.dart Outdated
Comment thread pkgs/test_core/lib/src/runner.dart Outdated
@sigurdm sigurdm changed the title Allow --shard-by=file Allow --shard-by-file Jun 11, 2026
@sigurdm

sigurdm commented Jun 11, 2026

Copy link
Copy Markdown
Contributor Author

Meta comment, but we call test "files" test "suites". So it might be best to make this --shard-by=suite. Although it may not be as intuitive for users, but it would match our general terminology.

I think as this is raw sharding by file, I like to keep the terminology there. Also I don't really see a reason for package:test to invent its own name for a file. I added a parenthetical explanation in the help-string and readme for people who know what a suite is, but don't know what a file is.

@jakemac53

Copy link
Copy Markdown
Contributor

I think as this is raw sharding by file, I like to keep the terminology there. Also I don't really see a reason for package:test to invent its own name for a file. I added a parenthetical explanation in the help-string and readme for people who know what a suite is, but don't know what a file is.

I don't disagree with you but the decision was made a long time ago to call test files suites. I couldn't tell you why. This does leak out to users via --suite-load-timeout, but I don't see a lot of other places. cc @natebosch for thoughts on this

@natebosch

natebosch commented Jun 25, 2026

Copy link
Copy Markdown
Member

Sorry for the slow response. I spent some time thinking through alternative sharding designs that can take into account at least some of the differences between test suites but I came up short. I think I lean towards landing something very close to this design. Some thoughts:

RE --shard-by-file vs --shard-by-suite

I think as this is raw sharding by file

With the "file" naming, as a user I might expect that a pattern like all_tests_test.dart which imports and runs the main of a bunch files with test cases would still shard based on which file the test() is actually written in. "Suite" might cause a reader to need to look for a definition of the term, but is unlikely to result in an incorrect interpretation like "file" might.

RE shard before or after reading top level @TestOn annotations:

If we do top-level @TestOn, tags, it seems to me asymetrical that we don't look at the single-test annotations

It feels quite natural to me that we would honor suite-level configuration when we are doing suite-level sharding. We should expect that test suites which are entirely configured for one platform would do that at the suite level instead of the test or group level

The fact that we do sorting by file path could make a situation where a big run of tests for a single platform end up in a single shard that ends up doing no testing because they were all filtered out. We should maybe consider doing a stable shuffle then sublists, or sort and round-robin distribute suites to shards so that similarly named tests are more likely to end up on different shards.

Comment thread pkgs/test/test/runner/runner_test.dart Outdated
Comment thread pkgs/test/README.md Outdated
split across shards and helps maximize the re-use of `setUpAll` and
`tearDownAll` setups.
* When sharding by file (using `--shard-by-file`): Distribute entire test files (suites)
across shards. This can be faster for suites with many small files as it

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nit] This usage of "suites" is wrong - replace with maybe "packages" or "test invocations"?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Comment thread pkgs/test/README.md Outdated
Comment thread pkgs/test/README.md Outdated
Comment on lines +227 to +237
### Interaction with Test Filters

The sharding modes interact differently with filters like `--name` or `--tags`:

* When sharding by test (default), the sharding partition is calculated *after*
applying filters. This guarantees that the matching tests are distributed
as evenly as possible across all shards.
* When sharding by file (using `--shard-by-file`), the files are partitioned
*before* they are loaded and filtered. If a filter only matches tests in a
few files, some shards might run no tests because those files were allocated
to other shards.

@natebosch natebosch Jun 25, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should make top level test annotations work for this strategy.

I think the way we should talk about the test-case level filtering is a little bit more general. Something maybe like

Test filters which apply to individual test cases or groups (as opposed to suite level annotations at the top of a file) will be reflected in the distribution when sharding by test case, but not when sharding by test suite. Test invocations with uneven number of tests cases to run between test suites can cause uneven workloads across the shards.

@natebosch

Copy link
Copy Markdown
Member

I had gemini take a try at handling the filtering before sharding. At a skim the solution it found looks fine to me - add a cache of the MetaData to the Loader so that we can read metadata earlier than we used to without doubling up. The Runner where the sharding is implemented already has a reference to the Loader so it's feasible to introduce suite filtering eagerly when using suite sharding.

@sigurdm

sigurdm commented Jun 26, 2026

Copy link
Copy Markdown
Contributor Author

Fair, I can try to give it a go next week to have top-level filtering before splitting.

@sigurdm sigurdm changed the title Allow --shard-by-file New flag --shard-by-suite Jun 29, 2026

@sigurdm sigurdm left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now suites are filtered by top-level annotations before being sharded.
The flag is now --shard-by-suite, and the phrasing has been updated everywhere.

Comment thread pkgs/test/test/runner/runner_test.dart Outdated
Comment thread pkgs/test/test/runner/shard_test.dart
@sigurdm sigurdm requested review from jakemac53 and natebosch July 2, 2026 08:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Enable dart test to shard on files, not individual tests

4 participants