chore: ratchet test262 baseline after async/promise cluster fixes #469
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4d5b5b4b44
  },
  "combined": {
-   "passed_min": 45990
+   "passed_min": 53550
Raise the combined baseline above the per-mode floor
With these new per-mode minima, any run that passes both mode gates must have at least 27,650 + 26,900 = 54,550 passes, but this combined threshold is 53,550. The regression workflow checks the mode thresholds first and then compares total_passed to baseline["combined"]["passed_min"] (.github/workflows/test262.yml lines 489-514), so the combined gate can no longer fail independently and will report a +1,000 combined delta even when both modes only meet their minima. If the combined row is intended to guard aggregate regressions, set it to at least 54,550 or remove the redundant combined baseline.
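The arithmetic behind this comment can be checked with a minimal sketch. It is not the workflow's actual code: the dictionary layout mirrors only the `"combined"` / `"passed_min"` keys visible in the diff, the mode name `sloppy` is an assumption (only `strict` is named elsewhere in this PR), the helper `combined_slack` is hypothetical, and the sketch assumes `total_passed` is the sum of the per-mode pass counts.

```python
def combined_slack(baseline: dict) -> int:
    """Return how far the combined floor sits below the sum of per-mode floors.

    A positive result means any run that clears every mode gate already
    clears the combined gate with that many passes to spare.
    """
    mode_floor_sum = sum(
        entry["passed_min"]
        for name, entry in baseline.items()
        if name != "combined"
    )
    return mode_floor_sum - baseline["combined"]["passed_min"]


# Values quoted in the review comment; "sloppy" is an assumed mode name.
baseline = {
    "sloppy": {"passed_min": 27650},
    "strict": {"passed_min": 26900},
    "combined": {"passed_min": 53550},
}

print(combined_slack(baseline))  # 1000
```

With the values from the comment, the slack is 1,000, which matches the "+1,000 combined delta" the reviewer describes.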
4d5b5b4 to 4647c01
* chore: add set-baseline.py + AGENTS.md rules for baseline calibration

  - scripts/set-baseline.py: reads passed counts from the latest successful main-branch CI run (never PR-branch) and subtracts a configurable buffer (default 100) to compute passed_min. Replaces the error-prone manual step that caused the baseline miscalibration in PR #469 (strict set to 26,900 based on a PR-branch outlier; main showed ~25,977).
  - AGENTS.md: two new Test262 Tool Boundaries rules: use set-baseline.py for baseline updates; verify CI claims in compacted summaries from the raw log before investigating.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: fix set-baseline.py portability and AGENTS.md trigger condition

  - Derive repo from gh repo view instead of hardcoding dowdiness/js_engine
  - Remove --pattern flag from gh run download (broke file lookup on some runs)
  - Add "when to run" condition to AGENTS.md rule (after batches recovering >100 tests)

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
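The buffer rule described in the first commit can be illustrated with a short sketch. This is not the contents of scripts/set-baseline.py: the function name `compute_passed_min`, the clamp at zero, and the input validation are assumptions added for illustration. Only the default buffer of 100 and the pass counts come from the commit message.

```python
def compute_passed_min(passed: int, buffer: int = 100) -> int:
    """Derive a regression floor from an observed pass count.

    The floor is the observed count minus a safety buffer, clamped at zero
    (the clamp is an assumption, not something the commit message states).
    """
    if passed < 0 or buffer < 0:
        raise ValueError("passed and buffer must be non-negative")
    return max(passed - buffer, 0)


# ~25,977 is the main-branch strict count quoted in the commit message.
main_strict_passed = 25977
floor = compute_passed_min(main_strict_passed)
print(floor)  # 25877

# The manually chosen floor of 26,900 would have rejected that same main run.
print(main_strict_passed >= 26900)  # False
print(main_strict_passed >= floor)  # True
```

Deriving the floor from a main-branch count keeps it below what main actually achieves, which is the miscalibration the commit message says the script is meant to prevent.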
Summary
- Ratchet passed_min floors to reflect test262 improvements from PRs #462 (fix(array): respect deletion of Array.prototype[Symbol.iterator]) through #468 (fix(async): parameter TDZ, sloppy this, arrow arguments, mapped arguments), the async/Promise cluster fixes

Test plan
test262-summary.json combined report)

🤖 Generated with Claude Code