Per-PR reverse-dependency gate against downstream codes #278

Open. krystophny wants to merge 6 commits into main from feat/release-downstream-gate

krystophny (Member) commented May 28, 2026

Replaces the release-branch gate with a per-PR reverse-dependency gate.

On every libneo PR, dispatch each downstream's CI to build and fast-test against this PR's libneo commit (-f libneo_ref=<sha>), and gate the PR on green. This keeps libneo main continuously working with the downstream codes, and attributes any break to the PR that caused it.

  • .github/workflows/downstream-gate.yml: runs on pull_request (same-repo, non-draft). Dispatches, polls, and gates on each downstream's run. blocking=no rows are dispatched and reported as a warning but never fail the gate.
  • ci/downstreams: the gated set, with blocking and full columns. NEO-RT is report-only (transitive through NEO-2; its own build does not yet compile the candidate libneo). NEO-2 is full=no: it splits its CI, so the gate triggers only its fast unit-tests.yml and its slow suite stays on NEO-2 PRs.

The dispatched runs use each downstream's fast tests. Label the libneo PR full-ci to also dispatch the slow suite (golden records, performance, PAR) of each downstream whose full column is yes, with -f full=true.
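
A minimal sketch of one dispatch-and-gate step, assuming the gate drives the GitHub CLI. `run_wf` is the helper name used later in this PR, but the body below is an assumption, not the code in downstream-gate.yml, and `RUN_WF_SETTLE_SECONDS` is a hypothetical knob. Picking the newest workflow_dispatch run is a heuristic that can race with other dispatches of the same workflow.

```bash
#!/usr/bin/env bash
# Sketch only: dispatch one downstream workflow against a libneo commit,
# then block until that run finishes. Expects GH_TOKEN to hold a token
# with Actions read/write on the downstream repository.
run_wf() {
  local repo="$1" workflow="$2" sha="$3" full="${4:-false}"
  local -a fields=(-f "libneo_ref=${sha}")
  if [ "$full" = "true" ]; then
    fields+=(-f "full=true")
  fi

  # No --ref is passed, so gh dispatches on the repository's default branch.
  gh workflow run "$workflow" --repo "$repo" "${fields[@]}" || return 1

  # Give GitHub a moment to register the run (hypothetical knob).
  sleep "${RUN_WF_SETTLE_SECONDS:-10}"

  # Heuristic: take the newest workflow_dispatch run of this workflow.
  local run_id
  run_id="$(gh run list --repo "$repo" --workflow "$workflow" \
    --event workflow_dispatch --limit 1 \
    --json databaseId --jq '.[0].databaseId')" || return 1
  if [ -z "$run_id" ] || [ "$run_id" = "null" ]; then
    return 1
  fi

  # --exit-status makes gh return non-zero when the run fails.
  gh run watch "$run_id" --repo "$repo" --exit-status
}
```

Under these assumptions the fast tier would be `run_wf itpplasma/SIMPLE main.yml <sha>`, and the full-ci tier `run_wf itpplasma/SIMPLE main.yml <sha> true`.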

Required secret

The gate authenticates as secrets.RELEASE_BOT_TOKEN. The repo or org must define this secret as a fine-grained PAT or GitHub App installation token scoped to the six downstream repositories — itpplasma/SIMPLE, itpplasma/NEO-2, itpplasma/MEPHIT, itpplasma/KAMEL, itpplasma/NEO-RT, itpplasma/rabe — with these repository permissions:

  • Actions: read and write (dispatch workflow_dispatch and read the resulting runs)
  • Contents: read
  • Metadata: read

The default GITHUB_TOKEN cannot dispatch workflows in other repositories, so this dedicated token is mandatory.

Per-downstream dispatch workflows must land first

The gate dispatches each downstream's workflow on its default branch, so the workflow_dispatch trigger, with the libneo_ref input and (where full=yes) the full input, must already be merged to that downstream's default branch before this gate can dispatch it. Merge each downstream's CI-wiring PR before merging this one:

| Downstream | Workflow | Dispatch PR |
| --- | --- | --- |
| SIMPLE | main.yml | merged on main |
| NEO-2 | unit-tests.yml | itpplasma/NEO-2#87 |
| MEPHIT | main.yml | itpplasma/MEPHIT#19 |
| KAMEL | ci.yml | itpplasma/KAMEL#132 |
| NEO-RT | test.yml | merged on main |
| rabe | test.yml | itpplasma/rabe#83 |

Until a downstream's PR lands, the gate cannot dispatch that downstream's workflow and its gate step fails; for a blocking=yes row, that turns the libneo PR red.
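
One way to check the merge-order precondition before enabling the gate is a small pre-flight probe. This is a sketch under assumptions: `has_dispatch_inputs` is a hypothetical helper that is not part of this PR, it reads the workflow file through the GitHub contents API with the GitHub CLI, and it does a plain text match rather than parsing YAML.

```bash
#!/usr/bin/env bash
# Sketch only: report whether a downstream workflow on the default branch
# mentions both the workflow_dispatch trigger and the libneo_ref input.
# Exit status: 0 = both found, 1 = something missing, 2 = fetch failed.
has_dispatch_inputs() {
  local repo="$1" workflow="$2" yaml
  # The raw media type returns the file body instead of a JSON envelope.
  yaml="$(gh api -H "Accept: application/vnd.github.raw+json" \
    "repos/${repo}/contents/.github/workflows/${workflow}")" || return 2
  grep -q 'workflow_dispatch' <<<"$yaml" && grep -q 'libneo_ref' <<<"$yaml"
}
```

For example, `has_dispatch_inputs itpplasma/rabe test.yml` would be expected to succeed only after itpplasma/rabe#83 has landed on rabe's default branch.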

Closes #293

krystophny added a commit that referenced this pull request on May 29, 2026

The `build_test_golden_record_odeint` job on `main` was red. `fetch_golden_record.sh` read `src/odeint_allroutines.f90` from the latest tag, but odeint moved to `src/odeint/odeint_allroutines.f90` before v2026.04.13. The old top-level path is gone from the tag, so `git show "$LATEST_TAG:src/odeint_allroutines.f90"` produced an empty golden file and the regression comparison broke. The first `main.yml` run since the move surfaced it.

One-line fix: read the post-move path. v2026.04.13 contains `src/odeint/odeint_allroutines.f90`, and the module names still match the script's seds, so the golden-record comparison keeps doing its job.

Split out of #278 so it can merge first and unbreak `main` independently of the release-gate change.

## Verification

Failing before, on `main` (golden file empty, comparison fails):
```
$ git show v2026.04.13:src/odeint_allroutines.f90
fatal: path 'src/odeint_allroutines.f90' does not exist in 'v2026.04.13'
```

Passing after (post-move path exists in the tag):
```
$ git show v2026.04.13:src/odeint/odeint_allroutines.f90 | head -1
!> High-Performance ODE Integration Module
```

The identical commit on #278 turned `build_test_golden_record_odeint` green (full CI: build+test, Sphinx docs, dashboard, Pages deploy). CI on this branch is running.
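
The failure mode described above (a missing path silently yielding an empty golden file) can be guarded against by checking that the path exists in the tag before reading it. The following is a sketch, not the contents of `fetch_golden_record.sh`; `fetch_from_tag` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Sketch only: copy one file out of a tag, failing loudly when the path
# does not exist there instead of leaving an empty output file behind.
fetch_from_tag() {
  local tag="$1" path="$2" out="$3"
  # git cat-file -e exits non-zero when the object cannot be resolved.
  if ! git cat-file -e "${tag}:${path}" 2>/dev/null; then
    echo "error: '${path}' does not exist in '${tag}'" >&2
    return 1
  fi
  git show "${tag}:${path}" > "$out"
}
```

Called as `fetch_from_tag "$LATEST_TAG" src/odeint/odeint_allroutines.f90 golden.f90`, a later file move would stop the job at the fetch step rather than at the regression comparison.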
krystophny force-pushed the feat/release-downstream-gate branch from 4ca6a7c to 9f170c3 on May 29, 2026 05:43
krystophny force-pushed the feat/release-downstream-gate branch from 5e21c1c to b9dc9ca on June 8, 2026 18:22
krystophny changed the title from "Add downstream reverse-dependency gate for releases" to "Per-PR reverse-dependency gate against downstream codes" on Jun 8, 2026

On every libneo PR, dispatch each downstream's CI (ci/downstreams) to build and
fast-test against the PR's libneo commit, and gate on green. Keeps main working
with the downstreams and attributes breakage to the PR. Label the PR 'full-ci'
to also run each downstream's slow suite (golden/performance/PAR) before merge.
Needs RELEASE_BOT_TOKEN with actions:write on the downstreams.

A downstream workflow without a workflow_dispatch trigger returns
HTTP 422, which killed the entire gate mid-run. Guard the dispatch
call so the failure is reported as a named ::error:: and the loop
continues to the remaining repos. The gate still exits non-zero.

Part of #293
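
The guard described in this commit, combined with the report-only handling of blocking=no rows, can be sketched as a loop that records failures instead of aborting. Everything here is an assumption for illustration: the whitespace-separated column order (repo, workflow, blocking, full), the `gate` function name, and the offline stand-in for `run_wf` driven by a hypothetical `FAIL_REPOS` variable.

```bash
#!/usr/bin/env bash
# Stand-in for the real dispatch+watch helper so the sketch runs offline:
# it "fails" for any repo listed in the hypothetical FAIL_REPOS variable.
run_wf() {
  case " ${FAIL_REPOS:-} " in
    *" $1 "*) return 1 ;;
  esac
  return 0
}

# Walk every manifest row; never stop at the first failure.
gate() {
  local sha="$1" manifest="$2" failed=0
  local repo workflow blocking full
  while read -r repo workflow blocking full || [ -n "$repo" ]; do
    case "$repo" in ''|'#'*) continue ;; esac   # skip blanks and comments
    # </dev/null keeps the helper from consuming the manifest on stdin.
    if run_wf "$repo" "$workflow" "$sha" </dev/null; then
      echo "ok: ${repo} ${workflow}"
    elif [ "$blocking" = "no" ]; then
      echo "::warning::${repo} ${workflow} failed (report-only)"
    else
      echo "::error::${repo} ${workflow} failed to dispatch or finished red"
      failed=1
    fi
  done < "$manifest"
  return "$failed"   # non-zero if any blocking row failed
}
```

Because the failure is only recorded in `failed`, a dispatch error on one repository (such as the HTTP 422 above) does not prevent the remaining rows from running, and the function still returns non-zero at the end.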
…pt-out

NEO-2 split its CI: unit-tests.yml is the always-on fast tier, the
golden-record/performance/PAR suite stays on NEO-2's own PRs. Point the
gate at unit-tests.yml and add a fourth 'full' column to ci/downstreams
so full-ci only dispatches -f full=true to downstreams whose workflow
accepts it. NEO-2 is full=no, so its fast workflow is never sent an
input it does not declare.

krystophny (Member, Author) commented

Merge order: merge itpplasma/NEO-2#87 first, then this PR. The gate dispatches NEO-2's unit-tests.yml on its default branch, so that workflow must be on NEO-2 main first; otherwise the NEO-2 gate step fails to dispatch and goes red.

rabe's dispatchable workflow is test.yml, not ci.yml. blocking=no rows
are now dispatched and their status reported as a warning instead of
skipped, so a report-only downstream (NEO-RT) still runs against the
candidate libneo without failing the gate.

The full column was a boolean assuming one shape (re-dispatch the single
workflow with -f full=true). Generalize it: no = fast-only, yes = that
self-escalating shape, or a comma-list of extra workflows to dispatch on
full-ci. The gate factors dispatch+watch into run_wf and runs the list.

rabe keeps its slow suite in its own golden.yml rather than a full tier in
test.yml, so rabe is full=golden.yml. No downstream is forced into one CI
shape.

Header described the old boolean full model; update it to the per-downstream
convention and state the contract: add a downstream by appending one row to
ci/downstreams, no workflow edit.

krystophny (Member, Author) commented

Made the gate per-downstream so each code keeps its own CI convention; no downstream is forced into one shape.

What changed:

  • ci/downstreams full column is no longer a boolean. Values: no (fast-only), yes (re-dispatch the row's workflow with -f full=true, the self-escalating shape SIMPLE/MEPHIT/KAMEL use), or a comma-list of extra workflows to dispatch on full-ci.
  • downstream-gate.yml: factored dispatch+watch into run_wf and run the resolved set per row. Fast tier always dispatches the row's workflow with -f libneo_ref=<sha>; on full-ci it self-escalates (yes) or also dispatches the listed workflows.
  • rabe is full=golden.yml: it keeps its slow suite in its own golden.yml rather than a full tier in test.yml. golden.yml now accepts libneo_ref (itpplasma/rabe#83) so the gate builds it against the candidate. benchmark.yml is a rabe-commit perf trend, so it stays on rabe PRs, not the gate.

Modularity: ci/downstreams is the single source. Adding a downstream is one appended row; the workflow file needs no edit. The chosen workflow must accept a libneo_ref input.

Dispatch dry-run against the manifest (run_wf stubbed):

  • fast: each row -> its workflow.
  • full-ci: SIMPLE/MEPHIT/KAMEL -> their workflow + full=true; NEO-2/NEO-RT fast-only; rabe -> test.yml + golden.yml.
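
The resolution rule for the full column can be sketched as a pure function that prints the dispatches a row expands to. `plan_row` is a hypothetical name, and the sketch assumes that a full=yes row under full-ci is dispatched once with `-f full=true` added, rather than once fast and once full; the actual gate may differ.

```bash
#!/usr/bin/env bash
# Sketch only: print the dispatches one manifest row resolves to.
# Arguments: repo, fast workflow, value of the full column, whether the
# PR carries the full-ci label ("true"/"false"), and the libneo commit.
plan_row() {
  local repo="$1" workflow="$2" full="$3" full_ci="$4" sha="$5"
  local extra

  if [ "$full_ci" = "true" ] && [ "$full" = "yes" ]; then
    # Self-escalating shape: the same workflow also receives full=true.
    echo "${repo} ${workflow} -f libneo_ref=${sha} -f full=true"
    return 0
  fi

  # Fast tier: always dispatched.
  echo "${repo} ${workflow} -f libneo_ref=${sha}"

  if [ "$full_ci" = "true" ] && [ "$full" != "no" ]; then
    # Comma-list shape: each extra workflow is dispatched as well.
    local IFS=','
    for extra in $full; do
      echo "${repo} ${extra} -f libneo_ref=${sha}"
    done
  fi
}
```

With the rows described in this PR, `plan_row itpplasma/rabe test.yml golden.yml true <sha>` yields test.yml plus golden.yml, while `plan_row itpplasma/NEO-2 unit-tests.yml no true <sha>` stays fast-only.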

Linked issue (may be closed by merging this pull request): Fix gate job failure on PR #278 (downstream reverse-dependency gate)