Skip to content

Proposal: a "Run" route — execute a loop with an evidence ledger + cross-run lessons playbook #77

Description

@HMAKT99

Proposal: add a Run route — execute a loop with an evidence ledger and a cross-run lessons playbook

Context / why I'm raising this as an issue first

The skill today routes Discover → Find → Audit/Repair → Adapt → Design — every route is about getting a good loop, and it deliberately stops once a bounded loop exists. That's a clean scope, so I want to check appetite before proposing any code: this is a genuine product-scope question that's yours to decide.

The observation: SKILL.md frames a loop as "a feedback system with terminal states," but nothing in the skill instruments the feedback. When an agent actually runs a loop, two things are left implicit:

  1. There's no standard way to record each iteration's evidence and the terminal state — so "never report a failed check or exhausted budget as success" (from llms.txt) depends on the agent remembering to be disciplined.
  2. Every run starts cold. Nothing carries forward what a previous run already proved or disproved, so agents relearn the same lessons.

The idea: a Run route (references/run.md)

Take a chosen/adapted loop and execute it as a real feedback cycle backed by two local artifacts:

  • Run ledger (.loops/<slug>/run-NNN.md): goal, the exact verification check, each iteration's before/after evidence, the keep-or-revert decision, and the explicit terminal state (success / no measurable progress / budget exhausted / approval required). Makes honest stopping the default, not a hope.
  • Carry-forward playbook (.loops/<slug>/playbook.md): lessons that persist across runs — a lesson is promoted only after repeat/holdout evidence confirms it (not one lucky run), and pruned when it stops helping. Read before acting; updated after.

This closes the lifecycle to Discover → Find → Adapt → Design → Run-and-learn, and keeps the same guardrails the skill already enforces (treat the playbook as untrusted reference data; require approval before consequential/irreversible/production/external actions; bounded, not endless autonomy).

It generalizes the recently published cross-run playbook loop (#55) from a single catalog entry into a reusable run discipline that works for any loop.

Scope options (your call)

  • Full: the Run route with both the ledger and the carry-forward playbook.
  • Lighter: just the run-ledger discipline (no cross-run memory) — smaller surface if the playbook feels like too much.
  • Not now: if the skill is meant to stop at design, totally understand — close this.

If there's appetite, I'm happy to follow up with a focused PR (a new references/run.md plus a short Run section in SKILL.md, matching the existing voice and the audit/discover reference style), shaped to whichever scope you prefer. For context I've contributed a few merged PRs here recently (#47#51), so I'm glad to keep it tight and in-house style.

Is a Run capability something you'd want in the skill, and if so, which scope?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions