Skip to content

fix(shared-infra): remove stale managed scripts the core no longer ships (#3076)#3098

Open
montfort wants to merge 5 commits into
github:mainfrom
montfort:fix/3076-stale-agent-context-cleanup
Open

fix(shared-infra): remove stale managed scripts the core no longer ships (#3076)#3098
montfort wants to merge 5 commits into
github:mainfrom
montfort:fix/3076-stale-agent-context-cleanup

Conversation

@montfort

Copy link
Copy Markdown

Description

Fixes #3076.

install_shared_infra never removed shared scripts that a prior (pre-refactor) install recorded but the current core no longer ships — e.g. the legacy scripts/<variant>/update-agent-context.sh, superseded by the bundled agent-context extension. On a legacy project the orphan lingers, and because a refreshed common.sh no longer exports HAS_GIT, running the old script crashes (HAS_GIT: unbound variable under set -u). As discussed in the issue, the right fix is stale-file cleanup on (re)install, not re-introducing HAS_GIT/branch handling into a deleted script.

This applies the same stale-removal that integration_upgrade already performs (_migrate_commands.py) to install_shared_infra: after the bundle is processed, any manifest-tracked script under .specify/scripts/<variant>/ that the current bundle no longer produces is removed. Safety:

  • Only managed copies are removed (on-disk hash matches the manifest, via the existing _is_managed). A user-customized file (hash diverges), a symlink, or a recovered entry is preserved.
  • Guarded by scripts_scanned — if the script source is missing/empty (or dest_variant is symlinked and skipped), the cleanup does not run, so it can never trigger mass deletion.
  • The safe-destination check is applied before unlink, so a symlinked ancestor can't let a delete escape the project root.

Also drops the stale update-agent-context.sh reference in CONTRIBUTING.md (it is no longer sourced by common.sh), and adds a small IntegrationManifest.remove() so the manifest stays consistent without mutating its internals from outside.

Scoped to the script variant the bug is about; templates are untouched. PowerShell legacy orphans are covered by the same code path (it iterates the active variant).

Testing

  • Ran existing tests with uv sync --extra test && .venv/bin/python -m pytest tests -q4176 passed, 3 skipped (ruff clean)
  • New unit tests (tests/integrations/test_cli.py): stale managed script removed + manifest stops tracking it; user-modified stale preserved (hash-safe)
  • Tested with a sample project (manual repro below)

Manual test results

Agent: Claude Code · OS/Shell: Linux / bash

Command tested Notes
specify init --here --force --integration claude --script sh (with an injected manifest-tracked legacy update-agent-context.sh) Prints ⚠ Removed 1 obsolete shared script(s): .specify/scripts/bash/update-agent-context.sh; orphan gone from disk and manifest; common.sh and the other shipped scripts preserved

AI Disclosure

  • I did use AI assistance.

I used Claude Code (Anthropic) to explore the codebase, implement the fix following the existing integration_upgrade stale-cleanup pattern, and write the tests. All changes were reviewed and validated locally by me before submission.


Note: I left the CHANGELOG.md entry out — entries carry the merged PR number and the <!-- insert new changelog below this comment --> marker suggests your release flow manages insertion. Happy to add one if you'd prefer it in the PR.

…ips (github#3076)

install_shared_infra never removed shared scripts a prior (pre-refactor) install recorded but the current core no longer ships — e.g. the legacy scripts/<variant>/update-agent-context.sh, superseded by the bundled agent-context extension. On a legacy project the orphan lingers and crashes when it sources a refreshed common.sh (HAS_GIT unbound under set -u).

Apply the stale-removal that integration_upgrade already performs to install_shared_infra: manifest-tracked scripts the current bundle no longer produces are removed, but only managed copies (hash matches the manifest); user-customized files, symlinks, and recovered entries are preserved. Guarded so a missing/empty source can't trigger mass deletion, and the safe-destination check prevents unlinking through a symlinked ancestor.

Add IntegrationManifest.remove(); drop the stale update-agent-context.sh reference in CONTRIBUTING.md.

AI assistance: implemented with Claude Code (Anthropic); reviewed and validated locally (ruff clean, full suite 4176 passed, manual CLI repro).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates Spec Kit’s shared infrastructure installer to proactively remove manifest-tracked legacy scripts that the core no longer ships (addressing #3076), preventing orphaned scripts from breaking after refactors (e.g., old update-agent-context.sh sourcing a newer common.sh).

Changes:

  • Track which shared script/template paths the current bundle produces and delete obsolete managed scripts from prior installs during install_shared_infra.
  • Add IntegrationManifest.remove() to keep manifests consistent when callers delete tracked files.
  • Add regression tests for stale-script deletion vs. preserving user-modified stale scripts; update CONTRIBUTING.md to remove the stale script reference.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
src/specify_cli/shared_infra.py Implements stale managed-script cleanup during shared infra install, with safety checks and manifest updates.
src/specify_cli/integrations/manifest.py Adds IntegrationManifest.remove() helper to drop tracked entries without reaching into internals.
tests/integrations/test_cli.py Adds unit tests covering stale managed-script removal and preservation when modified.
CONTRIBUTING.md Removes mention of legacy update-agent-context.sh as a transitive dependency of common.sh.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/specify_cli/shared_infra.py Outdated
Comment thread src/specify_cli/shared_infra.py Outdated
…phan manifest)

- Set scripts_scanned only after a real source file is seen, so an empty variant source can't trigger mass deletion of tracked scripts.
- Prune a stale manifest entry even when its file is already gone from disk, keeping the manifest consistent (previously left tracked forever).
- Add a test for each edge case.

Addresses the Copilot review comments on github#3098. AI assistance: Claude Code (Anthropic), reviewed/validated locally (ruff clean, full suite 4178 passed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@montfort

Copy link
Copy Markdown
Author

Thanks for the review — both points addressed in ffca99c:

  1. Empty/skipped source could mass-delete (shared_infra.py): scripts_scanned is now set only after a real source file is discovered in the loop, not on entering the variant dir. An empty (or symlink-skipped) source therefore leaves it False and stale-cleanup is skipped entirely. Added test_shared_infra_empty_script_source_keeps_tracked_scripts (monkeypatches the source to an empty bash/).

  2. Orphan manifest entry when the file is already gone: when a stale path no longer exists on disk, the manifest entry is now pruned anyway (nothing to unlink, but the manifest stays consistent) instead of being tracked forever. Added test_shared_infra_prunes_orphan_manifest_entry_when_file_absent.

Full suite green (4178 passed), ruff clean.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 4/4 changed files
  • Comments generated: 2

Comment thread src/specify_cli/shared_infra.py
Comment thread src/specify_cli/shared_infra.py Outdated
- Skip absolute / '..' manifest keys before any filesystem access in stale-cleanup, so a corrupted/hand-edited manifest can't make it touch paths outside the project root (mirrors IntegrationManifest.check_modified / uninstall).
- Clarify the scripts_scanned comment: the safety hinge is that flag, not seen_rels (which also holds template paths).
- Add a containment test: a traversal manifest key is skipped, its target untouched.

Addresses the second round of Copilot review on github#3098. AI assistance: Claude Code (Anthropic); validated locally (ruff clean, full suite 4179 passed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@montfort

Copy link
Copy Markdown
Author

Thanks — second round addressed in the latest push:

  1. Containment of unsafe manifest keys (shared_infra.py): stale-cleanup now skips any key that is absolute or contains a .. segment before any filesystem access (mirrors the guards in IntegrationManifest.check_modified / uninstall), so a corrupted/hand-edited manifest can't make it stat/unlink paths outside the project root. Added test_shared_infra_stale_cleanup_ignores_unsafe_manifest_keys (a .. key with a matching hash — would be unlinked without the guard — leaves its target untouched).

  2. Comment clarity: reworded the scripts_scanned note to say the safety hinge is that flag, not seen_rels (which also holds template paths).

Full suite green (4179 passed), ruff clean.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 4/4 changed files
  • Comments generated: 1

Comment on lines +235 to +245
def remove(self, rel_path: str | Path) -> bool:
"""Drop *rel_path* from the tracked file set and any recovered marker.

Operates purely on the manifest's recorded key; it does NOT touch the
file on disk. Returns ``True`` if an entry was present and removed.
Used to keep the manifest consistent after a caller deletes a stale
managed file that the current install no longer ships.
"""
normalized = Path(rel_path).as_posix()
self._recovered_files.discard(normalized)
return self._files.pop(normalized, None) is not None

@mnriem mnriem left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please address Copilot feedback

…s (review)

IntegrationManifest.remove() now applies the same lexical validation and normalization as record_existing() / is_recovered(): absolute paths and '..' segments are rejected (return False) instead of being used verbatim as a key. Keeps the manifest API consistent. Adds tests (valid drop + no-op, absolute rejected, traversal rejected).

Addresses the third round of Copilot review on github#3098. AI assistance: Claude Code (Anthropic); validated locally (ruff clean, full suite 4182 passed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@montfort

Copy link
Copy Markdown
Author

Thanks @mnriem — all of Copilot's feedback is now addressed across the three rounds:

  1. Empty/skipped script source could mass-delete tracked scripts → scripts_scanned is set only after a real source file is discovered.
  2. Orphan manifest entry when the file was already gone from disk → the entry is now pruned for consistency.
  3. Stale-cleanup touched raw manifest keys before validating containment → it now skips absolute / .. keys before any filesystem access (mirrors check_modified / uninstall).
  4. IntegrationManifest.remove() API inconsistency (latest) → it now rejects absolute / .. keys lexically, matching record_existing / is_recovered.

Each fix has a regression test. Full suite green (4182 passed), ruff clean. Ready for re-review whenever you have a moment — thanks for the careful look.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 5/5 changed files
  • Comments generated: 1

Comment on lines +552 to +558
# Guard corrupted/hand-edited manifest keys BEFORE any filesystem
# access: an absolute path or a ``..`` segment could point outside
# the project root. Mirrors the containment checks in
# IntegrationManifest.check_modified / uninstall.
rel_path = Path(rel)
if rel_path.is_absolute() or ".." in rel_path.parts:
continue

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — addressed in fecb5e3.

The guard now runs the canonical containment helper after the fast lexical reject: _validate_rel_path(rel_path, project_path) (the same one IntegrationManifest.is_recovered / remove use) resolves the join and confirms it stays within the project root. A Windows drive-relative key like C:tmp\file — not is_absolute(), but discarding the root when joined — now fails containment and the key is continue-skipped before any stat/unlink. And because the check is wrapped so a ValueError skips the key instead of propagating, a corrupted manifest no longer turns into an install-time hard failure.

Added a regression test (test_shared_infra_stale_cleanup_skips_escaping_key_without_failing) that forces _validate_rel_path to reject a managed key — portably simulating the drive-relative escape, since real drive-relative paths only escape on Windows — and asserts the install skips it without raising while still installing the real scripts. Full suite green (4183 passed, 3 skipped; ruff clean).

@mnriem

mnriem commented Jun 22, 2026

Copy link
Copy Markdown
Collaborator

Please address Copilot feedback

…ust lexically (review)

The stale-script cleanup guarded manifest keys with a lexical check only
(is_absolute() / ".." segments). On Windows a drive-relative key such as
"C:tmp\\file" is not is_absolute(), yet joining it onto the project path
discards the root — so cleanup could stat/unlink outside the project before
_ensure_safe_shared_destination raised, and a corrupted manifest key turned
into an install-time hard failure (ValueError) instead of being skipped.

Reuse the canonical containment helper (_validate_rel_path, the same one
IntegrationManifest.is_recovered / remove use): after the fast lexical reject,
resolve the join and confirm it stays within the project root; a key that
still escapes is skipped, never unlinked, never fatal.

Adds a regression test that forces _validate_rel_path to reject a managed key
(portably simulating the Windows drive-relative escape) and asserts the
install skips it without failing and still installs the real scripts.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: update-agent-context.sh crashes ("HAS_GIT: unbound variable") when SPECIFY_FEATURE is unset, despite feature.json resolving the feature

3 participants