Skip to content

fix(env): symmetric filtering + splice for select + merge (#122)#126

Merged
fentas merged 17 commits into
mainfrom
fix/env-sync-symmetric-merge
Apr 7, 2026
Merged

fix(env): symmetric filtering + splice for select + merge (#122)#126
fentas merged 17 commits into
mainfrom
fix/env-sync-symmetric-merge

Conversation

@fentas

@fentas fentas commented Apr 7, 2026

Copy link
Copy Markdown
Owner

Fixes #122.

The bug

Running `b env sync` with `strategy: merge` and a `select:` filter could silently wipe the consumer's out-of-scope content. The issue describes it in detail; in short:

  • Upstream content was filtered to just the selected keys
  • Base and local were passed to `git merge-file` whole
  • The line-diff treated "filtered upstream" vs "whole base/local" as a massive deletion of everything outside the selected scope
  • That "deletion" was applied to the consumer's file, wiping their `envs:`, `profiles:`, comments, and any consumer-added content alongside the synced section

The fix

Two parts, both in `pkg/env/`:

1. Symmetric filtering in `doMerge`

`doMerge` now accepts a `filterFn` closure and applies it to base and local as well as upstream. All three sides of the 3-way merge go through the same selector, so `git merge-file` only sees semantic changes inside the selected scope.

```go
func doMerge(
repoDir, previousCommit, sourcePath, destPath string,
upstream []byte,
filterFn func(content []byte) ([]byte, error),
) ([]byte, bool, error)
```

The filter closure is built once per file in `SyncEnv` and reused across upstream filtering and the merge, so there's no drift between what gets filtered where.

2. Splice the merged scope back into the full local file

Just filtering symmetrically isn't enough — the merge produces a filtered-scope result that would, if written whole to disk, still wipe out-of-scope content. The new `pkg/env/splice.go` handles this:

Fast path (merged is valid YAML):

  • Parse local as a `yaml.v3` Node tree
  • For each in-scope top-level key, replace the value Node with the merged version
  • Out-of-scope Nodes are untouched, so their comments stay attached
  • Re-emit the tree

Conflict path (merged contains `<<<<<<<` markers, unparseable):

  • Parse local structurally to get byte ranges of each top-level key via `yaml.v3` Line metadata
  • Splice the marked merged bytes into the in-scope range
  • Out-of-scope bytes are copied verbatim
  • User sees conflict markers only inside the scoped region; `envs:`/`profiles:`/etc. survive byte-for-byte

What this PR does NOT fix

Git `merge-file` is line-based. When local and upstream both add a new entry in the same mapping of the selected scope (`local added argsh`, `upstream added tilt`), the text diff treats them as competing inserts and produces a spurious conflict marker — even though the changes are semantically independent.

The data-loss part of #122 is fixed. The cosmetic spurious-conflict part needs a structural 3-way merge (operating on `map[string]any`, not text) and is tracked as a follow-up. This is documented in both `docs/env-sync.mdx` and the test comment on `TestSyncEnv_SelectMerge_PreservesConsumerEnvs`.

Tests

New test file `pkg/env/splice_test.go`:

  • `TestTopLevelKeysFromSelectors` — selector → top-level key extraction
  • `TestContainsConflictMarkers` — marker detection (including a partial-match negative case)
  • `TestSpliceYAMLStructural_ReplacesScopedKey` — fast path replaces scoped keys, preserves out-of-scope
  • `TestSpliceYAMLStructural_RemovesScopedKeyAbsentInMerge` — merge "key should not exist" propagates to local
  • `TestSpliceYAMLText_PreservesOutOfScopeOnConflict` — conflict path keeps `envs:` intact
  • `TestSpliceYAMLText_AppendsWhenLocalHasNoScopedKey` — edge case where local has nothing in scope
  • `TestSpliceSelectedScope_JSONPassthrough` — JSON not yet supported, confirmed pass-through
  • `TestSpliceSelectedScope_NoSelectors` — no-selectors splice is a no-op

New integration test file `pkg/env/sync_select_merge_test.go` (runs against a real local bare git repo):

Docs

`docs/env-sync.mdx` gets a new subsection "merge + select" under the `merge` strategy, explaining:

  • Symmetric filtering (all three sides pass through the same selector)
  • Splice behavior (in-scope replaced, out-of-scope preserved including comments)
  • Conflict-path preservation (out-of-scope bytes are preserved even when markers are written)
  • The known spurious-conflict limitation

Test plan

  • `go test -race ./...` green on Linux + macOS
  • Lint clean
  • Docs render
  • Integration: lok8s → kubehz-cluster sync chain no longer loses consumer `envs:`

🤖 Generated with Claude Code

Before this commit, running `b env sync` with `strategy: merge` and a
select filter would silently clobber consumer-owned content. The bug:

  - Upstream content was filtered (just the selected keys)
  - Base and local were passed to git merge-file WHOLE
  - The line-diff saw "filtered upstream" as a massive deletion of
    everything outside the selected scope (other top-level keys,
    comments, profiles, envs)
  - That "deletion" was applied to the consumer's file, wiping their
    out-of-scope content

The fix has two parts:

1. doMerge now accepts a filterFn closure and applies it to base and
   local as well as upstream. All three sides of the 3-way merge go
   through the same selector, so git merge-file only sees semantic
   changes inside the selected scope.

2. SyncEnv then splices the merged (filtered) result back into the
   consumer's full local file via pkg/env/splice.go:

     - Fast path: when merged is valid YAML, walk the consumer's
       yaml.v3 Node tree and replace scoped top-level keys in place.
       Out-of-scope nodes are untouched so their comments survive.
     - Conflict path: when merged contains git merge-file conflict
       markers (unparseable YAML), compute byte ranges of scoped keys
       in local via Node Line metadata and splice the marked merged
       region in. Out-of-scope bytes are preserved verbatim, so the
       data-loss bug is fixed even in the conflict case.

Tests added (pkg/env/splice_test.go + sync_select_merge_test.go):

  - Structural splice replaces scoped keys, preserves out-of-scope
  - Structural splice removes scoped keys absent from merge result
  - Text splice preserves out-of-scope content when merge produces
    conflict markers
  - Full SyncEnv end-to-end against a local bare git repo: consumer
    has envs + profiles + a consumer-added binary, upstream adds a
    new binary, all of the consumer's content survives
  - Clean upstream-only value change applies, consumer envs survive
  - No-previous-commit path falls back gracefully

Known limitation documented in docs/env-sync.mdx and the test comment:
git merge-file is line-based, so concurrent adjacent inserts inside
the scoped region (local adds X, upstream adds Y in the same mapping)
can still produce a spurious conflict marker. The data-loss bug that
#122 called out is fixed; the cosmetic spurious-conflict issue is
tracked as a follow-up (structural 3-way merge, not text-based).

Fixes #122 (data-loss portion).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes issue #122 where b env sync with strategy: merge plus a select: filter could wipe consumer-owned, out-of-scope content by making the merge inputs symmetric and introducing a “splice” step to reintegrate the merged scope back into the full local YAML file.

Changes:

  • Apply the same select filtering to base and local as well as upstream for 3-way merges.
  • Add YAML scope splicing logic (pkg/env/splice.go) and comprehensive unit tests.
  • Add integration tests reproducing #122 and document the new merge+select semantics.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
pkg/env/env.go Builds a reusable filter closure, applies symmetric filtering in doMerge, and splices merged scope back into the full local file.
pkg/env/splice.go Implements YAML structural/text splicing of selected top-level keys back into the local file.
pkg/env/splice_test.go Unit tests for selector→key extraction, conflict marker detection, and both splice paths.
pkg/env/sync_select_merge_test.go End-to-end integration tests covering the #122 reproduction and merge+select behavior.
docs/env-sync.mdx Documents merge + select behavior, splice semantics, and limitations.

Comment thread pkg/env/env.go
Comment thread pkg/env/splice.go Outdated
Comment thread pkg/env/splice.go Outdated
Comment thread pkg/env/splice.go Outdated
Comment thread pkg/env/splice.go Outdated
Comment thread docs/env-sync.mdx
Comment thread pkg/env/sync_select_merge_test.go Outdated
@fentas

fentas commented Apr 7, 2026

Copy link
Copy Markdown
Owner Author

Review by Claude (Opus 4.6, 1M context) — context: I filed #122 and have been tracking this fix from the consumer side (lok8s → kubehz sync chain).

Verdict: approve with two small notes

Solid PR. Fixes the actual data-loss bug in #122, and is honest about what it does and doesn't solve. The split between structural splice (fast, parses both sides) and text splice (conflict fallback, parses only local for byte ranges) is the right design — you can't always go structural (conflict markers break YAML parsing) and you can't always go text (loses out-of-scope comments through re-emission). Having both with structural as default and text as the conflict fallback is exactly right.

What I like

  1. Symmetric filtering via filterFn closure built once in SyncEnv and reused in doMerge. Eliminates the drift risk between "what gets filtered where" — that drift was the core bug.

  2. The conflict-path design choice is pragmatic. Inserting the entire merged blob at the first scoped key range and emitting nothing for subsequent scoped ranges is coarser than per-key splicing but provably safe: the merge output contains all scoped keys, so inserting it once covers everything. Simpler code, no edge cases around multi-region splices.

  3. Test discipline. TestSyncEnv_SelectMerge_PreservesConsumerEnvs explicitly does not assert zero conflicts. That avoids false confidence and matches the limitation honestly documented in the PR description, the docs section, and the test comment itself.

  4. JSON pass-through is honest. Doesn't pretend to handle JSON. Returns merged. Documents it. Tests the pass-through. JSON splice can come when there's a real use case.

  5. Removing scoped keys absent in merge propagates "the merge decided this should not exist" correctly — see TestSpliceYAMLStructural_RemovesScopedKeyAbsentInMerge. Subtle but right.

Two small notes (non-blocking)

1. Add a test for non-contiguous scoped keys in the structural splice.

Both TestSpliceYAMLStructural_* tests use a single scoped key. The integration tests also use a single key. The non-contiguous case (e.g. consumer file has binaries:, then envs:, then extras:, with select: [binaries, extras]) is plausible and the splice logic should handle it — but it's not exercised. Should be ~10 lines of test code.

2. The // See TODO below pointer dangles.

In spliceYAMLStructural:
```go
// Note: yaml.Marshal re-emits the whole document, so the output won't
// be byte-identical to the input even for unchanged keys — this is a
// limitation of yaml.v3. See TODO below.
```

There is no actual TODO below. Either turn it into a real TODO with an issue link (see follow-up suggestion #1 below) or remove the "see TODO below" pointer.

Optional follow-up issues (not blocking this PR)

  1. Format-preserving marshaling. yaml.Marshal reformats unchanged content (whitespace, quoting style, ordering of unchanged maps). Every successful merge sync produces a noisy git diff for users who care about formatting. Real fix: a custom marshaler that walks the Node tree and reuses original byte ranges for untouched nodes. Big lift; can wait. Worth filing as its own issue so the dangling TODO can point at it.

  2. Structural 3-way merge. Replaces `git merge-file` with a `map[string]any` walk. Eliminates the spurious-conflict limitation that the PR description acknowledges. Would also let select-by-filter (env-sync: replace custom select syntax with JMESPath (jmespath-community fork) #124, JMESPath) work cleanly with merges that remove keys when their filter conditions stop matching.

  3. `select + replace` documentation note. Explicitly state that `replace + select` is intentionally destructive (wipes out-of-scope content), in contrast to `merge + select` which now preserves it. Looking at `SyncEnv`, the `replace` branch doesn't go through `doMerge` and writes filtered upstream directly to disk — that's correct `replace` semantics, but the contrast with `merge + select` is worth documenting so users don't get bitten by picking the wrong strategy.

Subtle semantic question (not blocking)

The "remove scoped key absent in merge" branch in spliceYAMLStructural:

```go
// Scoped key removed by merge → drop from local too.
localRoot.Content = append(localRoot.Content[:i], localRoot.Content[i+2:]...)
i -= 2
```

is correct for "the 3-way merge decided this key should not exist." But it interacts oddly with future filter-based selectors (#124 / JMESPath). If someone selects `binaries[?contains(groups, 'core')]` and an upstream change removes the `core` group from a binary, the merge result will be missing that binary. This code would then delete the binary from the consumer's local file even though the consumer might have legitimately wanted it (they pinned to `core` originally; removing the group is an upstream taxonomy change, not a "delete this" signal).

Not a bug in this PR — the behavior is correct for the current selector model. Just something to keep in mind when filter-based selection lands. The right fix later is a way to distinguish "removed because filter no longer matches" from "removed because the key was deleted."

TL;DR

Ship it. Add the non-contiguous test, fix or remove the dangling TODO pointer, and (optionally) file the format-preserving marshaler as a follow-up so the cosmetic noise has somewhere to live.

Copilot review:
  1. (KILLER) The "no local changes — safe to replace" branch was
     writing the FILTERED upstream content directly to disk, wiping
     the consumer's out-of-scope content on every sync after the
     first one. Same bug in the StrategyReplace branch and the
     merge-failed fallback. Fix: introduce computeTargetBytes() that
     splices filtered upstream into the consumer's full local file,
     and use it in all three branches. Also skip the localHash ==
     upstreamHash fast-path entirely when filterFn is set (those
     hashes are computed against different views and never match).
     New regression test: TestSyncEnv_SelectMerge_NoLocalChanges_PreservesEnvs.

  2. spliceSelectedScope returned `merged` unchanged for .json,
     silently dropping out-of-scope JSON content. Fix: error out
     for scoped JSON merges until splice support exists. Test
     updated from JSONPassthrough to JSONErrors.

  3. Text-splice fallback inserted the entire merged blob at the
     first scoped key range and suppressed subsequent scoped ranges,
     reordering content when scoped keys were non-contiguous in the
     local file. Fix: scanTopLevelKeyRanges heuristic finds per-key
     byte ranges in `merged` (works even with conflict markers
     inside one key's value, since the markers don't start with an
     unindented identifier). Each scoped local range is now spliced
     with its corresponding merged sub-range.

  4. Dangling "See TODO below" pointer in spliceSelectedScope doc
     comment had no actual TODO. Replaced with a forward reference
     to the format-preserving emitter follow-up.

  5. lineStart() doc comment said "clamps to EOF" but the
     implementation returned offsets[len-1] (start of last line).
     Fix: lineStart now takes srcLen and clamps properly.

  6. Docs didn't mention the JSON merge+select limitation. Added
     to the "Known limitations" list under merge+select.

  7. Stale comment header on TestSyncEnv_SelectMerge_ValueChangeInsideScope
     said "UpstreamRename". Fixed.

Reviewer (downstream Claude agent) notes:
  - Add a non-contiguous scoped keys structural test → added
    TestSpliceYAMLStructural_NonContiguousScopedKeys, asserts
    binaries → envs → extras stays in source order after splice
  - Dangling TODO pointer → fixed (overlaps with copilot #4)

Optional follow-ups (not addressed in this PR, will file separately):
  - Format-preserving YAML emitter (cosmetic git diffs)
  - Structural 3-way merge (eliminates spurious conflicts)
  - JSON splice support
  - select+replace docs note about intentional destructiveness

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@fentas fentas requested a review from Copilot April 7, 2026 12:31
@fentas

fentas commented Apr 7, 2026

Copy link
Copy Markdown
Owner Author

Thanks for the thorough review. Addressed both notes in a300af2:

  • Non-contiguous scoped keys test: added TestSpliceYAMLStructural_NonContiguousScopedKeys. Asserts binaries → envs → extras source order is preserved when only binaries and extras are scoped.
  • Dangling TODO pointer: removed "See TODO below" and replaced with a forward reference to the format-preserving emitter follow-up.

The semantic question about filter-removed keys is a great forward-looking observation. Filed mentally for when #124's filter selectors meet the merge path — agree the right fix is distinguishing "filter no longer matches" from "key was deleted", but that needs the structural-merge follow-up first.

Optional follow-ups noted, will file:

  • Format-preserving YAML emitter (cosmetic git diffs)
  • Structural 3-way merge (eliminates spurious conflicts)
  • select + replace docs note about intentional in-scope destructiveness
  • JSON splice support

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.

Comment thread pkg/env/splice.go Outdated
Comment thread pkg/env/splice.go
Comment thread pkg/env/splice.go Outdated
Comment thread pkg/env/splice.go Outdated
Comment thread pkg/env/env.go Outdated
Comment thread docs/env-sync.mdx Outdated
  1. (#3045013070) Doc on spliceSelectedScope claimed "verbatim"
     preservation but the structural fast path round-trips through
     yaml.v3's emitter, which can normalize whitespace and quoting
     even for unchanged keys. Doc rewritten to be precise:
     out-of-scope keys preserved, comments+order best-effort, only
     the text fallback is byte-verbatim.

  2. (#3045013111) spliceYAMLStructural returned `merged` silently
     when localRoot wasn't a mapping (sequence/scalar root), which
     would overwrite the consumer's whole file with the filtered
     scope (data loss). Fix: error out so the dispatcher falls back
     to the text path. spliceYAMLText had the same bug — it also
     returned merged silently when local couldn't be parsed or
     wasn't a mapping. Both now return errors. New test
     TestSpliceYAMLStructural_NonMappingErrors pins it.

  3. (#3045013128) scanTopLevelKeyRanges dropped bytes before the
     first top-level key (header comments, conflict markers above
     the first key, blank lines) because the first key's range
     started AT the key, not at byte 0. Fix: extend the first key's
     range back to src[0]. New test
     TestScanTopLevelKeyRanges_PreservesPrefix.

  4. (#3045013153) Comment on scanTopLevelKeyRanges said the regex
     was `^[A-Za-z_]...` but tryParseTopLevelKey actually allows
     digits and `-` as the first character. Fix: comment now says
     `^[A-Za-z0-9_-]+:` to match the implementation.

  5. (#3045013167) computeTargetBytes shortcut for non-existent
     destination files bypassed spliceSelectedScope entirely,
     which meant JSON+select silently wrote the filtered scope on
     the first sync (only erroring on the SECOND sync once a real
     file was on disk). Fix: pass nil bytes through to splice on
     not-exist; YAML's empty-doc fast path handles it, JSON now
     errors consistently. New test
     TestSpliceSelectedScope_JSONErrorsForNewFile.

  6. (#3045013184) Docs about JSON merge+select limitation
     reflected the buggy behavior. Now updated to match the
     consistent first-sync error: "b will error out at sync time
     — including the very first sync to a non-existent
     destination, so you can't accidentally end up with a
     half-written file."

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Comment thread pkg/env/splice.go
Comment thread pkg/env/splice.go Outdated
  1. (#3045106825) spliceYAMLText didn't handle scoped key add/remove
     symmetric with the structural path. A scoped key present in
     `merged` but missing in `local` was silently dropped (lost),
     and a scoped key in `local` but missing in `merged` triggered
     the "insert whole merged blob" fallback which could duplicate
     content. Fix:

       - track which scoped keys were emitted (`consumed` set)
       - emit nothing for scoped local keys absent in merged
         (treat as deletion, matching the structural path)
       - after the loop, append any merged keys not in local
         (additions, in merged source order)

     New helper scanTopLevelKeyRangesOrdered returns the per-key
     map AND the source order so additions append in deterministic
     merge order. scanTopLevelKeyRanges is now a thin wrapper.

     New tests TestSpliceYAMLText_AddsScopedKeyMissingInLocal and
     TestSpliceYAMLText_RemovesScopedKeyAbsentInMerge pin both
     halves of the contract.

  2. (#3045106888) JSON splice error said "scoped merge is not
     supported" but spliceSelectedScope is also called from the
     non-merge replace path. Reworded to "scoped select/splicing
     is not supported" so the error makes sense regardless of
     strategy.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Comment thread pkg/env/splice.go
Comment thread pkg/env/splice.go Outdated
Comment thread pkg/env/sync_select_merge_test.go Outdated
…126 round 4)

  1. (#3045162314) spliceYAMLText doc comment said the fallback for
     "scoped key not found in merged" was to insert the entire merged
     blob into that range, but the round-3 fix changed the behavior
     to "drop the range" (treat as deletion). Comment updated to
     match the actual behavior, including the explicit note that
     scanner misses on exotic inputs (quoted keys, document
     directives) will look like deletions in the text-splice path.
     Also added a step 4 explaining the additions-at-end behavior.

  2. (#3045162344) filterSha test helper comment described what
     SyncEnv writes to the lock, but the post-fix lock contains the
     spliced target SHA, not the filtered upstream SHA. The helper
     intentionally computes the OLD lock format because it's used
     to simulate pre-fix state in upgrade/regression scenarios.
     Comment now spells that out so future maintainers don't
     "fix" it to the new format.

Note: copilot also flagged a "won't compile" issue about
spliceSelectedScope using `text :=` then `switch ext`. Verified
the code is `ext := ...; switch ext` and compiles correctly — that
was a hallucination, no fix needed. (Build is green on every
commit since the symbol issue would have failed CI.)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Comment thread pkg/env/splice.go
Comment thread pkg/env/splice.go Outdated
…ons comment (#126 round 5)

  1. (#3045217780) topLevelKeyRanges set the in-scope key's endByte
     to the start of the next key's line, so any blank lines or
     unindented `# comment` lines immediately preceding the next
     key ended up inside the previous key's range and got silently
     dropped during the text-splice path. That's surprising when
     the comment is logically attached to the (out-of-scope) next
     key.

     Fix: new helper trimTrailingBlankAndCommentLines walks
     backwards from the next key's line and excludes consecutive
     blank/unindented-comment lines from the previous key's range.
     The floor is the start of the current key so we never trim
     into the current key's own value.

     Also added isBlankOrUnindentedComment helper. Indented
     comments are explicitly NOT trimmed because indented content
     is part of the previous key's value.

     New regression test:
     TestSpliceYAMLText_PreservesCommentAttributedToNextKey

  2. (#3045217830) scanTopLevelKeyRanges limitations comment was
     stale: it claimed misclassification falls back to "drops
     merged into the first scoped range", but the round-3 fix
     changed that to "treats as deletion" semantics. Comment
     updated to describe the actual failure mode (scoped local
     keys the scanner misses get deleted; merged keys it misses
     don't get appended).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Comment thread pkg/env/env.go
…place (#126 round 6)

Copilot round 6 caught: spliceSelectedScope only operates at top-level
key granularity, but filterFn was built from m.Select verbatim. For a
nested selector like \`database.host\`, filterContent produces
\`database: {host: ...}\` (dropping siblings like \`database.port\`),
and the splice would replace the entire \`database\` top-level node
with that truncated mapping — losing out-of-scope nested content in
the consumer file.

Fix: classify selectors as top-level vs nested up front and route
based on the strategy:

  - Top-level selector + ANY strategy → splice (the #126 fix is correct)
  - Nested selector + StrategyMerge → ERROR. The merge result is
    unrecoverable without the splice, and silently dropping siblings
    is exactly the data-loss bug #122 was opened to fix. Force the
    user to either restructure to a top-level selector or switch to
    \`strategy: replace\`.
  - Nested selector + StrategyReplace (or implicit) → SKIP THE
    SPLICE and write filtered content directly. This is the
    pre-#126 legacy behavior of \`select\`: extract a subset,
    overwrite the destination. Out-of-scope content is intentionally
    dropped because the user explicitly opted into per-key extraction
    via the nested selector.

The classification happens in one place (\`allTopLevelSelectors\`)
and is consulted by both the merge gate and computeTargetBytes.

New test \`TestSyncEnv_NestedSelector_MergeErrors\` pins the merge
rejection case. Existing nested-dot-path tests in select_test.go
continue to pass because they don't go through SyncEnv.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@fentas fentas requested a review from Copilot April 7, 2026 13:27

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Comment thread pkg/env/splice.go Outdated
…126 round 9)

Copilot round 9 caught: the spliceSelectedScope doc said nested
selectors are "flattened to their top-level key" and "the merge will
have done the right thing inside". Both are now incorrect:

  - Nested selectors don't reach the splice path. SyncEnv classifies
    them upstream (allTopLevelSelectors flag) and either errors out
    for strategy: merge or skips the splice entirely for replace.
  - "The merge will have done the right thing inside" implies the
    splice can recover from a truncated nested view, but it can't —
    that's exactly the data-loss bug round 6 fixed by rejecting
    nested+merge.

Comment now spells out the precondition: splicing is only correct
when merged contains the COMPLETE value for each in-scope top-level
key. The classification responsibility lives in SyncEnv; the splice
just relies on it. The comment points readers at the upstream flag
so they can audit the contract end-to-end.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@fentas fentas requested a review from Copilot April 7, 2026 13:57
fentas added a commit that referenced this pull request Apr 7, 2026
…ts (#127 round 7)

  1. (#3045442977) isSimpleDotPath treated selectors starting with
     `@` as simple because `@` wasn't in the JMESPath grammar
     blocklist. But leading `@` is the JMESPath current-node
     operator (\`@\`, \`@.field\`, \`@[?...]\`), so those expressions
     would be misclassified as simple and routed to the legacy
     dot-path validator, returning empty/wrong output.

     Fix: explicit early-return on \`s[0] == '@'\` after the
     leading-dot trim. \`@\` mid-key is still allowed (e.g.
     \`a@b.com\` is a valid plain key) — only the leading
     position is reserved for the current-node operator.

     New test cases in TestIsSimpleDotPath: \`@\`, \`@.foo\`,
     \`.@.foo\` all classify as complex.

  2. (#3045443041, #3045443090) Code and test comments referenced
     internal review history ("per copilot review on PR #127
     round X"). Reviewer asked for timeless wording — those
     references age poorly and the technical reason is what
     future maintainers actually need.

     Sweep: removed "copilot review on PR #127" / "round X"
     references from pkg/env/select.go, pkg/env/select_jmespath.go,
     and pkg/env/select_jmespath_test.go (the three files
     copilot flagged). Each comment was rewritten to describe
     the technical rationale directly. Git history still has
     the full context for archaeologists.

     Other branches (#126, #128) still have similar references
     that will get the same treatment when those branches see
     similar reviewer feedback. Leaving them alone here so this
     PR stays scoped.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Comment thread pkg/env/splice.go Outdated
Comment thread pkg/env/splice.go Outdated
Two splice fast paths used to drop local bytes when the local YAML
file had no document content / no top-level keys to replace. For a
file containing only header comments, the comments would silently
disappear on first sync.

  1. (#3045531983) spliceYAMLStructural's empty-doc fast path
     returned `merged` whenever localDoc.Content was empty. Per
     copilot suggestion: when local has bytes but no document
     content (header-only file), preserve those bytes and append
     merged. New helper-style inline logic handles three cases:
     local empty, merged empty, and concat-with-newline.

  2. (#3045532083) spliceYAMLText's len(ranges)==0 branch had the
     same bug for the conflict path: an empty `{}` mapping or a
     header-only file would lose its bytes. Same fix.

New test TestSpliceYAMLStructural_HeaderCommentsPreservedOnEmptyDoc
pins the structural-path case (header-only local + binaries merge
keeps both the header comment and the new binaries section).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@fentas fentas requested a review from Copilot April 7, 2026 14:09

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Comment thread pkg/env/splice.go
fentas and others added 2 commits April 7, 2026 16:28
…und 10)

Same sweep that landed on #127 and #128: code and test comments
shouldn't reference internal review history (PR/round numbers)
because those references age poorly. The technical reason should
stand on its own; git history captures the review context.

Affected files (all on the #126 branch):
  - pkg/env/env.go
  - pkg/env/splice.go
  - pkg/env/splice_test.go
  - pkg/env/sync_select_merge_test.go

31 occurrences of "Per copilot review on PR #126 round X" rewritten
to brief technical rationale. No behavioral changes.

Note: an earlier regex sweep accidentally collapsed multi-space
indentation in embedded YAML test strings. Reverted via git checkout
and used a comment-line-aware Python sweeper plus manual fixups for
the remaining cases. Verified YAML test strings still have 2-space
indentation and the test suite passes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… round 11)

Copilot round 11 caught: spliceYAMLText fails opaquely when the local
YAML file has unresolved \`git merge-file\` conflict markers (e.g.
from a previously aborted sync). The user sees "yaml: line N: did not
find expected node content" without any hint that the real cause is
unresolved markers.

Fix: detect conflict markers in local before yelling about YAML parse
errors, and return a targeted error pointing the user at the actual
remediation: resolve the markers and re-run. The yaml parse error is
still wrapped for diagnostics.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Comment thread pkg/env/splice.go Outdated
Comment thread pkg/env/sync_select_merge_test.go Outdated
Comment thread pkg/env/sync_select_merge_test.go Outdated
…#126 round 12)

The python regex sweep in round 10 left three artifacts:

  1. (#3045755859) lineStart's doc comment had a fragmentary
     "The srcLen parameter [...] out-of-range lines, which is
     the start of the LAST line, not EOF — contradicting the
     doc comment." sentence with a missing subject. Replaced
     with copilot's suggested clean version that just describes
     the contract: "lineStart returns the byte offset at which
     the given 1-based line begins. ... If line is past the
     last recorded line, it clamps to srcLen (true EOF), not
     the start of the last line."

  2. (#3045755941) "view.." → "view." (double-period typo).

  3. (#3045755970) "test fora subtle regression" → "test for a
     subtle regression" (missing space between "for" and "a").

All three were collateral damage from the regex sweeps in round
10/11. Pure doc fixes, no behavioral changes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Comment thread pkg/env/env.go Outdated
Comment thread pkg/env/env.go Outdated
Comment thread pkg/env/splice.go Outdated
fentas and others added 3 commits April 7, 2026 17:19
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@fentas fentas merged commit 3617709 into main Apr 7, 2026
7 checks passed
@fentas fentas deleted the fix/env-sync-symmetric-merge branch April 7, 2026 15:20
fentas pushed a commit that referenced this pull request Apr 7, 2026
🤖 I have created a release *beep* *boop*
---


## [4.13.0](v4.12.0...v4.13.0)
(2026-04-07)


### Features

* **env:** hybrid JMESPath selector for select: filters
([#124](#124))
([#127](#127))
([aaf184e](aaf184e))
* **env:** safety tiers + plan output for env update
([#125](#125) phase 1)
([#128](#128))
([3e02aaf](3e02aaf))


### Bug Fixes

* **env:** symmetric filtering + splice for select + merge
([#122](#122))
([#126](#126))
([3617709](3617709))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).
fentas added a commit that referenced this pull request Apr 7, 2026
Follow-up from PR #126 review: every successful merge sync produced
a noisy git diff because spliceYAMLStructural round-tripped the
whole document through yaml.Marshal, which would normalize
whitespace, quoting, and field ordering even for keys the splice
didn't touch.

New byte-level fast path (spliceYAMLByteLevel):

  1. Parse local for yaml.Node Line/Column metadata.
  2. Compute byte ranges for every top-level key.
  3. Walk local byte-by-byte:
       - out-of-scope key range  → copy verbatim
       - in-scope key, in merged → emit yaml.Marshal of just that
                                    one key:value pair
       - in-scope key, not merged → drop (deletion)
  4. Append additions (merged keys not in local) at the end.

The output preserves whitespace, quoting, and comments BYTE-FOR-BYTE
on every byte that wasn't touched. Only the in-scope keys go through
the yaml.v3 emitter.

Dispatch order in spliceYAML is now:
  1. Byte-level splice (best fidelity)
  2. Structural splice (existing fallback for shapes byte-level
     can't handle — empty doc, non-mapping root, exotic Line metadata)
  3. Text splice (conflict path, unchanged)

New helper marshalSingleTopLevelKey serializes one key:value pair as
a top-level YAML mapping entry. Used for both substitutions and
appends in the byte-level path.

New regression test
TestSpliceYAMLByteLevel_PreservesOutOfScopeBytesVerbatim uses a
local file with non-default formatting (4-space indent,
double-quoted keys, inline comments) to prove that none of those
choices get normalized away when an unrelated top-level key is
spliced.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
fentas added a commit that referenced this pull request Apr 8, 2026
Follow-up from PR #126 review: every successful merge sync used to
produce a noisy git diff because \`spliceYAMLStructural\` round-tripped
the whole document through \`yaml.Marshal\`, which normalizes
whitespace, quoting, and field ordering even for keys the splice didn't
touch.

## What's new

\`spliceYAMLByteLevel\` is a new fast path that walks \`local\`
byte-by-byte and only re-emits the in-scope keys:

1. Parse local for \`yaml.v3\` Node Line/Column metadata.
2. Compute byte ranges for every top-level key.
3. Walk local:
   - **out-of-scope** range → copy verbatim
- **in-scope, in merged** → emit \`yaml.Marshal\` of just that one
key:value pair
   - **in-scope, not in merged** → drop (deletion)
4. Append additions (merged keys not in local) at the end.

The output is **byte-identical** for every unchanged byte. Only the
in-scope keys go through the yaml.v3 emitter.

## Dispatch order in \`spliceYAML\`

1. Byte-level splice (best fidelity, NEW)
2. Structural splice (fallback for shapes byte-level can't handle —
empty doc, non-mapping root, exotic Line metadata)
3. Text splice (conflict path, unchanged)

## Test

\`TestSpliceYAMLByteLevel_PreservesOutOfScopeBytesVerbatim\` uses a
local file with non-default formatting:

\`\`\`yaml
# Top of file
binaries:
    kubectl: {}              # 4-space indent
    kustomize: {}

envs:
    \"github.com/keep/me\":     # an inline comment
        files:
            \"a.yaml\":  \"docs/a.yaml\"   # double-quoted keys
        # trailing comment inside envs
\`\`\`

After splicing in a new \`tilt:\` binary, all of the formatting choices
above (4-space indent, quoted keys, inline comment, trailing comment,
header comment) survive byte-for-byte. Before this PR, every one of them
would be reformatted by yaml.v3.

This is the format-preserving emitter mentally tracked from the #126
review loop.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

env-sync: select + merge strategy is asymmetric and unsafe for consumer-owned files

2 participants