feat(env): format-preserving byte-level YAML splice fast path#133
Merged
Conversation
Follow-up from PR #126 review: every successful merge sync produced a noisy git diff because spliceYAMLStructural round-tripped the whole document through yaml.Marshal, which would normalize whitespace, quoting, and field ordering even for keys the splice didn't touch. New byte-level fast path (spliceYAMLByteLevel): 1. Parse local for yaml.Node Line/Column metadata. 2. Compute byte ranges for every top-level key. 3. Walk local byte-by-byte: - out-of-scope key range → copy verbatim - in-scope key, in merged → emit yaml.Marshal of just that one key:value pair - in-scope key, not merged → drop (deletion) 4. Append additions (merged keys not in local) at the end. The output preserves whitespace, quoting, and comments BYTE-FOR-BYTE on every byte that wasn't touched. Only the in-scope keys go through the yaml.v3 emitter. Dispatch order in spliceYAML is now: 1. Byte-level splice (best fidelity) 2. Structural splice (existing fallback for shapes byte-level can't handle — empty doc, non-mapping root, exotic Line metadata) 3. Text splice (conflict path, unchanged) New helper marshalSingleTopLevelKey serializes one key:value pair as a top-level YAML mapping entry. Used for both substitutions and appends in the byte-level path. New regression test TestSpliceYAMLByteLevel_PreservesOutOfScopeBytesVerbatim uses a local file with non-default formatting (4-space indent, double-quoted keys, inline comments) to prove that none of those choices get normalized away when an unrelated top-level key is spliced. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Contributor
There was a problem hiding this comment.
Pull request overview
Adds a new YAML splice fast path that preserves untouched bytes verbatim during select + merge env syncs, reducing noisy diffs caused by full-document yaml.v3 re-emission.
Changes:
- Introduces
spliceYAMLByteLeveland updatesspliceYAMLdispatch order to prefer byte-level splicing, then structural, then text/conflict splice. - Adds a unit test intended to validate that out-of-scope YAML formatting is preserved when splicing in-scope changes.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| pkg/env/splice.go | Adds byte-level YAML splice strategy and adjusts dispatch to prioritize format-preserving behavior. |
| pkg/env/splice_test.go | Adds a new test case for byte-level splice preservation behavior. |
- spliceYAML doc comment now accurately describes the byte-level path: out-of-scope bytes are byte-identical, but in-scope ranges are re-encoded as full key:value pairs and may have key quoting/style and key-line comments rewritten. - TestSpliceYAMLByteLevel_PreservesOutOfScopeBytesVerbatim now asserts exact byte equality on the envs: slice instead of substring matching, so any reformatter touching that region fails the test even when the surrounding lines still appear. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Validate top-level key byte ranges before slicing in spliceYAMLByteLevel. yaml.v3 Line/Column metadata can be unreliable for flow-style mappings or single-line documents, and an out-of-bounds slice would panic at runtime instead of letting spliceYAML fall back to the structural / text path. - Preserve CRLF line endings when the local file uses them. yaml.v3 always emits LF, so without this the spliced regions would produce a mixed-ending file (CRLF in preserved bytes, LF in spliced regions) — noisy in diffs and bad for Windows-aware tooling. New usesCRLF helper detects the local file's convention and the byte-level splice rewrites emitted blocks (and the appended-keys separator newline) to match. - Strategy doc comment now refers to marshalSingleTopLevelKey instead of yaml.Marshal, which is what the implementation actually uses. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… round 3) - Range validation now requires startByte < endByte (was allowing startByte == endByte, which let zero-length yaml.v3 metadata for flow-style mappings sneak through and emit garbage). - Hoist the usesCRLF call out of the per-key splice loop. Scanning the entire file inside the loop made the splice O(n*k); the line-ending convention can't change between keys, so detecting it once is enough. - New TestSpliceYAMLByteLevel_PreservesCRLFLineEndings test covers the CRLF normalization path end-to-end: a CRLF local file stays CRLF after the splice with no bare LFs anywhere in the output. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- spliceYAMLByteLevel doc now says explicitly that the in-scope key line is also re-emitted (not preserved from local), so key quoting/style and key-line comments may change. Out-of-scope bytes still keep their original formatting. - TestSpliceYAMLByteLevel_PreservesCRLFLineEndings now checks the envs:-section bytes.Index results before slicing. A regression that dropped envs: from either side previously would have panicked instead of producing a useful failure. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nd 5) The naive bytes.Contains(\\n) && !bytes.Contains(\\r\\n) check was always false on a CRLF file because CRLF itself contains \\n. The explicit per-byte loop below is the real check; remove the dead guard and document why. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…nd 6) The implementation uses yaml.NewEncoder().Encode() and the marshalSingleTopLevelKey helper, not yaml.Marshal directly. Update the splice strategy doc comments so the wording matches. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Contributor
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
pkg/env/splice.go:36
- The spliceSelectedScope doc comment still describes only two strategies (structural + text), but spliceYAML now has a byte-level fast path that is the default for successful YAML merges. Please update this top-level comment to reflect the three-path dispatch so callers understand when output is byte-preserving vs re-encoded.
// Two paths:
//
// - Structural splice (fast path): when `merged` parses as valid YAML,
// rewrite the local Node tree in place (replace scoped key values,
// append new scoped keys, remove vanished scoped keys). Out-of-scope
// comments and layout are preserved because their Nodes are untouched.
// Note: the yaml.v3 encoder re-emits the whole document, so the output won't
// be byte-identical to the input even for unchanged keys — this is a
// limitation of yaml.v3 that the structural splice cannot work around
// here. A format-preserving emitter is tracked as a separate
// follow-up.
…133 round 7) - New TestSpliceYAMLByteLevel_AppendsNewScopedKey covers the "additions" branch: local has only out-of-scope keys, merged introduces a new scoped top-level key, and the splice must append it at EOF with byte-identical preservation of the existing local content and a separating newline. - Update the stale comment in TestSpliceYAMLByteLevel_PreservesOutOfScopeBytesVerbatim that still mentioned yaml.Marshal — the implementation uses the yaml.v3 encoder. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…round 8) The doc still described only the structural fast path + text fallback and called the format-preserving emitter a "separate follow-up". This PR added the byte-level fast path; rewrite the top-level comment to describe the three-path dispatch in order of fidelity (byte-level → structural → text) and drop the stale follow-up wording. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previously usesCRLF returned true if the file contained ANY \r\n, which meant a mostly-LF file with a stray CRLF would have spliced regions rewritten to CRLF — increasing the mixing instead of fixing it. New rule: a file is CRLF only when at least one \r\n is present AND every \n is preceded by \r. Mixed-ending files stay LF so the splice produces output that matches yaml.v3's emitter and the dominant convention. New TestUsesCRLF covers all five edge cases. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
fentas
pushed a commit
that referenced
this pull request
Apr 8, 2026
🤖 I have created a release *beep* *boop* --- ## [4.14.0](v4.13.0...v4.14.0) (2026-04-08) ### Features * **cli:** b env resolve for merge conflicts ([#125](#125) phase 4) ([#138](#138)) ([b3d00c8](b3d00c8)) * **cli:** plan-json polish — destructive row in error + effectiveDryRun helper ([#132](#132)) ([4ad5dfa](4ad5dfa)) * **cli:** surface conflict markers in env status ([#139](#139)) ([1e54c73](1e54c73)) * **env:** format-preserving byte-level YAML splice fast path ([#133](#133)) ([94cd711](94cd711)) * **env:** JSON splice support for scoped selects ([#135](#135)) ([89ccf9d](89ccf9d)) * **env:** orphan-file delete plumbing ([#125](#125) phase 3) ([#137](#137)) ([5891781](5891781)) * **env:** per-key b.pin annotation ([#125](#125) phase 2) ([#136](#136)) ([9ed59db](9ed59db)) * **env:** structural 3-way merge for YAML/JSON ([#134](#134)) ([fd787b0](fd787b0)) * **env:** wrapKeyFor extracts leading identifier for filter expressions ([#131](#131)) ([f542f28](f542f28)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Follow-up from PR #126 review: every successful merge sync used to produce a noisy git diff because `spliceYAMLStructural` round-tripped the whole document through `yaml.Marshal`, which normalizes whitespace, quoting, and field ordering even for keys the splice didn't touch.
What's new
`spliceYAMLByteLevel` is a new fast path that walks `local` byte-by-byte and only re-emits the in-scope keys:
The output is byte-identical for every unchanged byte. Only the in-scope keys go through the yaml.v3 emitter.
Dispatch order in `spliceYAML`
Test
`TestSpliceYAMLByteLevel_PreservesOutOfScopeBytesVerbatim` uses a local file with non-default formatting:
```yaml
Top of file
binaries:
kubectl: {} # 4-space indent
kustomize: {}
envs:
"github.com/keep/me": # an inline comment
files:
"a.yaml": "docs/a.yaml" # double-quoted keys
# trailing comment inside envs
```
After splicing in a new `tilt:` binary, all of the formatting choices above (4-space indent, quoted keys, inline comment, trailing comment, header comment) survive byte-for-byte. Before this PR, every one of them would be reformatted by yaml.v3.
This is the format-preserving emitter mentally tracked from the #126 review loop.
🤖 Generated with Claude Code