Skip to content

Permit -1 sentinel for GHAzDO build definition id#3117

Merged
michaelcfanning merged 1 commit into
devfrom
users/mikefan/ghazdo-builddef-id-sentinel
Jun 30, 2026
Merged

Permit -1 sentinel for GHAzDO build definition id#3117
michaelcfanning merged 1 commit into
devfrom
users/mikefan/ghazdo-builddef-id-sentinel

Conversation

@michaelcfanning

Copy link
Copy Markdown
Member

Summary

ADO uses -1 as the buildDefinitionId for a pipeline run that is not associated with a saved build definition. Our contract previously rejected only 0, which was wrong in both directions:

  • it permitted arbitrary negatives (-2, -3, …) that ADO never emits, and
  • on the emit side it rejected the legitimate -1 sentinel.

That inconsistency is exactly the failure mode this change eliminates: the valid set is now {positive integers} ∪ {-1}, with 0 and any negative other than -1 rejected — enforced identically at all three points where the contract lives.

The contract, in three places

Enforcement point Before After
GHAzDO1019.ProvidePipelineProperties (validation rule) rejects == 0 only (permits -2) rejects unless > 0 or == -1
AdoPipelineContext (emit-run producer) rejects <= 0 (rejects -1) buildDefinitionId allows -1; buildId stays strictly positive
ai-sarif-log.schema.json buildDefinitionId pattern ^-?(?!0+$)[0-9]+$ (any non-zero) ^(-1|[1-9][0-9]*)$

The emit-side allowance is threaded via an allowNegativeOneSentinel flag passed only at the buildDefinitionId call site, so the shared positive-int helper continues to gate buildId strictly.

The schema pattern also drops the old tolerance for leading zeros ("007"); serialized .NET ints never carry them, so no real producer is affected.

Tests

Boundary cases (0, -1, -2) exercised explicitly at every enforcement point — rule (GHAzDOValidationRuleTests), schema (AILogSchemaDriftTests), and emit (AdoPipelineContextTests, incl. a buildId == -1 rejection guard). 118 targeted tests green at Release.

Bumps to v5.4.3.

Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com

@michaelcfanning michaelcfanning requested a review from cfaucon as a code owner June 30, 2026 19:54
ADO uses -1 as the buildDefinitionId for a pipeline run not associated
with a saved build definition. The contract previously rejected only 0,
which both wrongly permitted arbitrary negatives (-2, -3) and, on the
emit side, wrongly rejected the legitimate -1 sentinel.

Align the three enforcement points on a single contract -- valid iff the
id is a positive integer or exactly -1; 0 and any other negative are
rejected:

- GHAzDO1019.ProvidePipelineProperties validation rule.
- The emit-run AdoPipelineContext producer (buildId stays strictly
  positive; only buildDefinitionId gets the sentinel allowance).
- The ai-sarif-log.schema.json buildDefinitionId pattern.

Bump to v5.4.3.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@michaelcfanning michaelcfanning force-pushed the users/mikefan/ghazdo-builddef-id-sentinel branch from e552ac7 to 6e8fee3 Compare June 30, 2026 20:19
@michaelcfanning michaelcfanning merged commit 7420336 into dev Jun 30, 2026
6 checks passed
@michaelcfanning michaelcfanning deleted the users/mikefan/ghazdo-builddef-id-sentinel branch June 30, 2026 20:25
michaelcfanning added a commit that referenced this pull request Jul 1, 2026
…tional extension guid guard) (#3118)

* Register the publish-to-ghas skill with get-skill (#3101)

The publish-to-ghas skill ships skills/publish-to-ghas/SKILL.md in the
repository but get-skill could neither enumerate nor resolve it: the skill
was absent from both the embedded-resource set in the library csproj and the
SkillSourceDirectory catalog in GetSkillCommand. As a result `get-skill
--list` showed only emit-sarif, publish-to-ghazdo, and validate-sarif, and
`get-skill publish-to-ghas` failed with "is not an available skill" — leaving
agents to hand-roll the GHAS publish from `publish-to-ghas --help`.

Register the skill in both places:

- Add an <EmbeddedResource> entry so SKILL.md is embedded as
  Microsoft.CodeAnalysis.Sarif.Multitool.Skills.publish-to-ghas.SKILL.md.
- Add ["publish-to-ghas"] = "skills/publish-to-ghas" to SkillSourceDirectory
  so the skill is enumerated by --list and resolvable by name.

The existing catalog tests iterate SkillSourceDirectory, so they could not
catch a SKILL.md present on disk but missing from the catalog. Add a
disk-driven regression test asserting every skills/<name>/SKILL.md is
registered in the catalog.

Fixes #3099.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Exempt repo-less --no-repo runs from the version-control-provenance requirement (#3102)

emit-finalize --no-repo deliberately finalizes a scan that has no version
control: it omits versionControlProvenance (VCP) and stamps the run
properties.unpublishable = true. But --validate ran the AI profile, whose
AI1004.ProvideVersionControlProvenance rule (Error) faults a run with no VCP,
so `emit-finalize --no-repo --validate` always failed AI1004 — the validator
faulted the run for the provenance --no-repo asserts it has none of (#3100).

Fix coherently across both SDKs, with one shared source of truth:

C# rule (the .NET --validate path runs the rule engine, not the schema):
- AI1004.ProvideVersionControlProvenance.Analyze returns early when the run is
  stamped unpublishable (EmitFinalizeCommand.UnpublishablePropertyName). Every
  other AI-profile check still applies.

Shared schema (ai-sarif-log.schema.json is the source of truth copied into the
TS package at build, so this one edit fixes TS validation too):
- versionControlProvenance is removed from aiLogRun.required and gated behind
  if/then/else: a run whose properties.unpublishable === true is exempt;
  every other run still requires VCP. The VCP property schema (minItems: 1) is
  unchanged, so a present-but-empty VCP still fails.

TS producer parity (the TS multitool had no --no-repo, so it could not even
reach the scenario):
- cli.ts: add the --no-repo flag (HELP + wiring).
- emitFinalize.ts: add the noRepo option; stamp properties.unpublishable.
- rebase.ts: add rebaseRunRepoless, mirroring EmitFinalizeRebaseVisitor's
  repoless path — reject a run that declares VCP under --no-repo, leak-check
  rooted-local locations, and elide transient local file:// roots.

Tests:
- C#: AI1004 unit test (unpublishable run -> no AI1004); emit-finalize
  --no-repo --validate integration test (no AI1004 in the report); schema
  drift test accepting a VCP-less unpublishable run and still rejecting an
  unmarked one.
- TS: emit-finalize-no-repo.feature (stamp + elide, clean --validate, and the
  VCP-present contradiction); validator unit tests for the exemption.

Fixes #3100.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Converge emit-finalize --validate channeling with the emit batch verbs (#3104)

* Channel emit-finalize --validate failures to stderr with a structured report

`emit-finalize --validate` reported its verdict in a way that was unreliable to
debug from a CI pipeline, where only stderr is consistently captured: the .NET
path wrote its count summary to stdout (even on failure) and only the firing
rule ids to stderr, stashing per-error detail in a report file; the TS path
streamed full prose errors to stderr and produced no report file at all. The
two SDKs disagreed with each other, and neither matched the channel discipline
the emit-* verbs already follow.

Anchor the behavior on the emit-verb precedent (EmitBatchProcessor: the
operation's structured result rides on the produced artifact, stderr is reserved
for the can't-produce case, the exit code carries the verdict) and the compiler
convention (concise, actionable diagnostics to stderr; the full structured
record to a separate sink). Validation errors are first-class structured data
(SARIF), so the durable record is a report file and the human-actionable summary
goes to stderr.

.NET (EmitFinalizeCommand.RunValidatorAndReport):
- On non-conformance, write the count header plus one concise line per
  Error-level finding (`ruleId @ uri:line:col — message`, capped at 20, overflow
  collapsed to "...and N more") and the report pointer, all to stderr. stdout
  carries none of the failure detail.
- A conforming run keeps its one-line count summary on stdout and deletes the
  report; the full validate-report.sarif is retained only on failure.

TS (@microsoft/sarif-multitool-ts):
- validate.ts now carries structured `details` (instancePath, message, keyword)
  alongside the existing `errors` string[] (public API unchanged).
- New validateReport.ts mirrors the .NET channeling: capped per-error stderr
  summary, and a synthesized, valid `<output>.validate-report.sarif` durable
  sink so both SDKs leave the same artifact. cli.ts delegates to it.

Adds focused tests on both sides (stderr-not-stdout channeling, report-file
persistence, the cap), updates the TS README Validation section and the .NET
--validate help text, and records a BUG bullet in the v5.4.1 ReleaseHistory
section. No version bump: v5.4.1 has not yet shipped a package.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Converge emit-finalize --validate channeling with the emit batch verbs

emit-finalize --validate now emits a structured JSON receipt on stdout on
every run, pass or fail — { conforms, profile, errorCount, warningCount,
noteCount, reportPath, errors } carrying the full, uncapped error set. This
makes the verb the machine-readable twin of the emit batch verbs'
{ appended, rejected } receipt: the structured verdict lives on stdout (the
pipeable channel), the concise human summary stays on stderr capped at 20 for
CI logs, the complete findings persist to <output>.validate-report.sarif, and
the exit code carries the verdict.

.NET and @microsoft/sarif-multitool-ts channel identically. Each error entry is
the flat { ruleId, location, message } twin of the stderr detail line; full
physical-location detail stays in the report file.

Updates the emit-sarif skill, the TS README, and the EmitFinalizeOptions
HelpText to describe the three-channel contract, and revises the v5.4.1
ReleaseHistory bullet.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Ship CweCategories.json and a TS category API in @microsoft/sarif-multitool-ts (#3106)

MITRE CWE Categories (e.g. CWE-16 "Configuration") are organizational
groupings, never a valid result.ruleId mapping target. They do not appear
in CweTaxonomy.sarif, which carries only Weaknesses, so a Category cannot be
recognized from the taxonomy's cwe/abstraction field. The npm package shipped
neither the Category data nor an API to query it, so a downstream consumer
trying to flag a Category-mapped result by scanning getCweTaxonomy() silently
no-ops.

Ship the data and the API the .NET side already has:

- copy-assets.mjs bundles src/Sarif/Taxonomies/CweCategories.json into assets/.
- cweCategories.ts exports getCweCategories, isCweCategory, and
  getCweCategoryName, mirroring CweCategories.cs (CategoryNamesByCwe,
  IsCategory, TryGetCategoryName). Identifier parsing mirrors
  CweSecuritySeverity.TryGetCweNumber: CWE- prefix required (any case),
  sub-id stripped at the first slash, digits-only, leading zeros tolerated;
  a bare number, the NOVEL- form, and non-strings resolve to undefined.
- index.ts re-exports the three functions; package.json runs the new
  test/cweCategories.test.js in test:unit (10 positive/negative cases).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Document the CWE category API and tighten validate channel docs

Bring the skills and README into sync with the two changes shipping in
this release: the emit-finalize --validate channel convergence and the new
CWE category API in @microsoft/sarif-multitool-ts.

- validate-sarif SKILL.md: add the shipped AI1016.ProvideValidRuleId row;
  note that a TS-only consumer can close the Category-vs-Weakness gap with
  isCweCategory/getCweCategoryName rather than scanning the taxonomy, which
  carries Weaknesses only and cannot surface Categories via cwe/abstraction.
- emit-sarif SKILL.md: state the conformance path explicitly — the receipt
  carries reportPath null and the validate-report.sarif is deleted; the
  report file is kept only on failure.
- README.md: add a CWE helpers section documenting getCweCategories,
  isCweCategory, and getCweCategoryName; tighten the conforming-run wording
  to say the report is deleted, not merely not written.
- multitool-usage.md: annotate the emit-finalize --validate example with the
  stdout receipt / stderr summary / report-on-failure / non-zero-exit contract.

The npm skills/ copies are generated from skills/ at prebuild, so only the
canonical sources are committed here.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix AI/SARIF emit-validation gaps surfaced by an AI scanner run (#3114)

* Fix AI/SARIF emit-validation gaps surfaced by an AI scanner run

Three independent correctness fixes to AI/SARIF emit validation, plus the
v5.4.2 version bump and release notes.

SARIF2012.ProvideRuleProperties: a NOVEL- rule denotes an AI finding with no
catalog entry, so there is no help topic to point at. Demanding a helpUri there
pushes a producer to attest to a page that does not exist. The missing-helpUri
note is now skipped for NOVEL- ids; CWE-backed and conventional rules, which do
have stable documentation, continue to be noted.

emit-invocations time policy: the emit verb ingests an already-completed run, so
it never invents timing. It no longer stamps a receipt-time endTimeUtc default
(relabeling the emit machine's clock as the tool's clock is fabrication), and an
omitted endTimeUtc is left absent rather than filled or rejected. Any provided
startTimeUtc, endTimeUtc, or notification timeUtc that lies more than five
minutes past the emit clock is rejected as an impossible time for a completed
run. Notification timeUtc remains required (presence). SarifLogger's own
stamping is untouched (there the SDK genuinely is the running tool).

AI1003.ProvideRequiredRegionProperties: a region expressed purely by char-offset
or byte-offset is a complete SARIF s3.30 region. AI1003 no longer faults a region
that lacks startLine when it carries an offset- or byte-based form, matching the
fix already applied to SARIF2017.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Use full SARIF2017 moniker in ReleaseHistory to satisfy style gate

ReleaseHistoryStyleTests.ReleaseHistory_RuleIdsUseQuotedMonikers requires
every rule id in an entry to appear as a backtick-wrapped Id.PascalName
moniker. The AI1003 bullet cited a bare \SARIF2017\; expand it to
\SARIF2017.ProvideRequiredRegionProperties\.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Permit -1 sentinel for GHAzDO build definition id (#3117)

ADO uses -1 as the buildDefinitionId for a pipeline run not associated
with a saved build definition. The contract previously rejected only 0,
which both wrongly permitted arbitrary negatives (-2, -3) and, on the
emit side, wrongly rejected the legitimate -1 sentinel.

Align the three enforcement points on a single contract -- valid iff the
id is a positive integer or exactly -1; 0 and any other negative are
rejected:

- GHAzDO1019.ProvidePipelineProperties validation rule.
- The emit-run AdoPipelineContext producer (buildId stays strictly
  positive; only buildDefinitionId gets the sentinel allowance).
- The ai-sarif-log.schema.json buildDefinitionId pattern.

Bump to v5.4.3.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Guard optional toolComponent.guid in SarifLogger extension mapping (#3108)

* Release v5.4.1 (ship: emit-finalize --validate convergence + CWE category API) (#3107)

* Register the publish-to-ghas skill with get-skill (#3101)

The publish-to-ghas skill ships skills/publish-to-ghas/SKILL.md in the
repository but get-skill could neither enumerate nor resolve it: the skill
was absent from both the embedded-resource set in the library csproj and the
SkillSourceDirectory catalog in GetSkillCommand. As a result `get-skill
--list` showed only emit-sarif, publish-to-ghazdo, and validate-sarif, and
`get-skill publish-to-ghas` failed with "is not an available skill" — leaving
agents to hand-roll the GHAS publish from `publish-to-ghas --help`.

Register the skill in both places:

- Add an <EmbeddedResource> entry so SKILL.md is embedded as
  Microsoft.CodeAnalysis.Sarif.Multitool.Skills.publish-to-ghas.SKILL.md.
- Add ["publish-to-ghas"] = "skills/publish-to-ghas" to SkillSourceDirectory
  so the skill is enumerated by --list and resolvable by name.

The existing catalog tests iterate SkillSourceDirectory, so they could not
catch a SKILL.md present on disk but missing from the catalog. Add a
disk-driven regression test asserting every skills/<name>/SKILL.md is
registered in the catalog.

Fixes #3099.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Exempt repo-less --no-repo runs from the version-control-provenance requirement (#3102)

emit-finalize --no-repo deliberately finalizes a scan that has no version
control: it omits versionControlProvenance (VCP) and stamps the run
properties.unpublishable = true. But --validate ran the AI profile, whose
AI1004.ProvideVersionControlProvenance rule (Error) faults a run with no VCP,
so `emit-finalize --no-repo --validate` always failed AI1004 — the validator
faulted the run for the provenance --no-repo asserts it has none of (#3100).

Fix coherently across both SDKs, with one shared source of truth:

C# rule (the .NET --validate path runs the rule engine, not the schema):
- AI1004.ProvideVersionControlProvenance.Analyze returns early when the run is
  stamped unpublishable (EmitFinalizeCommand.UnpublishablePropertyName). Every
  other AI-profile check still applies.

Shared schema (ai-sarif-log.schema.json is the source of truth copied into the
TS package at build, so this one edit fixes TS validation too):
- versionControlProvenance is removed from aiLogRun.required and gated behind
  if/then/else: a run whose properties.unpublishable === true is exempt;
  every other run still requires VCP. The VCP property schema (minItems: 1) is
  unchanged, so a present-but-empty VCP still fails.

TS producer parity (the TS multitool had no --no-repo, so it could not even
reach the scenario):
- cli.ts: add the --no-repo flag (HELP + wiring).
- emitFinalize.ts: add the noRepo option; stamp properties.unpublishable.
- rebase.ts: add rebaseRunRepoless, mirroring EmitFinalizeRebaseVisitor's
  repoless path — reject a run that declares VCP under --no-repo, leak-check
  rooted-local locations, and elide transient local file:// roots.

Tests:
- C#: AI1004 unit test (unpublishable run -> no AI1004); emit-finalize
  --no-repo --validate integration test (no AI1004 in the report); schema
  drift test accepting a VCP-less unpublishable run and still rejecting an
  unmarked one.
- TS: emit-finalize-no-repo.feature (stamp + elide, clean --validate, and the
  VCP-present contradiction); validator unit tests for the exemption.

Fixes #3100.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Converge emit-finalize --validate channeling with the emit batch verbs (#3104)

* Channel emit-finalize --validate failures to stderr with a structured report

`emit-finalize --validate` reported its verdict in a way that was unreliable to
debug from a CI pipeline, where only stderr is consistently captured: the .NET
path wrote its count summary to stdout (even on failure) and only the firing
rule ids to stderr, stashing per-error detail in a report file; the TS path
streamed full prose errors to stderr and produced no report file at all. The
two SDKs disagreed with each other, and neither matched the channel discipline
the emit-* verbs already follow.

Anchor the behavior on the emit-verb precedent (EmitBatchProcessor: the
operation's structured result rides on the produced artifact, stderr is reserved
for the can't-produce case, the exit code carries the verdict) and the compiler
convention (concise, actionable diagnostics to stderr; the full structured
record to a separate sink). Validation errors are first-class structured data
(SARIF), so the durable record is a report file and the human-actionable summary
goes to stderr.

.NET (EmitFinalizeCommand.RunValidatorAndReport):
- On non-conformance, write the count header plus one concise line per
  Error-level finding (`ruleId @ uri:line:col — message`, capped at 20, overflow
  collapsed to "...and N more") and the report pointer, all to stderr. stdout
  carries none of the failure detail.
- A conforming run keeps its one-line count summary on stdout and deletes the
  report; the full validate-report.sarif is retained only on failure.

TS (@microsoft/sarif-multitool-ts):
- validate.ts now carries structured `details` (instancePath, message, keyword)
  alongside the existing `errors` string[] (public API unchanged).
- New validateReport.ts mirrors the .NET channeling: capped per-error stderr
  summary, and a synthesized, valid `<output>.validate-report.sarif` durable
  sink so both SDKs leave the same artifact. cli.ts delegates to it.

Adds focused tests on both sides (stderr-not-stdout channeling, report-file
persistence, the cap), updates the TS README Validation section and the .NET
--validate help text, and records a BUG bullet in the v5.4.1 ReleaseHistory
section. No version bump: v5.4.1 has not yet shipped a package.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Converge emit-finalize --validate channeling with the emit batch verbs

emit-finalize --validate now emits a structured JSON receipt on stdout on
every run, pass or fail — { conforms, profile, errorCount, warningCount,
noteCount, reportPath, errors } carrying the full, uncapped error set. This
makes the verb the machine-readable twin of the emit batch verbs'
{ appended, rejected } receipt: the structured verdict lives on stdout (the
pipeable channel), the concise human summary stays on stderr capped at 20 for
CI logs, the complete findings persist to <output>.validate-report.sarif, and
the exit code carries the verdict.

.NET and @microsoft/sarif-multitool-ts channel identically. Each error entry is
the flat { ruleId, location, message } twin of the stderr detail line; full
physical-location detail stays in the report file.

Updates the emit-sarif skill, the TS README, and the EmitFinalizeOptions
HelpText to describe the three-channel contract, and revises the v5.4.1
ReleaseHistory bullet.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Ship CweCategories.json and a TS category API in @microsoft/sarif-multitool-ts (#3106)

MITRE CWE Categories (e.g. CWE-16 "Configuration") are organizational
groupings, never a valid result.ruleId mapping target. They do not appear
in CweTaxonomy.sarif, which carries only Weaknesses, so a Category cannot be
recognized from the taxonomy's cwe/abstraction field. The npm package shipped
neither the Category data nor an API to query it, so a downstream consumer
trying to flag a Category-mapped result by scanning getCweTaxonomy() silently
no-ops.

Ship the data and the API the .NET side already has:

- copy-assets.mjs bundles src/Sarif/Taxonomies/CweCategories.json into assets/.
- cweCategories.ts exports getCweCategories, isCweCategory, and
  getCweCategoryName, mirroring CweCategories.cs (CategoryNamesByCwe,
  IsCategory, TryGetCategoryName). Identifier parsing mirrors
  CweSecuritySeverity.TryGetCweNumber: CWE- prefix required (any case),
  sub-id stripped at the first slash, digits-only, leading zeros tolerated;
  a bare number, the NOVEL- form, and non-strings resolve to undefined.
- index.ts re-exports the three functions; package.json runs the new
  test/cweCategories.test.js in test:unit (10 positive/negative cases).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Document the CWE category API and tighten validate channel docs

Bring the skills and README into sync with the two changes shipping in
this release: the emit-finalize --validate channel convergence and the new
CWE category API in @microsoft/sarif-multitool-ts.

- validate-sarif SKILL.md: add the shipped AI1016.ProvideValidRuleId row;
  note that a TS-only consumer can close the Category-vs-Weakness gap with
  isCweCategory/getCweCategoryName rather than scanning the taxonomy, which
  carries Weaknesses only and cannot surface Categories via cwe/abstraction.
- emit-sarif SKILL.md: state the conformance path explicitly — the receipt
  carries reportPath null and the validate-report.sarif is deleted; the
  report file is kept only on failure.
- README.md: add a CWE helpers section documenting getCweCategories,
  isCweCategory, and getCweCategoryName; tighten the conforming-run wording
  to say the report is deleted, not merely not written.
- multitool-usage.md: annotate the emit-finalize --validate example with the
  stdout receipt / stderr summary / report-on-failure / non-zero-exit contract.

The npm skills/ copies are generated from skills/ at prebuild, so only the
canonical sources are committed here.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Guard optional toolComponent.guid in SarifLogger extension mapping

When a Run supplied to SarifLogger declared tool.extensions, the
constructor unconditionally read extension.Guid.Value while building
ExtensionGuidToIndexMap. The guid property is optional on a
toolComponent (SARIF spec §3.19.6), so any extension that omitted it
caused the SarifLogger constructor to throw, crashing every consumer
that attached a guidless extension.

Guard the lookup so only extensions that declare a guid are added to
ExtensionGuidToIndexMap. RecordRules still runs for every extension, so
a guidless extension's rules remain recorded against its extension
index; the extension is simply not addressable by guid.

Add two regression tests that build a Run whose tool.extensions mixes a
guidless extension with a guid-bearing one: one asserts the constructor
no longer throws and the guidless extension's rules are still recorded,
the other asserts the guid-bearing extension is mapped to its index.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Added Aasim to codeowners file (#3116)

* Release v5.4.2 (AI/SARIF emit-validation fixes: NOVEL helpUri, emit-time honesty, region offset parity) (#3115)

* Register the publish-to-ghas skill with get-skill (#3101)

The publish-to-ghas skill ships skills/publish-to-ghas/SKILL.md in the
repository but get-skill could neither enumerate nor resolve it: the skill
was absent from both the embedded-resource set in the library csproj and the
SkillSourceDirectory catalog in GetSkillCommand. As a result `get-skill
--list` showed only emit-sarif, publish-to-ghazdo, and validate-sarif, and
`get-skill publish-to-ghas` failed with "is not an available skill" — leaving
agents to hand-roll the GHAS publish from `publish-to-ghas --help`.

Register the skill in both places:

- Add an <EmbeddedResource> entry so SKILL.md is embedded as
  Microsoft.CodeAnalysis.Sarif.Multitool.Skills.publish-to-ghas.SKILL.md.
- Add ["publish-to-ghas"] = "skills/publish-to-ghas" to SkillSourceDirectory
  so the skill is enumerated by --list and resolvable by name.

The existing catalog tests iterate SkillSourceDirectory, so they could not
catch a SKILL.md present on disk but missing from the catalog. Add a
disk-driven regression test asserting every skills/<name>/SKILL.md is
registered in the catalog.

Fixes #3099.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Exempt repo-less --no-repo runs from the version-control-provenance requirement (#3102)

emit-finalize --no-repo deliberately finalizes a scan that has no version
control: it omits versionControlProvenance (VCP) and stamps the run
properties.unpublishable = true. But --validate ran the AI profile, whose
AI1004.ProvideVersionControlProvenance rule (Error) faults a run with no VCP,
so `emit-finalize --no-repo --validate` always failed AI1004 — the validator
faulted the run for the provenance --no-repo asserts it has none of (#3100).

Fix coherently across both SDKs, with one shared source of truth:

C# rule (the .NET --validate path runs the rule engine, not the schema):
- AI1004.ProvideVersionControlProvenance.Analyze returns early when the run is
  stamped unpublishable (EmitFinalizeCommand.UnpublishablePropertyName). Every
  other AI-profile check still applies.

Shared schema (ai-sarif-log.schema.json is the source of truth copied into the
TS package at build, so this one edit fixes TS validation too):
- versionControlProvenance is removed from aiLogRun.required and gated behind
  if/then/else: a run whose properties.unpublishable === true is exempt;
  every other run still requires VCP. The VCP property schema (minItems: 1) is
  unchanged, so a present-but-empty VCP still fails.

TS producer parity (the TS multitool had no --no-repo, so it could not even
reach the scenario):
- cli.ts: add the --no-repo flag (HELP + wiring).
- emitFinalize.ts: add the noRepo option; stamp properties.unpublishable.
- rebase.ts: add rebaseRunRepoless, mirroring EmitFinalizeRebaseVisitor's
  repoless path — reject a run that declares VCP under --no-repo, leak-check
  rooted-local locations, and elide transient local file:// roots.

Tests:
- C#: AI1004 unit test (unpublishable run -> no AI1004); emit-finalize
  --no-repo --validate integration test (no AI1004 in the report); schema
  drift test accepting a VCP-less unpublishable run and still rejecting an
  unmarked one.
- TS: emit-finalize-no-repo.feature (stamp + elide, clean --validate, and the
  VCP-present contradiction); validator unit tests for the exemption.

Fixes #3100.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Converge emit-finalize --validate channeling with the emit batch verbs (#3104)

* Channel emit-finalize --validate failures to stderr with a structured report

`emit-finalize --validate` reported its verdict in a way that was unreliable to
debug from a CI pipeline, where only stderr is consistently captured: the .NET
path wrote its count summary to stdout (even on failure) and only the firing
rule ids to stderr, stashing per-error detail in a report file; the TS path
streamed full prose errors to stderr and produced no report file at all. The
two SDKs disagreed with each other, and neither matched the channel discipline
the emit-* verbs already follow.

Anchor the behavior on the emit-verb precedent (EmitBatchProcessor: the
operation's structured result rides on the produced artifact, stderr is reserved
for the can't-produce case, the exit code carries the verdict) and the compiler
convention (concise, actionable diagnostics to stderr; the full structured
record to a separate sink). Validation errors are first-class structured data
(SARIF), so the durable record is a report file and the human-actionable summary
goes to stderr.

.NET (EmitFinalizeCommand.RunValidatorAndReport):
- On non-conformance, write the count header plus one concise line per
  Error-level finding (`ruleId @ uri:line:col — message`, capped at 20, overflow
  collapsed to "...and N more") and the report pointer, all to stderr. stdout
  carries none of the failure detail.
- A conforming run keeps its one-line count summary on stdout and deletes the
  report; the full validate-report.sarif is retained only on failure.

TS (@microsoft/sarif-multitool-ts):
- validate.ts now carries structured `details` (instancePath, message, keyword)
  alongside the existing `errors` string[] (public API unchanged).
- New validateReport.ts mirrors the .NET channeling: capped per-error stderr
  summary, and a synthesized, valid `<output>.validate-report.sarif` durable
  sink so both SDKs leave the same artifact. cli.ts delegates to it.

Adds focused tests on both sides (stderr-not-stdout channeling, report-file
persistence, the cap), updates the TS README Validation section and the .NET
--validate help text, and records a BUG bullet in the v5.4.1 ReleaseHistory
section. No version bump: v5.4.1 has not yet shipped a package.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Converge emit-finalize --validate channeling with the emit batch verbs

emit-finalize --validate now emits a structured JSON receipt on stdout on
every run, pass or fail — { conforms, profile, errorCount, warningCount,
noteCount, reportPath, errors } carrying the full, uncapped error set. This
makes the verb the machine-readable twin of the emit batch verbs'
{ appended, rejected } receipt: the structured verdict lives on stdout (the
pipeable channel), the concise human summary stays on stderr capped at 20 for
CI logs, the complete findings persist to <output>.validate-report.sarif, and
the exit code carries the verdict.

.NET and @microsoft/sarif-multitool-ts channel identically. Each error entry is
the flat { ruleId, location, message } twin of the stderr detail line; full
physical-location detail stays in the report file.

Updates the emit-sarif skill, the TS README, and the EmitFinalizeOptions
HelpText to describe the three-channel contract, and revises the v5.4.1
ReleaseHistory bullet.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Ship CweCategories.json and a TS category API in @microsoft/sarif-multitool-ts (#3106)

MITRE CWE Categories (e.g. CWE-16 "Configuration") are organizational
groupings, never a valid result.ruleId mapping target. They do not appear
in CweTaxonomy.sarif, which carries only Weaknesses, so a Category cannot be
recognized from the taxonomy's cwe/abstraction field. The npm package shipped
neither the Category data nor an API to query it, so a downstream consumer
trying to flag a Category-mapped result by scanning getCweTaxonomy() silently
no-ops.

Ship the data and the API the .NET side already has:

- copy-assets.mjs bundles src/Sarif/Taxonomies/CweCategories.json into assets/.
- cweCategories.ts exports getCweCategories, isCweCategory, and
  getCweCategoryName, mirroring CweCategories.cs (CategoryNamesByCwe,
  IsCategory, TryGetCategoryName). Identifier parsing mirrors
  CweSecuritySeverity.TryGetCweNumber: CWE- prefix required (any case),
  sub-id stripped at the first slash, digits-only, leading zeros tolerated;
  a bare number, the NOVEL- form, and non-strings resolve to undefined.
- index.ts re-exports the three functions; package.json runs the new
  test/cweCategories.test.js in test:unit (10 positive/negative cases).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Document the CWE category API and tighten validate channel docs

Bring the skills and README into sync with the two changes shipping in
this release: the emit-finalize --validate channel convergence and the new
CWE category API in @microsoft/sarif-multitool-ts.

- validate-sarif SKILL.md: add the shipped AI1016.ProvideValidRuleId row;
  note that a TS-only consumer can close the Category-vs-Weakness gap with
  isCweCategory/getCweCategoryName rather than scanning the taxonomy, which
  carries Weaknesses only and cannot surface Categories via cwe/abstraction.
- emit-sarif SKILL.md: state the conformance path explicitly — the receipt
  carries reportPath null and the validate-report.sarif is deleted; the
  report file is kept only on failure.
- README.md: add a CWE helpers section documenting getCweCategories,
  isCweCategory, and getCweCategoryName; tighten the conforming-run wording
  to say the report is deleted, not merely not written.
- multitool-usage.md: annotate the emit-finalize --validate example with the
  stdout receipt / stderr summary / report-on-failure / non-zero-exit contract.

The npm skills/ copies are generated from skills/ at prebuild, so only the
canonical sources are committed here.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix AI/SARIF emit-validation gaps surfaced by an AI scanner run (#3114)

* Fix AI/SARIF emit-validation gaps surfaced by an AI scanner run

Three independent correctness fixes to AI/SARIF emit validation, plus the
v5.4.2 version bump and release notes.

SARIF2012.ProvideRuleProperties: a NOVEL- rule denotes an AI finding with no
catalog entry, so there is no help topic to point at. Demanding a helpUri there
pushes a producer to attest to a page that does not exist. The missing-helpUri
note is now skipped for NOVEL- ids; CWE-backed and conventional rules, which do
have stable documentation, continue to be noted.

emit-invocations time policy: the emit verb ingests an already-completed run, so
it never invents timing. It no longer stamps a receipt-time endTimeUtc default
(relabeling the emit machine's clock as the tool's clock is fabrication), and an
omitted endTimeUtc is left absent rather than filled or rejected. Any provided
startTimeUtc, endTimeUtc, or notification timeUtc that lies more than five
minutes past the emit clock is rejected as an impossible time for a completed
run. Notification timeUtc remains required (presence). SarifLogger's own
stamping is untouched (there the SDK genuinely is the running tool).

AI1003.ProvideRequiredRegionProperties: a region expressed purely by char-offset
or byte-offset is a complete SARIF s3.30 region. AI1003 no longer faults a region
that lacks startLine when it carries an offset- or byte-based form, matching the
fix already applied to SARIF2017.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Use full SARIF2017 moniker in ReleaseHistory to satisfy style gate

ReleaseHistoryStyleTests.ReleaseHistory_RuleIdsUseQuotedMonikers requires
every rule id in an entry to appear as a backtick-wrapped Id.PascalName
moniker. The AI1003 bullet cited a bare \SARIF2017\; expand it to
\SARIF2017.ProvideRequiredRegionProperties\.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Michael C. Fanning <mikefan@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Sukanth Gunda <sukanthgunda@microsoft.com>
Co-authored-by: Aasim Malladi <aasim.malladi@gmail.com>

* Bump Microsoft.NET.Test.Sdk from 17.12.0 to 18.7.0 (#3111)

---
updated-dependencies:
- dependency-name: Microsoft.NET.Test.Sdk
  dependency-version: 18.7.0
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Add ReleaseHistory bullet for SarifLogger optional extension guid guard

#3108 guarded SarifLogger against an extension toolComponent that omits the
optional guid, but shipped without a v5.4.3 release note. Add the bullet so
the release entry covers all consumer-facing changes in the package.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Sukanth Gunda <contact.sukanth@gmail.com>
Co-authored-by: Sukanth Gunda <sukanthgunda@microsoft.com>
Co-authored-by: Aasim Malladi <aasim.malladi@gmail.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant