skill-auditor: MatrixScan AR coverage audit + gap fill #65
Merged
Add the MatrixScan AR (BarcodeAr) edge-case taxonomy (25 curated features) and a generic `platform_aliases` mechanism so several skill dirs can fold into one logical platform with aggregated evals. MatrixScan AR splits iOS into three skills (matrixscan-ar-ios core + -highlight-ios + -annotation-ios) while every other platform uses one. The manifest now aliases the two iOS sub-skills onto matrixscan-ar-ios, so coverage_matrix audits a single accurate `ios` column instead of treating the sub-skills as pseudo-platforms. The mechanism is a no-op for every other product (empty alias map). Taxonomy encodes SDK :available: ground truth as exclusions: barcode-filter is buildable today only on iOS/Android (8.1); custom-highlight/-annotation exist only where a real path does (android/flutter/ios/rn). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add tagged evals (and reference coverage where missing) for the gaps the skill-auditor surfaced for this platform, grounded in the SDK :available: docs. New code snippets/fixtures were build-gate verified against the real resolved Scandit SDK where a cheap gate exists. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
What
Runs the `internal/skill-auditor` flow over the MatrixScan AR (BarcodeAr) skill family (taxonomy → coverage matrix → gap fill → build-gate → eval harness), closing every coverage gap across all 10 platforms.

Taxonomy + iOS modeling
- `taxonomies/matrixscan-ar.yaml`: 25 curated edge-case features, mined from public docs, the SDK `.rst` `:available:` directives (availability ground truth), the sample apps, and existing evals.
- MatrixScan AR splits iOS into three skills (`matrixscan-ar-ios` core + `-highlight-ios` + `-annotation-ios`) while every other platform uses one. Added a generic `platform_aliases` mechanism to `coverage_matrix.py` + `manifest.json` so the three fold into one logical `ios` column with aggregated evals; it is a no-op for every other product.

Coverage: 63 → 0 gaps (0 required)
Parallel per-platform fills added tagged, convention-compliant evals (and reference coverage where missing), grounded in the SDK docs.
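
The alias folding and exclusion handling behind these gap counts can be sketched roughly as follows. This is a minimal illustration, not the actual `coverage_matrix.py` code: the function names, the data shapes, the full sub-skill directory names, the `matrixscan-ar-web` skill, and the `basic-setup` feature are all assumptions made for the example.

```python
from collections import defaultdict


def fold_platforms(evals_by_skill, platform_of, aliases):
    """Aggregate covered feature tags per logical platform.

    evals_by_skill: skill dir -> set of feature tags its evals cover
    platform_of:    canonical skill dir -> platform column name
    aliases:        sub-skill dir -> canonical skill dir (empty map = no-op)
    """
    covered = defaultdict(set)
    for skill, tags in evals_by_skill.items():
        canonical = aliases.get(skill, skill)  # unaliased skills map to themselves
        covered[platform_of[canonical]] |= tags
    return dict(covered)


def find_gaps(features, covered, exclusions):
    """Return (platform, feature) pairs that are neither covered nor excluded."""
    gaps = []
    for platform in sorted(covered):
        for feature in features:
            if platform in exclusions.get(feature, set()):
                continue  # feature is not buildable on this platform
            if feature not in covered[platform]:
                gaps.append((platform, feature))
    return gaps


# Hypothetical inputs: directory names and the web skill are illustrative only.
aliases = {
    "matrixscan-ar-highlight-ios": "matrixscan-ar-ios",
    "matrixscan-ar-annotation-ios": "matrixscan-ar-ios",
}
platform_of = {"matrixscan-ar-ios": "ios", "matrixscan-ar-web": "web"}
evals_by_skill = {
    "matrixscan-ar-ios": {"basic-setup", "barcode-filter"},
    "matrixscan-ar-highlight-ios": {"custom-highlight"},
    "matrixscan-ar-annotation-ios": {"custom-annotation"},
    "matrixscan-ar-web": set(),
}
features = ["basic-setup", "barcode-filter", "custom-highlight", "custom-annotation"]
exclusions = {
    "barcode-filter": {"web"},
    "custom-highlight": {"web"},
    "custom-annotation": {"web"},
}

covered = fold_platforms(evals_by_skill, platform_of, aliases)
gaps = find_gaps(features, covered, exclusions)
```

With these inputs the three iOS directories collapse into a single `ios` entry that covers all four features, and the only remaining gap is `basic-setup` on `web`, because the other three features are excluded there.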
The build gate earned its keep
Anti-hallucination compile checks against the real resolved SDK caught issues string/semantic evals never would:
- `ScanditIconType.Info`/`Plus` in the RN reference → fixed to the real `ScanditIconBuilder` API.
- `.NET` namespace mismatches (`ScanditIconBuilder` → `Core.UI.Icon`, etc.).
- `BarcodeArFilter` is documented at 8.5/8.6 but not in any published JS/Flutter/.NET package, so it would have shipped non-compilable code.
- `ScanditIconType.Information`/`ShoppingCart` → real `InspectItem`/`ToPick` (GATE-PASS).

`barcode-filter` policy
Documented at 8.5 (Flutter/RN/Cordova/Capacitor) / 8.6 (.NET) but those packages aren't published yet. Per decision, excluded until published; kept only on iOS/Android (8.1, buildable today). Reference notes accurately say "documented at 8.5, don't generate yet."

Honest exclusions
`custom-highlight`/`custom-annotation` are scoped to where a real path exists (android/flutter/ios/rn via native protocol subclassing or the `BarcodeArCustom*` class). `.NET` has no `createView`/`update` hook, and web/JS-bridge frameworks have neither.

Eval harness: 97% (1497/1537)
Full run over all 12 skill dirs (Sonnet generator + Opus judge, deterministic-first):
A first run scored 92%; the gap was mostly measurement artifacts: migration old-API-absence was judged against the model's explanatory prose (which names the old API) instead of the code blocks, plus `"X conformance is present"` literal substring checks. Fixed the harness (code-block scoping for migration negatives, per the documented convention) and routed conformance checks to the semantic judge; the corrected run is 97%. The residual ~3% are genuine generator-variance misses (e.g. the model occasionally emits `.with(systemName:)` instead of `.withIcon(.checkmark)`, or `CGSize` instead of a `CGFloat` size) where the iOS reference doesn't pin the exact Swift API. These are flagged as a reference-hardening follow-up, not papered over.

🤖 Generated with Claude Code
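
The code-block scoping fix for migration negatives can be illustrated with a small sketch. It is an assumption about the approach, not the harness's real code: the helper names, the regex, and the `OldTracker`/`NewTracker` identifiers are hypothetical.

```python
import re

# Matches a fenced block: opening fence with optional language tag, body, closing fence.
FENCE = re.compile(r"```[^\n]*\n(.*?)```", re.DOTALL)


def code_blocks(response: str) -> str:
    """Concatenate the bodies of all fenced code blocks in a model response."""
    return "\n".join(FENCE.findall(response))


def old_api_absent(response: str, old_api: str) -> bool:
    """Migration negative: the old API must not appear in generated code.

    Prose is ignored on purpose, since a migration explanation is expected
    to name the API it replaces. A response without any code block passes
    trivially, so a real harness would likely pair this with a positive check.
    """
    return old_api not in code_blocks(response)


# Build the fence at runtime so this example can itself live inside a fenced block.
fence = "`" * 3
response = (
    "Replace OldTracker with NewTracker.\n"
    f"{fence}swift\n"
    "let tracker = NewTracker()\n"
    f"{fence}\n"
)

naive_pass = "OldTracker" not in response             # judged against prose + code
scoped_pass = old_api_absent(response, "OldTracker")  # judged against code only
```

Here the naive whole-response check fails because the explanation mentions `OldTracker`, while the scoped check passes because the generated code only uses `NewTracker`.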