LeoLin990405 · May 28, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -1,5 +1,27 @@
 # 📜 Changelog
 
+## v5.2.0 (unreleased) — R2 Evaluation & Learning Quality 🏛️
+
+Multi-judge blind evaluation, skill deduplication, match replay, and a built-in prompt bank (iteration plan: [ITERATION_PLAN.md](./ITERATION_PLAN.md), Round 2).
+
+### Added
+- **`engine/v5/multi-judge.mjs`** — blind multi-judge aggregation for tournaments. Civ names are anonymized (Civ-A, Civ-B …) before each judge sees the transcript to prevent name-recognition bias. N independent providers run in sequence; their scores are parsed from Markdown tables and averaged. De-anonymization restores real regime ids in the final `Map<regime, scores>`. (`anonymizePrompt`, `parseScoreTable`, `aggregateJudgements`, `runMultiJudge`)
+- **`engine/v5/skill-quality.mjs`** — deterministic skill deduplication (no LLM calls). SHA-256 fingerprints catch exact duplicates; word-level Jaccard similarity (threshold 0.6, words >3 chars) catches near-duplicates. `analyzeSkillsDir` produces a full quality report: total, duplicate groups, unique topics, first/last sedimentation dates.
+- **`engine/v5/replay.mjs`** — re-run a past match with identical regime/backend/task. Reads `meta.json` from the original match, mints a new `matchId` with a `replay-` prefix, writes a `replayOf` lineage field, and spawns `run-v5.mjs` with injectable `_spawn`/`_runV5` for testability.
+- **`engine/prompts/governance-scenarios.json`** — 10 curated governance challenge scenarios across eight categories (military, political, economic, diplomacy, internal, innovation, crisis, social) for use with `--prompt-bank`.
+- **`civagent tournament --multi-judge [--judges N]`** — run N judges in blind mode; default N=2.
+- **`civagent tournament --prompt-bank [--seed N]`** — pick a random (or seeded) scenario from the built-in prompt bank.
+- **`civagent replay <matchId>`** — re-run any past match.
+- **`civagent skills <regime> --stats`** — show dedup quality analysis for a regime's skill library.
+- **Tests** — 32 → 55: `multi-judge.test.mjs` (11 cases), `skill-quality.test.mjs` (16 cases), `replay.test.mjs` (7 cases).
+
+### Changed
+- **`engine/v5/skill-sediment.mjs`** — near-duplicate gate added before writing a skill file; uses `findDuplicate` from `skill-quality.mjs`.
+- **`engine/v5/tournament.mjs`** — judge logic refactored into `buildJudgePrompt`/`judgeSingle`/`judgeMulti`; `runTournament` accepts `multiJudge` and `judgesN`; `pickScenario` added for prompt-bank.
+- **`package.json` `lint:syntax`** — includes `multi-judge.mjs`, `skill-quality.mjs`, `replay.mjs`.
+
+---
+
 ## v5.1.0 (unreleased) — R1 Engine Robustness 🔧
 
 Backend robustness pass (iteration plan: [ITERATION_PLAN.md](./ITERATION_PLAN.md), Round 1). Focus: correctness, concurrency safety, and removing the hard external dependency on Gemini.
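
The blind multi-judge flow described in the v5.2.0 entry (anonymize, parse each judge's Markdown table, average, de-anonymize) could look roughly like the sketch below. The function names follow the changelog, but their signatures, the `buildAliases` helper, and the two-column `| Civ | Score |` table layout are assumptions for illustration and may not match `engine/v5/multi-judge.mjs`.

```javascript
// Hypothetical sketch: signatures and table layout are assumptions, not the real module.

// Map each regime id to a neutral label (Civ-A, Civ-B, ...). Covers up to 26 regimes.
function buildAliases(regimes) {
  const toAlias = new Map();
  regimes.forEach((regime, i) => {
    toAlias.set(regime, `Civ-${String.fromCharCode(65 + i)}`);
  });
  return toAlias;
}

// Replace regime ids with their aliases. Longer ids are replaced first so that
// an id which is a prefix of another id cannot clobber it.
function anonymizePrompt(text, toAlias) {
  const pairs = [...toAlias].sort((a, b) => b[0].length - a[0].length);
  let out = text;
  for (const [regime, alias] of pairs) {
    out = out.split(regime).join(alias);
  }
  return out;
}

// Parse rows such as "| Civ-A | 8 |" into Map<alias, score>.
// Header and separator rows are skipped because they do not start with "Civ-".
function parseScoreTable(markdown) {
  const scores = new Map();
  for (const line of markdown.split('\n')) {
    const cells = line.split('|').map((c) => c.trim()).filter(Boolean);
    if (cells.length < 2) continue;
    const score = Number(cells[1]);
    if (cells[0].startsWith('Civ-') && Number.isFinite(score)) {
      scores.set(cells[0], score);
    }
  }
  return scores;
}

// Average each alias across judges and map the result back to real regime ids.
function aggregateJudgements(judgeTables, toAlias) {
  const result = new Map();
  for (const [regime, alias] of toAlias) {
    const votes = judgeTables
      .map((table) => table.get(alias))
      .filter((s) => s !== undefined);
    if (votes.length === 0) continue;
    result.set(regime, votes.reduce((a, b) => a + b, 0) / votes.length);
  }
  return result;
}

const aliases = buildAliases(['china/tang', 'global/athens']);
const blind = anonymizePrompt('china/tang opens granaries; global/athens votes.', aliases);
const judgeA = parseScoreTable('| Civ | Score |\n|---|---|\n| Civ-A | 8 |\n| Civ-B | 6 |');
const judgeB = parseScoreTable('| Civ | Score |\n|---|---|\n| Civ-A | 6 |\n| Civ-B | 7 |');
const finalScores = aggregateJudgements([judgeA, judgeB], aliases);
console.log(blind);
console.log(Object.fromEntries(finalScores));
```

Keeping the alias map outside the judge prompt is the point of the design: a judge only ever sees `Civ-A`/`Civ-B`, and the real ids are restored after all scores are in.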

diff --git a/ITERATION_PLAN.md b/ITERATION_PLAN.md
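@@ -109,9 +109,11 @@ CivAgent v5 is a multi-agent orchestration system running on the Claude Code runtime: 57
 - **Acceptance**: `civagent setup` no longer prompts for gemini and accurately reflects the available backends.
 
 ### Round 2 — Evaluation rigor + learning-loop quality (runs in parallel once frontend ② ships)
-- Multi-judge / blind evaluation: N non-gemini providers each score independently, then the scores are aggregated to remove single-judge bias (addresses the credibility problem of "judge = a single gemini" in V5-DESIGN).
-- Skill sedimentation quality metrics: deduplicate similar skills, detect patterns repeated across matches, and track a sedimentation yield metric.
-- Reproducible matches: a fixed prompt bank plus recorded provider/version/seed, with support for `civagent replay <matchId>`.
+- ✅ Multi-judge / blind evaluation: N non-gemini providers each score independently, then the scores are aggregated to remove single-judge bias (`multi-judge.mjs`, `--multi-judge` / `--judges N`).
+- ✅ Skill sedimentation quality metrics: deduplicate similar skills (SHA-256 exact + Jaccard approximate, threshold 0.6), detect patterns repeated across matches, `--stats` analysis report (`skill-quality.mjs`).
+- ✅ Reproducible matches: a fixed prompt bank of 10 scenarios (`governance-scenarios.json`, `--prompt-bank / --seed`), with support for `civagent replay <matchId>` (`replay.mjs`, `replayOf` lineage field).
+- ✅ Test coverage: multi-judge 11 cases, skill-quality 16 cases, replay 7 cases (34 new cases in total, ~55 cumulative).
+- **Frontend R2 (in progress)**: form ② regime visualization browser (PR pending, owned by antigravity).
 
 ### Round 3 — Console backend capabilities (frontend ③ ships)
 - Write API: start matches/tournaments, edit regimes online (with schema validation + historical-accuracy review in the loop), manage the skill library (enable/disable/delete).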
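
The near-duplicate half of the Round 2 dedup item (word-level Jaccard, threshold 0.6, words longer than 3 characters) can be sketched as follows. This is a minimal illustration, not the contents of `skill-quality.mjs`: the tokenizer, the `significantWords` helper, the `{ name, text }` skill shape, and the inclusive `>=` comparison are all assumptions, and the SHA-256 exact-match step is left out.

```javascript
// Hypothetical sketch of a Jaccard near-duplicate gate; details are assumptions.
const THRESHOLD = 0.6; // value taken from the plan item above

// Lowercase the text and keep the set of words longer than 3 characters.
function significantWords(text) {
  const words = text.toLowerCase().match(/[\p{L}\p{N}]+/gu) ?? [];
  return new Set(words.filter((w) => w.length > 3));
}

// Jaccard similarity = |A ∩ B| / |A ∪ B|. Two empty sets are treated as
// dissimilar (0) so that blank skills are never reported as duplicates.
function jaccard(a, b) {
  let shared = 0;
  for (const w of a) if (b.has(w)) shared += 1;
  const union = a.size + b.size - shared;
  return union === 0 ? 0 : shared / union;
}

// Return the first existing skill that is similar enough, or null.
function findDuplicate(candidateText, existingSkills, threshold = THRESHOLD) {
  const candidate = significantWords(candidateText);
  for (const skill of existingSkills) {
    if (jaccard(candidate, significantWords(skill.text)) >= threshold) return skill;
  }
  return null;
}

const library = [
  { name: 'treaty.md', text: 'Negotiate trade treaty with rival power' },
  { name: 'grain.md', text: 'Allocate grain reserves across famine provinces quickly' },
];
const incoming = 'Allocate grain reserves across famine provinces carefully';
const hit = findDuplicate(incoming, library);
console.log(hit ? `near-duplicate of ${hit.name}` : 'new skill');
```

Here the two grain texts share 6 of 8 distinct significant words, giving a similarity of 0.75, which clears the 0.6 threshold.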

diff --git a/bin/civagent b/bin/civagent
@@ -26,19 +26,27 @@ ${BOLD}Usage:${NC}
   civagent switch <regime>              Set active regime
   civagent run [prompt]                 Launch CC with active regime's agents (v4 mode)
   civagent run --v5 "task"              Launch v5 with isolated HOME + skill sedimentation
-  civagent skills <regime>              List learned skills for a regime
+  civagent skills <regime> [--stats]    List learned skills (--stats: dedup analysis)
   civagent match-log                    Recent match transcripts
+  civagent replay <matchId>             Re-run a past match with identical params
   civagent tournament --civs a,b,c,d "task"
                                         Parallel match across civilizations + judge ranking
+  civagent tournament --civs a,b,c,d --multi-judge [--judges N] "task"
+                                        Blind multi-judge evaluation (default N=2)
+  civagent tournament --civs a,b,c,d --prompt-bank [--seed N]
+                                        Use a random scenario from the built-in prompt bank
   civagent agents                       Show generated CC agents for active regime
   civagent modes                        List 6 orchestration modes
   civagent setup                        Check tool availability (CC, Codex, opencode, cn-cc)
 
 ${BOLD}Examples:${NC}
   civagent switch china/tang
   civagent run --v5 "重构这个模块的代码"
+  civagent replay 2024-01-15T10-30-00-000-ab12
+  civagent skills china/tang --stats
   civagent tournament --civs china/tang,china/qin,global/athens,global/roman-republic \\
                      "how do we handle a famine on the eastern frontier?"
+  civagent tournament --civs china/tang,china/qin --multi-judge --judges 3 --prompt-bank
 
 ${BOLD}Regimes:${NC}
   20 Chinese dynasties: xia, shang, zhou, qin, han, tang, song, ming, qing, ...
@@ -258,7 +266,15 @@ cmd_setup() {
 
 cmd_skills() {
   local regime="${1:-}"
-  [[ -n "$regime" ]] || { echo "Usage: civagent skills <region/regime-id>"; exit 1; }
+  local stats=false
+  shift || true
+  while [[ $# -gt 0 ]]; do
+    case "$1" in
+      --stats) stats=true; shift ;;
+      *) shift ;;
+    esac
+  done
+  [[ -n "$regime" ]] || { echo "Usage: civagent skills <region/regime-id> [--stats]"; exit 1; }
   local dir="$REGIMES_DIR/$regime/skills"
   if [[ ! -d "$dir" ]]; then
     echo "No learned skills yet for $regime."
@@ -270,6 +286,23 @@ cmd_skills() {
     echo -e "  ${GREEN}•${NC} $(basename "$f")"
     python3 -c "import re,sys;t=open(sys.argv[1]).read();m=re.search(r'description:\s*(.+)',t);print('    '+m.group(1)) if m else None" "$f" 2>/dev/null
   done
+  if $stats; then
+    echo ""
+    echo -e "${BOLD}Quality analysis${NC}"
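+    CIVAGENT_SKILLS_DIR="$dir" node --input-type=module <<EOF  # dir passed via env so quotes in the path cannot break the JS
+import { analyzeSkillsDir } from '$ENGINE_DIR/v5/skill-quality.mjs';
+const r = analyzeSkillsDir(process.env.CIVAGENT_SKILLS_DIR);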
+console.log('  Total skills     : ' + r.total);
+console.log('  Unique topics    : ' + r.uniqueTopics.length);
+if (r.stats.duplicateCount) console.log('  Near-duplicates  : ' + r.stats.duplicateCount);
+if (r.stats.firstSedimented) console.log('  First sedimented : ' + r.stats.firstSedimented);
+if (r.stats.lastSedimented)  console.log('  Last sedimented  : ' + r.stats.lastSedimented);
+if (r.duplicateGroups.length) {
+  console.log('  Duplicate groups :');
+  r.duplicateGroups.forEach(g => console.log('    ' + g.join(', ')));
+}
+EOF
+  fi
 }
 
 cmd_match_log() {
@@ -286,20 +319,51 @@ cmd_match_log() {
   done
 }
 
+cmd_replay() {
+  local match_id="${1:-}"
+  [[ -n "$match_id" ]] || { echo "Usage: civagent replay <matchId>"; exit 1; }
+  CIVAGENT_MATCH_ID="$match_id" exec node --input-type=module <<EOF  # id passed via env, not interpolated into JS
+import { replayMatch } from '$ENGINE_DIR/v5/replay.mjs';
+replayMatch(process.env.CIVAGENT_MATCH_ID).then(({ newMatchId, exitCode }) => {
+  console.error('[replay] done — new match: ' + newMatchId);
+  process.exit(exitCode ?? 0);
+}).catch(e => { console.error('[replay] error:', e.message); process.exit(1); });
+EOF
+}
+
 cmd_tournament() {
   local civs=""
+  local multi_judge=false
+  local judges_n=""
+  local prompt_bank=false
+  local seed=""
   local rest=()
   while [[ $# -gt 0 ]]; do
     case "$1" in
-      --civs) civs="$2"; shift 2 ;;
-      *) rest+=("$1"); shift ;;
+      --civs)         civs="$2"; shift 2 ;;
+      --multi-judge)  multi_judge=true; shift ;;
+      --judges)       judges_n="$2"; shift 2 ;;
+      --prompt-bank)  prompt_bank=true; shift ;;
+      --seed)         seed="$2"; shift 2 ;;
+      *)              rest+=("$1"); shift ;;
     esac
   done
-  if [[ -z "$civs" || ${#rest[@]} -eq 0 ]]; then
+  if [[ -z "$civs" ]]; then
+    echo "Usage: civagent tournament --civs a/x,b/y,c/z [--multi-judge] [--judges N] [--prompt-bank] [--seed N] \"task\""
+    exit 1
+  fi
+  if ! $prompt_bank && [[ ${#rest[@]} -eq 0 ]]; then
     echo "Usage: civagent tournament --civs a/x,b/y,c/z \"task prompt\""
+    echo "       civagent tournament --civs a/x,b/y,c/z --prompt-bank"
     exit 1
   fi
-  exec node "$ENGINE_DIR/v5/tournament.mjs" --civs "$civs" "${rest[@]}"
+  local node_args=("$ENGINE_DIR/v5/tournament.mjs" --civs "$civs")
+  $multi_judge  && node_args+=(--multi-judge)
+  [[ -n "$judges_n" ]] && node_args+=(--judges "$judges_n")
+  $prompt_bank  && node_args+=(--prompt-bank)
+  [[ -n "$seed" ]] && node_args+=(--seed "$seed")
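+  [[ ${#rest[@]} -gt 0 ]] && node_args+=("${rest[@]}")  # rest is empty with --prompt-bank; avoids unbound-array errors under set -u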
+  exec node "${node_args[@]}"
 }
 
 # ── main ─────────────────────────────────────────────────────────────────────
@@ -312,8 +376,9 @@ case "${1:-}" in
   agents)    cmd_agents ;;
   modes)     cmd_modes ;;
   setup)     cmd_setup ;;
-  skills)    cmd_skills "${2:-}" ;;
+  skills)    shift; cmd_skills "${1:-}" "${@:2}" ;;
   match-log) cmd_match_log ;;
+  replay)    cmd_replay "${2:-}" ;;
   tournament) shift; cmd_tournament "$@" ;;
   help|--help|-h|"") usage ;;
   *)         echo "Unknown command: $1"; usage; exit 1 ;;
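
For readers following `cmd_replay`, the lineage step that `replayMatch` is described as performing (mint a `replay-` prefixed id and record `replayOf`) might look like the sketch below. It is an illustration under assumptions: the `buildReplayMeta` helper, the `regime`/`backend`/`task` field names, and the timestamp-plus-suffix id format (inferred from the usage example `2024-01-15T10-30-00-000-ab12`) are hypothetical and may differ from `engine/v5/replay.mjs`.

```javascript
// Hypothetical sketch: helper name, meta fields, and id format are assumptions.

// Build the meta object for a replay from the original match's parsed meta.json.
// `now` and `suffix` are injectable so the result is deterministic in tests.
function buildReplayMeta(originalMeta, options = {}) {
  const now = options.now ?? new Date();
  const suffix =
    options.suffix ?? Math.random().toString(36).slice(2, 6).padEnd(4, '0');
  // 2024-02-01T08:00:00.000Z -> 2024-02-01T08-00-00-000
  const stamp = now.toISOString().replace(/[:.]/g, '-').replace(/Z$/, '');
  return {
    ...originalMeta, // same regime/backend/task as the original
    matchId: `replay-${stamp}-${suffix}`,
    replayOf: originalMeta.matchId, // lineage back to the source match
  };
}

const original = {
  matchId: '2024-01-15T10-30-00-000-ab12',
  regime: 'china/tang',
  backend: 'codex',
  task: 'how do we handle a famine on the eastern frontier?',
};
const replayed = buildReplayMeta(original, {
  now: new Date('2024-02-01T08:00:00.000Z'),
  suffix: 'cd34',
});
console.log(replayed.matchId, '<-', replayed.replayOf);
```

Injecting the clock and suffix mirrors the changelog's note about injectable `_spawn`/`_runV5`: the id logic can be checked without spawning a real match.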

diff --git a/engine/prompts/governance-scenarios.json b/engine/prompts/governance-scenarios.json
@@ -0,0 +1,52 @@
+[
+  {
+    "id": "frontier-defense-01",
+    "category": "military",
+    "prompt": "Establish secure and robust border defense policies for agricultural frontiers facing seasonal tribal raids."
+  },
+  {
+    "id": "succession-crisis-01",
+    "category": "political",
+    "prompt": "The head of state has died without a designated heir. Multiple factions claim legitimacy. Design a governance process to resolve the succession crisis without civil war."
+  },
+  {
+    "id": "economic-collapse-01",
+    "category": "economic",
+    "prompt": "A sudden 40% drop in tax revenue threatens the state treasury. Propose emergency fiscal measures that preserve core governance functions while minimising unrest."
+  },
+  {
+    "id": "external-threat-01",
+    "category": "diplomacy",
+    "prompt": "A neighbouring power demands territorial concessions under threat of war. Your military capacity is roughly equal. Design a multi-channel response strategy."
+  },
+  {
+    "id": "internal-revolt-01",
+    "category": "internal",
+    "prompt": "A major province is threatening secession after three consecutive years of drought and perceived neglect from the centre. Draft a reconciliation and relief plan."
+  },
+  {
+    "id": "technological-disruption-01",
+    "category": "innovation",
+    "prompt": "A new military technology is spreading rapidly among rivals. Design an adoption strategy and a counter-proliferation policy that fits your governance structure."
+  },
+  {
+    "id": "plague-response-01",
+    "category": "crisis",
+    "prompt": "An epidemic is spreading through major cities, killing 5–10% of the population per month. Coordinate an immediate public-health response within your governance structure."
+  },
+  {
+    "id": "trade-route-disruption-01",
+    "category": "economic",
+    "prompt": "The main trade route has been cut off by a rival power. Design a diversification strategy to maintain economic stability within 12 months."
+  },
+  {
+    "id": "famine-relief-01",
+    "category": "crisis",
+    "prompt": "Crop failure across three provinces threatens mass starvation. Allocate grain reserves, coordinate relief logistics, and prevent hoarding and speculation."
+  },
+  {
+    "id": "religious-schism-01",
+    "category": "social",
+    "prompt": "A doctrinal split has divided the state's official religion. Factions are mobilising followers against each other. Draft a policy to maintain civil order and state authority."
+  }
+]
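
A seeded pick from this bank, as exposed by `--prompt-bank [--seed N]`, could be implemented along the lines below. This is only a sketch: how the real `pickScenario` maps a seed to a scenario is not shown in this change, so the `mulberry32` generator and the function signature are assumptions chosen to make the selection reproducible.

```javascript
// Hypothetical sketch: the real pickScenario may derive its index differently.

// mulberry32: a small, well-known 32-bit PRNG returning floats in [0, 1).
function mulberry32(seed) {
  let a = seed >>> 0;
  return function next() {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// With a seed the same scenario is returned on every call; without one,
// Math.random is used and the choice varies between runs.
function pickScenario(scenarios, seed) {
  if (scenarios.length === 0) throw new Error('prompt bank is empty');
  const rand = seed === undefined ? Math.random : mulberry32(Number(seed));
  return scenarios[Math.floor(rand() * scenarios.length)];
}

// A trimmed copy of three entries from the bank above (prompts omitted).
const bank = [
  { id: 'frontier-defense-01', category: 'military' },
  { id: 'succession-crisis-01', category: 'political' },
  { id: 'famine-relief-01', category: 'crisis' },
];
const first = pickScenario(bank, 42);
const second = pickScenario(bank, 42);
console.log(first.id === second.id); // same seed, same scenario
```

Recording the seed alongside the match metadata is what makes a prompt-bank tournament repeatable later.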