
fix(ruvLLM): P4.1 wire trainer backprop endpoint gradients #416

Open
Stricttype wants to merge 5 commits into ruvnet:main from Stricttype:feat/ruvllm-p4-1-backprop

Conversation

@Stricttype

Summary

Follow-up to #414. Fixes the structural blocker on acceptance criterion #4 ("perplexity better than random-init baseline") by wiring actual gradient updates into Trainer::train_epoch.

Bug: train_epoch called compute_loss but never applied optimizer.step to any weight matrices — model parameters never changed regardless of how long training ran.

Approach: endpoint-only backprop (Option A)

Hand-rolled analytical gradients flow through cross-entropy → lm_head → final RMSNorm, with a coarse identity-bypass for embeddings. Body (attention QKV, FFN, per-layer norms) remains random-init, treated as a fixed feature extractor.
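Roughly, the endpoint path looks like the following single-position sketch written with plain `Vec<f32>` math. The function name and signature are illustrative, not the crate's API, and the identity-bypass embedding update is omitted:

```rust
/// Illustrative endpoint-only gradients for one token position.
/// `hidden` is the frozen body's output, `gain` the final RMSNorm gain,
/// `lm_head` the V x d output projection, `target` the gold next-token id.
fn endpoint_grads(
    hidden: &[f32],
    gain: &[f32],
    lm_head: &[Vec<f32>],
    target: usize,
) -> (Vec<Vec<f32>>, Vec<f32>) {
    let d = hidden.len();
    let eps = 1e-5_f32;

    // Forward through the endpoint: final RMSNorm, then logits = lm_head · normed.
    let rms = (hidden.iter().map(|x| x * x).sum::<f32>() / d as f32 + eps).sqrt();
    let normed: Vec<f32> = (0..d).map(|i| hidden[i] / rms * gain[i]).collect();
    let logits: Vec<f32> = lm_head
        .iter()
        .map(|row| row.iter().zip(&normed).map(|(w, x)| w * x).sum())
        .collect();

    // Softmax cross-entropy: dL/dlogits = softmax(logits) - one_hot(target).
    let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|l| (l - max).exp()).collect();
    let z: f32 = exps.iter().sum();
    let mut dlogits: Vec<f32> = exps.iter().map(|e| e / z).collect();
    dlogits[target] -= 1.0;

    // dL/d(lm_head)[v][i] = dlogits[v] * normed[i]; dL/d(normed) = lm_head^T · dlogits.
    let grad_lm_head: Vec<Vec<f32>> = dlogits
        .iter()
        .map(|&dl| normed.iter().map(|&n| dl * n).collect())
        .collect();
    let mut grad_normed = vec![0.0_f32; d];
    for (row, &dl) in lm_head.iter().zip(&dlogits) {
        for (gn, &w) in grad_normed.iter_mut().zip(row) {
            *gn += dl * w;
        }
    }

    // dL/d(gain_i) = grad_normed_i * hidden_i / rms; the body below the norm is frozen.
    let grad_gain: Vec<f32> = (0..d).map(|i| grad_normed[i] * hidden[i] / rms).collect();

    (grad_lm_head, grad_gain)
}
```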

~150 LOC, no new deps. Public API unchanged: Trainer::{new, train, train_epoch, into_model, metrics_history, save_checkpoint_periodic, model} and TrainableModel::{save/load_checkpoint, from/to_checkpoint, to_q4, forward, compute_loss} keep their signatures. forward_with_cache is purely additive.

Acceptance evidence

| Test | Result |
| --- | --- |
| cargo check (default) | pass |
| cargo check --features real-inference | pass |
| cargo test --lib --features real-inference | 105 / 105 |
| cargo test --test wiki_pipeline_test --features real-inference | 6 / 6 |
| Unit test_train_epoch_decreases_loss (≥10 steps, synthetic) | pass |
| Unit test_train_epoch_updates_lm_head (mutation sanity) | pass |
| Integration test_perplexity_5pct_floor_with_backprop | pass |

Perplexity (fixture, 2 epochs, lr=1e-2): 51.736 → 45.504 (ratio 0.880, 12% reduction — well past 5% floor).
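A quick sanity check on those numbers (standalone snippet, not project code):

```rust
fn main() {
    let (initial_ppl, final_ppl) = (51.736_f64, 45.504_f64);
    let ratio = final_ppl / initial_ppl;    // ≈ 0.880
    let reduction = (1.0 - ratio) * 100.0;  // ≈ 12%, well past the 5% floor
    assert!(ratio < 0.95, "ratio {ratio:.3}, reduction {reduction:.1}%");
}
```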

Caveats / future work

  • Embedding gradient uses an identity-bypass approximation (descent direction correct, biased magnitude) — adequate for small corpora. If scaling to full Wikipedia exposes the bias, follow-up could either backprop through the body or drop embedding updates.
  • Body weights (wq/wk/wv/wo/w1/w2/w3 + per-layer norms) intentionally remain at random-init. Endpoint-only by design.
  • output_norm gradient recovers hidden/rms from normed/g; a defensive guard handles the rare zero-gain case (a sketch follows this list).
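A minimal sketch of that recovery, assuming the cached LM-head input (`normed`), the gain vector, and the gradient at `normed` are available as slices (local names are hypothetical, not the actual implementation):

```rust
/// Illustrative only: given the cached LM-head input (`normed`) and the
/// RMSNorm gain, recover hidden/rms = normed/g and form the gain gradient.
fn output_norm_grad(normed: &[f32], gain: &[f32], grad_normed: &[f32]) -> Vec<f32> {
    normed
        .iter()
        .zip(gain)
        .zip(grad_normed)
        .map(|((&n, &g), &gn)| {
            // Defensive guard: if a gain element is (near-)zero, hidden/rms
            // cannot be recovered from normed/g, so skip that coordinate.
            if g.abs() > 1e-8 { gn * (n / g) } else { 0.0 }
        })
        .collect()
}
```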

Test plan

  • CI green default features
  • CI green --features real-inference
  • No regression on --features persistence (orthogonal patch)

Depends on / built on top of #414.

🤖 Generated with Claude Code

Crew Worker and others added 5 commits May 2, 2026 20:41
- examples/ruvLLM/Cargo.lock
- examples/ruvLLM/Cargo.toml
- examples/ruvLLM/docs/api-reference.md
- examples/ruvLLM/docs/code-standards.md
- examples/ruvLLM/docs/codebase-summary.md
- examples/ruvLLM/docs/configuration-guide.md
- examples/ruvLLM/docs/deployment-guide.md
- examples/ruvLLM/docs/handoffs/2026-05-02-1943-auto.md
- examples/ruvLLM/docs/project-overview-pdr.md
- examples/ruvLLM/docs/system-architecture.md
- examples/ruvLLM/docs/testing-guide.md
- examples/ruvLLM/learn/260502-1900-init-ruvllm/learn-results.tsv
- examples/ruvLLM/learn/260502-1900-init-ruvllm/summary.md

Co-Authored-By: Pi Coding Agent <pi@localhost>
- examples/ruvLLM/Cargo.lock
- examples/ruvLLM/config/example.toml
- examples/ruvLLM/config/pretrain.toml
- examples/ruvLLM/scripts/fetch-simple-wiki.sh
- examples/ruvLLM/src/bin/pretrain.rs
- examples/ruvLLM/src/bin/sidecar.rs
- examples/ruvLLM/src/config.rs
- examples/ruvLLM/src/lib.rs
- examples/ruvLLM/src/sona/mod.rs
- examples/ruvLLM/src/sona/persist.rs
- examples/ruvLLM/src/training.rs
- examples/ruvLLM/tests/persist_integration.rs

Co-Authored-By: Pi Coding Agent <pi@localhost>
- rename src/data/ → src/corpus/ (gitignore conflict: data/ pattern blocks Rust source)
- add corpus module: wiki corpus iter, tokenizer wrapper, tokenized dataset
- add tests/wiki_pipeline_test.rs (5/5 PASS)
- surgical fixes for pre-existing candle 0.8 API drift in src/inference_real.rs
- add From<candle_core::Error> shim in src/error.rs (unblocks --features real-inference)
- extend src/training.rs: DatasetSource trait, ModelCheckpoint serde, save_checkpoint, measure_baseline_perplexity

P4 status: DONE_WITH_CONCERNS — pre-existing issues surfaced:
- Trainer::train computes loss but does not call optimizer.step (no backprop) → perplexity-delta is structurally 0% until follow-up patch
- SmallTransformer lacks from_checkpoint() constructor → trained checkpoints are saved but to_q4_weights() re-randomizes; follow-up needed to load saved checkpoint into inference path
- TokenizerWrapper::from_pretrained stubbed (requires tokenizers/http feature, not currently enabled); an inline whitespace WordLevel fallback works for the offline pilot (sketched below)
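A minimal sketch of what such a whitespace word-level fallback can look like (illustrative only; not the crate's actual TokenizerWrapper):

```rust
use std::collections::HashMap;

/// Illustrative whitespace word-level vocab: built from the corpus, with
/// unseen words mapped to an <unk> id. Names here are hypothetical.
struct WhitespaceVocab {
    token_to_id: HashMap<String, u32>,
}

impl WhitespaceVocab {
    fn build(corpus: &str) -> Self {
        let mut token_to_id = HashMap::new();
        token_to_id.insert("<unk>".to_string(), 0);
        for word in corpus.split_whitespace() {
            let next = token_to_id.len() as u32;
            token_to_id.entry(word.to_string()).or_insert(next);
        }
        Self { token_to_id }
    }

    fn encode(&self, text: &str) -> Vec<u32> {
        text.split_whitespace()
            .map(|w| *self.token_to_id.get(w).unwrap_or(&0))
            .collect()
    }
}
```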

Smoke tests:
- cargo check: PASS
- cargo check --features persistence: PASS
- cargo check --features real-inference: PASS
- cargo check --features persistence,real-inference: PASS
- cargo test --features persistence --test persist_integration: 4/4 PASS
- cargo test --features real-inference --test wiki_pipeline_test: 5/5 PASS
Co-Authored-By: Pi Coding Agent <pi@localhost>
Trainer::train_epoch now computes analytical gradients for the output
endpoint (cross-entropy → lm_head → RMSNorm) and applies optimizer.step
to lm_head, output_norm, and embeddings. Transformer body remains a
fixed feature extractor (endpoint-only approximation, Option A).

- TrainableModel::forward_with_cache exposes the LM-head input (normed)
  required for analytical gradients without changing forward()'s signature.
- Per-batch gradient accumulation with averaging + global L2 clipping (sketched after this list).
- TrainingMetrics.grad_norm now populated from accumulated batch norms.
- Public Trainer API (new/train/train_epoch/into_model/metrics_history/
  save_checkpoint_periodic/model) unchanged.
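A rough sketch of the accumulate-average-clip step, assuming per-batch gradients have already been flattened into `Vec<f32>` buffers (helper name and layout are hypothetical, not the crate's code):

```rust
/// Hypothetical helper: average per-batch gradients and apply one global
/// L2 clip across all parameters. `batch_grads[b][p]` is parameter p's
/// flattened gradient from batch b.
fn average_and_clip(batch_grads: &[Vec<Vec<f32>>], max_norm: f32) -> Vec<Vec<f32>> {
    assert!(!batch_grads.is_empty(), "need at least one batch");
    // Accumulate, starting from the first batch's gradients.
    let mut acc = batch_grads[0].clone();
    for batch in &batch_grads[1..] {
        for (a, g) in acc.iter_mut().zip(batch) {
            for (ai, gi) in a.iter_mut().zip(g) {
                *ai += gi;
            }
        }
    }
    // Average over batches.
    let n = batch_grads.len() as f32;
    for ai in acc.iter_mut().flat_map(|p| p.iter_mut()) {
        *ai /= n;
    }
    // One global L2 norm over every accumulated gradient, then rescale if it
    // exceeds the cap, so relative directions between parameters are preserved.
    let norm = acc
        .iter()
        .flat_map(|p| p.iter())
        .map(|x| x * x)
        .sum::<f32>()
        .sqrt();
    if norm > max_norm {
        let scale = max_norm / norm;
        for ai in acc.iter_mut().flat_map(|p| p.iter_mut()) {
            *ai *= scale;
        }
    }
    acc
}
```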

Tests:
- src/training.rs::tests::test_train_epoch_decreases_loss — ≥10 steps
  on synthetic data, asserts final < initial loss.
- src/training.rs::tests::test_train_epoch_updates_lm_head — verifies
  optimizer actually mutates lm_head between epochs.
- tests/wiki_pipeline_test.rs::test_perplexity_5pct_floor_with_backprop —
  fixture corpus, ≥1 epoch, asserts final_ppl < 0.95 * initial_ppl (assertion shape sketched after this list).
  Observed: 51.74 → 45.50 (ratio 0.880, 12% improvement).
- Existing test_perplexity_better_than_random tightened from non-
  regression (≤2.0) to improvement (<0.95).
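Roughly the shape of that assertion (placeholder numbers and test name; not the actual test body):

```rust
#[test]
fn perplexity_floor_sketch() {
    // Placeholder values standing in for measured perplexities; the real test
    // trains on the fixture corpus for >= 1 epoch and measures both.
    let initial_ppl: f64 = 51.74;
    let final_ppl: f64 = 45.50;
    assert!(
        final_ppl < 0.95 * initial_ppl,
        "expected at least a 5% perplexity reduction, got ratio {:.3}",
        final_ppl / initial_ppl
    );
}
```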

cargo check (default + --features real-inference): pass
cargo test --lib --features real-inference: 105 passed
cargo test --test wiki_pipeline_test --features real-inference: 6 passed