<<<<<<< HEAD
=======
A trustworthy ML system that combines machine learning with Lean 4 formal verification to ensure every prediction is safe, valid, and mathematically guaranteed.
- Project Overview
- Problem Statement
- Solution Architecture
- Formal Specification
- Lean 4 Integration & Correctness Guarantees
- ML Pipeline
- Constraint Validation & Rule Engine
- Real-World Relevance
- System Flow
- Tech Stack
- Getting Started
- Example Output
- Creativity: Applying Lean in Autonomy
- Future Work
- Team
Invariant AI is a constraint-aware risk prediction system that solves a critical gap in modern AI: ML models can produce statistically plausible but logically invalid outputs.
This system wraps a standard ML risk scorer with:
- Domain constraint validation โ rule-based checks that catch invalid predictions before they propagate
- Lean 4 formal verification โ mathematical proof that every returned prediction satisfies defined invariants (e.g.,
0 โค risk โค 1) - Automatic correction โ invalid predictions are corrected and re-verified, not silently passed through
The result is a pipeline where every output comes with a formal proof of correctness โ transforming a black-box model into a trustworthy, auditable AI system.
Standard ML pipelines have no mechanism to guarantee that outputs satisfy domain constraints:
- A risk model might return
risk = 1.4โ mathematically impossible in a [0, 1] bounded score - A model might assign high risk to inputs with logically impossible combinations of flags
- There is no way to prove a prediction is valid โ only to test it on samples
In high-stakes domains (healthcare, finance, autonomous systems), this is not acceptable.
Invariant AI solves this by making correctness a first-class output โ not an afterthought.
User Input (age, risk_flags, ...)
โ
โผ
โโโโโโโโโโโโโโโโโโโโโโโโ
โ ML Risk Scorer โ โ Trained model generates raw risk score
โโโโโโโโโโโโฌโโโโโโโโโโโโ
โ
โผ
โโโโโโโโโโโโโโโโโโโโโโโโ
โ Constraint Engine โ โ Rule-based validation & auto-correction
โ (Domain Rules) โ enforces: 0 โค risk โค 1, flag consistency, etc.
โโโโโโโโโโโโฌโโโโโโโโโโโโ
โ
โผ
โโโโโโโโโโโโโโโโโโโโโโโโ
โ Lean 4 Verifier โ โ Formal proof that prediction satisfies invariants
โ (Formal Proofs) โ returns: verified_score + proof object
โโโโโโโโโโโโฌโโโโโโโโโโโโ
โ
โผ
โ
Verified Prediction + Formal Proof Certificate
Lean 4 is a dependently-typed theorem prover and programming language. Unlike runtime assertions, Lean proofs are:
- Static โ verified at compile time, not at runtime
- Compositional โ complex invariants built from simpler proven lemmas
- Machine-checkable โ the Lean kernel independently checks every proof
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ User Input โ
โ (age, risk flag, etc.) โ
โโโโโโโโโโโโโโฌโโโโโโโโโโโโโโโ
โ
โผ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ ML Risk Model โ
โ (Logistic Regression) โ
โ โ raw risk score (p) โ
โโโโโโโโโโโโโโฌโโโโโโโโโโโโโโโ
โ
โผ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ Constraint Validator โ
โ - Check 0 โค p โค 1 โ
โ - Check flag rules โ
โโโโโโโโโโโโโโฌโโโโโโโโโโโโโโโ
โ
(if violation detected)
โ
โผ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ Correction Engine โ
โ - Adjust risk value โ
โ - Enforce domain rules โ
โโโโโโโโโโโโโโฌโโโโโโโโโโโโโโโ
โ
โผ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ Lean 4 Verifier โ
โ Formal Proof Generation โ
โ (0 โค p โค 1 is proven) โ
โโโโโโโโโโโโโโฌโโโโโโโโโโโโโโโ
โ
โผ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
โ Final Verified Output โ
โ - Risk Score โ
โ - Validation Status โ
โ - Proof Certificate โ
โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
- ML Model โ Generates initial prediction (may be imperfect)
- Validator โ Detects logical inconsistencies
- Correction Layer โ Fixes invalid outputs instead of rejecting
- Lean Verifier โ Ensures mathematical correctness
- Final Output โ Fully validated + provable result
- Built a risk prediction model using Python (scikit-learn / boosting models)
- Uses features like:
- age
- risk flags (critical, chronic, history)
- Produces a raw risk score from the model
| Feature | Type | Description |
|---|---|---|
age |
Integer | Age of the individual |
flag |
Binary (0/1) | High-risk indicator |
- Algorithm: Logistic Regression
- Output: Probability score in [0, 1]
from sklearn.linear_model import LogisticRegression
model = LogisticRegression()
model.fit(X_train, y_train)
risk = model.predict_proba(X_input)[:, 1][0]
---
---
### โ
Constraint Engine
```markdown
The system enforces logical constraints on predictions:
```python
def correct_prediction(age, risk, flag):
if age > 60 and risk < 0.2:
risk = 0.2
if flag == 1 and risk < 0.5:
risk = 0.5
return max(0, min(risk, 1))
## โ๏ธ Constraint Validation & Rule Engine
The constraint engine sits between the ML model and the Lean verifier. It:
1. **Validates** the raw score against all domain invariants
2. **Corrects** violations using domain-aware logic (not clipping)
3. **Logs** every correction for auditability
## ๐ Real-World Relevance
### Where This Matters
| Domain | Risk Without Invariant AI | With Invariant AI |
|--------|--------------------------|-------------------|
| **Healthcare** | Model outputs `risk = 1.3` โ patient overtreated | Proof that score โ [0,1] before clinical use |
| **Finance** | Credit risk score violates regulatory bounds | Formally verified compliance with Basel III constraints |
| **Autonomous Vehicles** | Safety classifier outputs inconsistent with sensor flags | Lean proof of consistency before actuation |
| **Insurance** | Premium calculation violates actuarial invariants | Audit-ready proof certificate per prediction |
### Regulatory Alignment
- **EU AI Act (2024)**: Emphasizes transparency and reliability โ this system improves trust via validation and proof
- **FDA SaMD Guidelines**: Supports bounded and predictable outputs
- **ISO 26262 (Automotive)**: Aligns with the idea of verifying safety-critical outputs
## ๐ System Flow
1. User submits input โ {age: 67, flag: 1}
2. ML Model generates raw score โ 0.32
3. Constraint Engine checks:
โ flag=1 but score=0.32 < 0.5 โ CORRECTION โ score = 0.50
โ 0 โค 0.50 โค 1 โ PASS
4. Lean Verifier receives score=0.50:
theorem: 0 โค 0.50 โง 0.50 โค 1 โ PROVED โ
5. Response returned:
{
"risk": 0.50,
"valid": true,
"corrected": true,
"proof": "โ Lean Proof: 0.50 satisfies 0 โค p โค 1"
}| Layer | Technology |
|---|---|
| ML Model | Python, scikit-learn (Logistic Regression) |
| Constraint Engine | Python (custom rules) |
| Formal Verifier | Lean 4 |
| UI | Streamlit |
| Integration | Python โ Lean subprocess |
# Install Lean 4
curl https://raw.githubusercontent.com/leanprover/elan/master/elan-init.sh -sSf | sh
# Install Python dependencies
pip3 install -r requirements.txt# Start the API
uvicorn main:app --reload
# Example request
curl -X POST http://localhost:8000/predict \
-H "Content-Type: application/json" \
-d '{"age": 67, "flag_critical": true, "flag_chronic": false}'cd lean/
lake build
lake testcd invariantai lake env lean --run Invariantai/Check.lean 70
โ Example output:
โ Lean Proof: 0.7 satisfies 0 โค p โค 1
{
"risk": 0.50,
"valid": true,
"corrected": true,
"message": "Constraint applied",
"proof": "โ Proof: 0.50 satisfies 0 โค p โค 1"
}Most ML systems treat formal verification as a post-hoc audit tool. Invariant AI treats it as a live gatekeeper.
Key creative contributions:
- Proof-as-output: The system doesn't just predict โ it produces a proof certificate alongside every result. This is a new paradigm for ML APIs.
- Constraint-aware correction: Instead of rejecting invalid predictions (which breaks pipelines), the system corrects them and re-verifies โ making it robust in production.
- Lean as a runtime component: Lean is invoked in the request/response loop, not just at test time. This is an unusual and powerful application of a theorem prover.
- Composable invariants: Complex domain rules are broken into simple Lean lemmas that compose โ this mirrors how formal methods work in safety-critical hardware design, now applied to AI.
- End-to-end Lean proof โ eliminate the Python/Lean bridge and run the full pipeline in Lean 4 natively
- Proof caching โ cache proofs for repeated input patterns to reduce latency
- LLM-assisted spec generation โ use an LLM to auto-generate Lean invariants from natural language domain descriptions
- Multi-model ensemble verification โ verify that ensemble disagreements don't violate consistency invariants
- Dashboard โ real-time monitoring of correction rates, proof success rates, and invariant violation frequency
| Name | Role |
|---|---|
| Priyashree Panda | ML Engineer + Lean Verifier |
MIT License โ see LICENSE for details.
"A prediction without a proof is just a guess. Invariant AI makes every guess provable."
This project was built during the LeanLang for Verified Autonomy Hackathon (April 17โ18 + online through May 1, 2026) at the Indian Institute of Science (IISc), Bangalore.
The goal of the hackathon was to explore how formal methods (Lean 4) can be integrated with real-world systems to improve reliability, safety, and trust in AI.
Invariant AI was developed as a practical system combining:
- Machine Learning (risk prediction)
- Constraint validation (domain rules)
- Lean 4 formal verification (proof of correctness)
This project was made possible by:
- Emergence AI โ Hackathon sponsor
- Emergence India Labs โ Event organizer and research direction
- Indian Institute of Science (IISc), Bangalore โ Academic partner, mentorship, and technical guidance
- Hackathon Page: https://east.emergence.ai/hackathon-april2026.html
- Emergence India Labs: https://east.emergence.ai
- Emergence AI: https://www.emergence.ai
This project demonstrates a new paradigm:
๐ AI systems that donโt just predict โ but prove their correctness
By integrating Lean 4 into the ML pipeline, this system ensures:
- Predictions are always within valid bounds
- Logical constraints are enforced
- Outputs are accompanied by formal proofs
This approach moves AI systems closer to being:
- Trustworthy
- Auditable
- Safe for real-world deployment
๐ Built for Verified AI Systems of the Future