Problem
/health currently returns:
{"status":"healthy","database":"connected"}
This reports "healthy" even when a bank has failed operations, stuck consolidation, or other operational issues. I had 24 failed consolidation operations and 37 pending items — /health gave no indication. The implementation (memory_engine.py:health_check()) only runs SELECT 1 against the database.
Proposal
Include operational health signals in the response:
consolidation_status: ok / degraded / stuck
failed_operations_count: total across all banks
pending_consolidation_count: total across all banks
last_consolidation_at: timestamp of most recent successful consolidation
Example:
{
  "status": "healthy",
  "database": "connected",
  "consolidation_status": "degraded",
  "failed_operations_count": 24,
  "pending_consolidation_count": 37,
  "last_consolidation_at": "2026-03-29T01:57:06Z"
}
If adding fields to /health is undesirable (some monitoring systems expect a simple 200/503), a separate /health/detailed endpoint would also work.
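A rough sketch of how the proposed fields could be aggregated across banks, under stated assumptions. BankStats, operational_health, and the 6-hour stuck_after threshold are all hypothetical and not part of the existing code. The rules separating ok / degraded / stuck are one possible interpretation, since this issue does not define them.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


@dataclass
class BankStats:
    """Hypothetical per-bank counters; the real per-bank stats may differ."""
    failed_operations: int
    pending_consolidation: int
    last_consolidation_at: Optional[datetime]  # timezone-aware, or None if never run


def operational_health(
    banks: List[BankStats],
    now: datetime,
    stuck_after: timedelta = timedelta(hours=6),  # assumed threshold
) -> Dict[str, object]:
    """Aggregate the proposed operational fields across all banks."""
    failed = sum(b.failed_operations for b in banks)
    pending = sum(b.pending_consolidation for b in banks)
    timestamps = [b.last_consolidation_at for b in banks if b.last_consolidation_at]
    last = max(timestamps) if timestamps else None

    # Assumed rules: "ok" when nothing failed and nothing is pending,
    # "stuck" when work is pending but no consolidation succeeded recently,
    # "degraded" for everything in between.
    if failed == 0 and pending == 0:
        consolidation_status = "ok"
    elif pending > 0 and (last is None or now - last > stuck_after):
        consolidation_status = "stuck"
    else:
        consolidation_status = "degraded"

    last_text = (
        last.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ") if last else None
    )
    return {
        "consolidation_status": consolidation_status,
        "failed_operations_count": failed,
        "pending_consolidation_count": pending,
        "last_consolidation_at": last_text,
    }
```

The returned dict could be merged into the existing /health body or served on its own from /health/detailed, so the simple 200/503 behaviour stays untouched for monitors that rely on it.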
Use case
Automated monitoring scripts and audit workflows that need to detect operational degradation without querying each bank's /stats individually.
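For illustration, a monitoring script might consume the proposed response roughly like this. It is only a sketch: operational_problems is a hypothetical helper, and the field names are the ones proposed in this issue, which do not exist in the current response.

```python
from typing import Dict, List


def operational_problems(payload: Dict[str, object]) -> List[str]:
    """Return human-readable findings from a parsed /health response body."""
    problems = []

    # A missing field is reported as "unknown", so today's two-field
    # response is flagged instead of silently passing.
    status = payload.get("consolidation_status", "unknown")
    if status != "ok":
        problems.append(f"consolidation_status={status}")

    for key in ("failed_operations_count", "pending_consolidation_count"):
        count = payload.get(key, 0)
        if isinstance(count, int) and count > 0:
            problems.append(f"{key}={count}")

    return problems
```

An empty list would mean nothing to report. Anything else can be logged or turned into an alert without a per-bank /stats call.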