You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Refactor the AML decision pipeline so that every value of decision-driving enums (RiskStatus, UserDataStatus, KycStatus, PhoneCallStatus, …) is forced — at compile time — to map to an explicit AML outcome. Today the logic is a long if-chain over getter properties (isBlocked, isRiskBlocked, isSuspicious, …), which makes it possible to add a new enum value without any handler. PR #3638 fixed one concrete instance of this class of bug, but the architectural gap that allowed it remains.
This issue proposes a layered fix: a Record<Enum, AmlError | null> pattern enforced by tsc --noEmit (already in CI as npm run type-check), plus a Kuma-based production invariant monitor as a universal backstop.
Background — the bug that triggered this
PR #2389 (DEV-4326 userData RiskStatus, merged 2025-08-27) introduced a SQL-level filter in the buy-crypto and buy-fiat AML preparation services:
The intent — keep suspicious users out of the auto-approval path — was correct. The implementation was not: there was no corresponding rule in `AmlHelperService.getAmlErrors` that would assign `SUSPICIOUS` users an AML outcome. Result: any `buy_crypto` / `buy_fiat` whose user is flagged `Suspicious` between deposit arrival and the AML cron tick gets silently filtered out, ends up with `amlCheck = NULL` and `status = Created`, and stays there indefinitely.
Concrete production incident: `buy_crypto 120973` (35'000 CHF, userDataId 383865, transaction 312001, bank_tx 197696), stuck since 2026-04-28 07:29 UTC. It was the only entry in the entire DB with `amlCheck IS NULL AND status = 'Created'`. The bug had been latent in develop for ~8 months before it hit a real customer.
The 14-day expired-pending reaper in `aml-helper.service.ts:642` does not cover this case — it only acts on entries already in `PENDING`, never on `null`.
PR #3638 fixed the specific symptom by adding a `USER_DATA_SUSPICIOUS` rule that mirrors the existing `USER_DATA_BLOCKED` handling. But the same anti-pattern (filter at query level + missing handler in the rule chain) can occur in other places and for other enums. This issue addresses the architectural cause.
Why current testing did not catch it
No test exists for `doAmlCheck` that exercises each `RiskStatus` value.
The only AML test (`aml-helper.service.spec.ts`) covers a single feature flag (`isTrustedReferrer`).
TypeScript already enforces exhaustiveness for `AmlError → AmlErrorResult` via the mapped type `{ [b in AmlError]: ... }` — but no equivalent enforcement exists for `RiskStatus → AmlError` (or any other input enum). Adding a new enum value compiles cleanly even if no handler is defined.
The bug class is invisible to unit tests because the failure is ambient (entry never updated) rather than an exception.
Proposed solution — layered
Tier 1 — Record-map + compile-time exhaustiveness for AML input enums (this issue)
Replace the current ad-hoc if-chain in `AmlHelperService.getAmlErrors` (`aml-helper.service.ts:78–82` and follow-ups at lines 221, 391) with explicit `Record<Enum, AmlError | null>` maps:
```ts
// src/subdomains/core/aml/enums/aml-policy.ts (new file)
import { AmlError } from './aml-error.enum';
import { RiskStatus, UserDataStatus, KycStatus } from '...';
export const UserDataStatusAmlError: Record<UserDataStatus, AmlError | null> = {
[UserDataStatus.NA]: null,
[UserDataStatus.ACTIVE]: null,
[UserDataStatus.BLOCKED]: AmlError.USER_DATA_BLOCKED,
[UserDataStatus.DEACTIVATED]: AmlError.USER_DATA_DEACTIVATED,
// ...every value must be present, or tsc fails
};
```
`getAmlErrors` then becomes:
```ts
const riskError = RiskStatusAmlError[entity.userData.riskStatus];
if (riskError) errors.push(riskError);
const statusError = UserDataStatusAmlError[entity.userData.status];
if (statusError) errors.push(statusError);
```
The `Record<Enum, X>` type forces every enum member to appear in the map — a missing key is a `tsc` error. The existing `AmlErrorResult` mapping (`aml-error.enum.ts:91`) already proves the pattern works in this codebase and is idiomatic.
Flow-specific variants (`isRiskBuyCryptoBlocked`, `isRiskBuyFiatBlocked` at lines 221/391) require either two separate maps (`RiskStatusBuyCryptoAmlError`, `RiskStatusBuyFiatAmlError`) or a single `Record<RiskStatus, { generic, buyCrypto, buyFiat }>` map. Recommend the second — keeps all riskStatus knowledge in one place.
Tier 2 — Apply the same pattern to other AML input enums
Once the pattern is in place for `RiskStatus`, extend to:
`UserDataStatus` (currently `isBlocked`, `isDeactivated`, `isPaymentStatusEnabled` getters in `user-data.entity.ts`)
`PhoneCallStatus` (currently nested ternaries at `aml-helper.service.ts:70–76` and `120–126` — already hard to read, easy to break on enum change)
`AccountType` if it gates AML decisions
Each migration follows the same shape: extract an `AmlError | null` Record, replace the matching getter call in `getAmlErrors`, delete the now-unused getter if it has no other callers.
Add a Kuma push monitor (or a scheduled API job) that fires when:
```sql
SELECT COUNT(*) FROM buy_crypto
WHERE amlCheck IS NULL AND status = 'Created' AND created < DATEADD(hour, -2, GETDATE());
-- and identical for buy_fiat
```
returns `> 0`. This is independent of the type system — it catches any future filter-without-handler bug, in any pipeline, in any service, even ones the type-level work above does not cover. ~30 min of setup, recurring zero cost.
What we explicitly do not propose
Mutation testing (Stryker): too slow for per-PR CI, requires substantially more existing test coverage to be useful.
Property-based testing (fast-check): large initial investment in fixture infrastructure; premature given current AML test coverage.
Pure parametric runtime tests as the only defense: the repo's AML test setup is sparse and fragile fixtures would push the bug detection back to test-quality questions rather than guarantees.
Implementation plan
Create `src/subdomains/core/aml/enums/aml-policy.ts` with `RiskStatusAmlError` (covering generic + per-flow variants).
Refactor `AmlHelperService.getAmlErrors` (`aml-helper.service.ts:78–82`) to consume the map. Drop the if-chain for risk-status and the duplicated checks at lines 221, 391.
Verify all callers of `UserData.isSuspicious`, `isRiskBlocked`, `isRiskBuyCryptoBlocked`, `isRiskBuyFiatBlocked` getters and decide whether to keep the getters (used by Auth guard `user-active.guard.ts`) or inline. Default: keep the getters, they read well at call sites that are not the AML rule chain.
Repeat for `UserDataStatus` and `KycStatus` (separate commits, easier review).
Add the Kuma monitor for stuck `amlCheck IS NULL` rows. Coordinate with infra — see status.dfx.swiss / the Kuma instance referenced internally.
Update `CONTRIBUTING.md` — add a short section under "AML / decision logic" stating: "All decision-driving enums must dispatch via a `Record<Enum, …>` map; never via SQL-level filtering of enum values."
Acceptance criteria
`RiskStatusAmlError` (or equivalent) exists; adding a hypothetical new `RiskStatus` value fails `npm run type-check` until handled.
`AmlHelperService.getAmlErrors` no longer contains the SQL-level enum filter pattern (`Not(RiskStatus.X)` etc.) for AML input dispatch.
At least one decision-driving enum besides `RiskStatus` (`UserDataStatus` recommended first) migrated to the same pattern.
Kuma alarm exists for `amlCheck IS NULL AND status = 'Created' AND created < now()-2h` on `buy_crypto` and `buy_fiat`. Triggering it in DEV produces an alert.
CONTRIBUTING.md documents the dispatch convention.
No regression: `buy_crypto 120973` and analogous cases continue to receive a non-null `amlCheck` after deploy.
Summary
Refactor the AML decision pipeline so that every value of decision-driving enums (
RiskStatus,UserDataStatus,KycStatus,PhoneCallStatus, …) is forced — at compile time — to map to an explicit AML outcome. Today the logic is a long if-chain over getter properties (isBlocked,isRiskBlocked,isSuspicious, …), which makes it possible to add a new enum value without any handler. PR #3638 fixed one concrete instance of this class of bug, but the architectural gap that allowed it remains.This issue proposes a layered fix: a
Record<Enum, AmlError | null>pattern enforced bytsc --noEmit(already in CI asnpm run type-check), plus a Kuma-based production invariant monitor as a universal backstop.Background — the bug that triggered this
PR #2389 (DEV-4326 userData RiskStatus, merged 2025-08-27) introduced a SQL-level filter in the buy-crypto and buy-fiat AML preparation services:
The intent — keep suspicious users out of the auto-approval path — was correct. The implementation was not: there was no corresponding rule in `AmlHelperService.getAmlErrors` that would assign `SUSPICIOUS` users an AML outcome. Result: any `buy_crypto` / `buy_fiat` whose user is flagged `Suspicious` between deposit arrival and the AML cron tick gets silently filtered out, ends up with `amlCheck = NULL` and `status = Created`, and stays there indefinitely.
Concrete production incident: `buy_crypto 120973` (35'000 CHF, userDataId 383865, transaction 312001, bank_tx 197696), stuck since 2026-04-28 07:29 UTC. It was the only entry in the entire DB with `amlCheck IS NULL AND status = 'Created'`. The bug had been latent in develop for ~8 months before it hit a real customer.
The 14-day expired-pending reaper in `aml-helper.service.ts:642` does not cover this case — it only acts on entries already in `PENDING`, never on `null`.
PR #3638 fixed the specific symptom by adding a `USER_DATA_SUSPICIOUS` rule that mirrors the existing `USER_DATA_BLOCKED` handling. But the same anti-pattern (filter at query level + missing handler in the rule chain) can occur in other places and for other enums. This issue addresses the architectural cause.
Why current testing did not catch it
Proposed solution — layered
Tier 1 — Record-map + compile-time exhaustiveness for AML input enums (this issue)
Replace the current ad-hoc if-chain in `AmlHelperService.getAmlErrors` (`aml-helper.service.ts:78–82` and follow-ups at lines 221, 391) with explicit `Record<Enum, AmlError | null>` maps:
```ts
// src/subdomains/core/aml/enums/aml-policy.ts (new file)
import { AmlError } from './aml-error.enum';
import { RiskStatus, UserDataStatus, KycStatus } from '...';
export const RiskStatusAmlError: Record<RiskStatus, AmlError | null> = {
[RiskStatus.NA]: null,
[RiskStatus.RELEASED]: null,
[RiskStatus.BLOCKED]: AmlError.USER_DATA_BLOCKED,
[RiskStatus.SUSPICIOUS]: AmlError.USER_DATA_SUSPICIOUS,
[RiskStatus.BLOCKED_BUY_CRYPTO]: AmlError.USER_DATA_BLOCKED, // overridden per-flow, see below
[RiskStatus.BLOCKED_BUY_FIAT]: AmlError.USER_DATA_BLOCKED,
};
export const UserDataStatusAmlError: Record<UserDataStatus, AmlError | null> = {
[UserDataStatus.NA]: null,
[UserDataStatus.ACTIVE]: null,
[UserDataStatus.BLOCKED]: AmlError.USER_DATA_BLOCKED,
[UserDataStatus.DEACTIVATED]: AmlError.USER_DATA_DEACTIVATED,
// ...every value must be present, or tsc fails
};
```
`getAmlErrors` then becomes:
```ts
const riskError = RiskStatusAmlError[entity.userData.riskStatus];
if (riskError) errors.push(riskError);
const statusError = UserDataStatusAmlError[entity.userData.status];
if (statusError) errors.push(statusError);
```
The `Record<Enum, X>` type forces every enum member to appear in the map — a missing key is a `tsc` error. The existing `AmlErrorResult` mapping (`aml-error.enum.ts:91`) already proves the pattern works in this codebase and is idiomatic.
Flow-specific variants (`isRiskBuyCryptoBlocked`, `isRiskBuyFiatBlocked` at lines 221/391) require either two separate maps (`RiskStatusBuyCryptoAmlError`, `RiskStatusBuyFiatAmlError`) or a single `Record<RiskStatus, { generic, buyCrypto, buyFiat }>` map. Recommend the second — keeps all riskStatus knowledge in one place.
Tier 2 — Apply the same pattern to other AML input enums
Once the pattern is in place for `RiskStatus`, extend to:
Each migration follows the same shape: extract an `AmlError | null` Record, replace the matching getter call in `getAmlErrors`, delete the now-unused getter if it has no other callers.
Tier 3 — Production invariant monitor (parallel, low cost, universal backstop)
Add a Kuma push monitor (or a scheduled API job) that fires when:
```sql
SELECT COUNT(*) FROM buy_crypto
WHERE amlCheck IS NULL AND status = 'Created' AND created < DATEADD(hour, -2, GETDATE());
-- and identical for buy_fiat
```
returns `> 0`. This is independent of the type system — it catches any future filter-without-handler bug, in any pipeline, in any service, even ones the type-level work above does not cover. ~30 min of setup, recurring zero cost.
What we explicitly do not propose
Implementation plan
Acceptance criteria
References