fix: treat Scrypt WS Connection closed as transient by TaprootFreak · Pull Request #3655 · DFXswiss/api

TaprootFreak · 2026-04-30T10:06:46Z

Summary

Fixes liquidity-management pipelines being permanently marked FAILED on transient Scrypt WebSocket disconnects.

When the Scrypt WS drops, all pending requests are rejected with new Error('Connection closed'). This surfaced as an OrderFailedException in ScryptAdapter.checkTradeCompletion, marking the order Failed → action 233 has no onFail → pipeline FAILED → rule auto-paused → mail. Meanwhile, the underlying order on Scrypt is unaffected and no funds are moved.
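
To illustrate the failure mode described above, here is a minimal sketch of a request/response WebSocket client that tracks in-flight requests by id and rejects all of them when the socket closes. The class name PendingRequests and its methods are hypothetical and not taken from the DFXswiss/api code; only the 'Connection closed' error message comes from this PR.

```typescript
// Hypothetical, simplified model of in-flight request tracking.
// Only the 'Connection closed' message is taken from the PR description.
type PendingEntry = { resolve: (value: unknown) => void; reject: (error: Error) => void };

class PendingRequests {
  private readonly pending = new Map<number, PendingEntry>();
  private nextReqId = 1;

  // Registers a request and returns its id plus a promise for the response.
  register(): { reqId: number; response: Promise<unknown> } {
    const reqId = this.nextReqId++;
    const response = new Promise<unknown>((resolve, reject) => {
      this.pending.set(reqId, { resolve, reject });
    });
    return { reqId, response };
  }

  // Called when a response for reqId arrives; unknown ids are ignored.
  resolve(reqId: number, value: unknown): void {
    const entry = this.pending.get(reqId);
    if (!entry) return;
    this.pending.delete(reqId);
    entry.resolve(value);
  }

  // Called when the socket closes: every request still in flight fails
  // with the same generic error, regardless of the order's real state.
  onClose(): void {
    this.pending.forEach((entry) => entry.reject(new Error('Connection closed')));
    this.pending.clear();
  }
}
```

The point of the sketch is that the rejection carries no information about the order itself, which is why treating it as an order failure is misleading.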

Changes

scrypt-websocket-connection.ts: Extract the retry logic added in #3594 (Various improvements) into a private helper, retryOnTransientWsError, and apply it to fetch as well (previously it covered only fetchAll). fetch is used by fetchExecutionReports and fetchOrderBook, both on the hot path of checkTrade.
scrypt.adapter.ts: In checkTradeCompletion, classify Connection closed / unknown reqid errors as transient and return false, so the order stays IN_PROGRESS and is retried on the next 10s cron tick. Genuine errors still throw OrderFailedException.
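
A rough sketch of what a retry helper along these lines could look like. The helper name retryOnTransientWsError and the two error markers come from this PR; the signature, the retry count, the delay, and the looksTransient function are assumptions made for illustration and may differ from the actual implementation.

```typescript
// Sketch only: signature, maxRetries and delayMs are assumptions, not the PR's code.
const TRANSIENT_MARKERS = ['connection closed', 'unknown reqid'];

// Hypothetical matcher: true if the error message contains a transient marker.
function looksTransient(error: unknown): boolean {
  const message = error instanceof Error ? error.message : String(error);
  const lower = message.toLowerCase();
  return TRANSIENT_MARKERS.some((marker) => lower.includes(marker));
}

// Runs action, retrying only when it fails with a transient WS error.
// Non-transient errors, and transient ones after the last retry, are rethrown.
async function retryOnTransientWsError<T>(
  action: () => Promise<T>,
  maxRetries = 3,
  delayMs = 1000,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await action();
    } catch (error) {
      if (attempt >= maxRetries || !looksTransient(error)) throw error;
      // Wait before the next attempt so the socket has a chance to reconnect.
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

With a helper of this shape, both fetch and fetchAll can wrap their request in the same call, so any future fetch-style method gets identical retry behavior.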

Repro / data point

Pipeline 60738 (2026-04-30 07:21 UTC, rule 313 Scrypt/EUR redundancy)
Order 122805 (sell 70'003.98 EUR → USDT): 5 ClOrdIds in 3 min (4 edits), final WS drop → Connection closed → wrongly marked Failed
Balance audit on Scrypt: EUR was never spent (next pipeline 60741 sold the same EUR + 40k more)
20+ similar incidents since 2026-03-24, all matching this pattern

Test plan

CI green
After deploy: monitor next Scrypt redundancy pipeline; if WS drops mid-check, verify order stays IN_PROGRESS and resumes (vs. flipping to Failed)
Verify no double-execution: getOrderStatus cache + 30-day fallback in scrypt.service.ts:301 already dedupes by ClOrdID, so retry-on-next-tick reuses the existing correlation
Optional: tail logs for Retrying fetch ... after transient error and Transient WS error checking order to gauge frequency

The Scrypt WebSocket adapter rejects all pending requests with 'Connection closed' when the WS disconnects. Previously this surfaced as a permanent OrderFailedException in the liquidity management pipeline, causing the rule to be paused even though the underlying order on Scrypt was still alive (no fill, no money moved).

Two changes:

- ScryptWebSocketConnection: extend the 'fetchAll' retry pattern to also cover 'fetch'. Refactor the retry logic into a shared helper so any future fetch-style call gets the same treatment.
- ScryptAdapter.checkTradeCompletion: when the underlying error is a transient WS error (Connection closed / unknown reqid), return false instead of throwing OrderFailedException, so the order stays IN_PROGRESS and is retried on the next cron tick.

Reproduced via pipeline 60738 (rule 313, Scrypt/EUR redundancy): order 122805 went through 5 ClOrdIds in 3 minutes before WS dropped during a check, was wrongly marked Failed; balance audit confirmed the EUR were never spent.
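
The adapter-side change could be sketched roughly as follows. This is a simplified, standalone approximation: the real checkTradeCompletion is a method on ScryptAdapter with its own parameters, and the OrderFailedException class below is only a stand-in for the project's exception of the same name. The checkTrade callback parameter is hypothetical.

```typescript
// Stand-in for the project's exception; only the name comes from the PR.
class OrderFailedException extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'OrderFailedException';
  }
}

// Returns true when the trade is complete and false when it should be
// checked again on the next cron tick. Throws only for genuine failures.
async function checkTradeCompletion(checkTrade: () => Promise<boolean>): Promise<boolean> {
  try {
    return await checkTrade();
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);

    // Transient WS errors say nothing about the order itself,
    // so the order is left IN_PROGRESS and retried later.
    if (/connection closed|unknown reqid/i.test(message)) return false;

    // Anything else is still treated as a real order failure.
    throw new OrderFailedException(message);
  }
}
```

Returning false rather than throwing relies on the caller polling again, which matches the cron-driven retry described in the commit message.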

…pter Centralize the transient WS error markers ('Connection closed' / 'unknown reqid') and a shared isTransientWsError helper in scrypt-websocket-connection. Both retryOnTransientWsError and the ScryptAdapter check now use the same function, eliminating the duplicated string list and aligning the case-insensitive matching with isBalanceTooLowError elsewhere in the adapter.
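
As a sketch of the centralization described in this commit, the marker list and the matcher might live side by side as shown below. The function name isTransientWsError and the two marker strings come from the commit message; the constant name TRANSIENT_WS_ERROR_MARKERS and the exact matching logic are assumptions.

```typescript
// Single source of truth for the transient markers (constant name is hypothetical).
const TRANSIENT_WS_ERROR_MARKERS = ['connection closed', 'unknown reqid'];

// Case-insensitive substring match against the error message.
// Accepts unknown so it can be called directly from a catch block.
function isTransientWsError(error: unknown): boolean {
  const message = error instanceof Error ? error.message : String(error);
  const lower = message.toLowerCase();
  return TRANSIENT_WS_ERROR_MARKERS.some((marker) => lower.includes(marker));
}
```

Keeping one list means the retry helper and the adapter cannot drift apart if a new transient marker is added later.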

TaprootFreak marked this pull request as ready for review April 30, 2026 10:37

TaprootFreak requested a review from davidleomay as a code owner April 30, 2026 10:37

TaprootFreak wants to merge 2 commits into develop from fix/scrypt-ws-connection-resilience