
feat(zeam): flip --chain-worker default to on (c-2c part 2 burn-in)#171

Merged
ch4r10t33r merged 1 commit into main from feat/zclawz/chain-worker-default-on
May 6, 2026


@zclawz commented May 6, 2026

Per @ch4r10t33r's call after the c-2b/c-2c merge (blockblaz/zeam#828): turn the chain-worker on by default in the deployed config so the burn-in actually exercises the prod path.

What

| File | Before | After |
| --- | --- | --- |
| client-cmds/zeam-cmd.sh | `${ZEAM_CHAIN_WORKER:-}` (default empty → no flag) | `${ZEAM_CHAIN_WORKER-on}` (default on) |
| ansible/roles/zeam/defaults/main.yml | `zeam_chain_worker: ""` | `zeam_chain_worker: "on"` |

The shell side uses ${VAR-on} (no colon) deliberately: the colon form would also overwrite an explicitly-empty ZEAM_CHAIN_WORKER= with on, leaving operators no way to suppress the flag against older zeam builds. The non-colon form lets ZEAM_CHAIN_WORKER= (explicit empty) emit no flag — useful as an escape hatch.
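To make the difference between the two expansion forms concrete, here is a minimal standalone sketch. The `show` helper is hypothetical and is not part of zeam-cmd.sh; it only prints what each form expands to for an unset, an empty, and a set variable.

```shell
#!/usr/bin/env bash
# Hypothetical demo helper: print both expansion forms for the current value.
show() {
  # $1 is a label; ZEAM_CHAIN_WORKER is read as currently set in the shell.
  printf '%s: no-colon=[%s] colon=[%s]\n' \
    "$1" "${ZEAM_CHAIN_WORKER-on}" "${ZEAM_CHAIN_WORKER:-on}"
}

unset ZEAM_CHAIN_WORKER
show unset   # unset: no-colon=[on] colon=[on]

ZEAM_CHAIN_WORKER=
show empty   # empty: no-colon=[] colon=[on]

ZEAM_CHAIN_WORKER=off
show off     # off: no-colon=[off] colon=[off]
```

Only the explicitly-empty case differs: the no-colon form preserves the empty value, which is what keeps the escape hatch available.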

Override behavior (all four cases)

# 1. Default — emits --chain-worker on (the prod path post-c-2b)
unset ZEAM_CHAIN_WORKER && ./spin-node.sh
# 2. Kill-switch — emits --chain-worker off (legacy synchronous path)
export ZEAM_CHAIN_WORKER=off
# 3. Suppress entirely (older zeam without the flag) — emits nothing
export ZEAM_CHAIN_WORKER=
# 4. Invalid value — WARN logged, no flag emitted
export ZEAM_CHAIN_WORKER=hello
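
The four cases above could be produced by logic along these lines. This is a sketch under assumptions, not the actual contents of client-cmds/zeam-cmd.sh: the function name `chain_worker_flag` and the exact WARN wording are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the CLI fragment for the current environment.
chain_worker_flag() {
  # No-colon default: unset becomes "on", an explicit empty value stays empty.
  value="${ZEAM_CHAIN_WORKER-on}"
  case "$value" in
    on|off) printf '%s\n' "--chain-worker $value" ;;
    "")     ;;  # explicit empty: emit nothing, so older builds still start
    *)      printf 'WARN: invalid ZEAM_CHAIN_WORKER=%s, omitting --chain-worker\n' "$value" >&2 ;;
  esac
}
```

The warning goes to stderr so that a caller capturing stdout with `$(chain_worker_flag)` receives either a well-formed flag or nothing.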

Ansible side mirrors:

# 1. Default
ansible-playbook ... # picks up zeam_chain_worker: "on"
# 2. Kill-switch
ansible-playbook ... -e zeam_chain_worker=off
# 3. Suppress (against older image)
ansible-playbook ... -e zeam_chain_worker=

Required image

This PR REQUIRES blockblaz/zeam:devnet4 >= v0.4.14 (the c-2b release). Older images (v0.4.13 / pre-c-1) do not recognise --chain-worker at all and will fail to start; if you need to deploy this PR against an older image, the ZEAM_CHAIN_WORKER= / zeam_chain_worker: "" escape hatch suppresses the flag entirely.

Caveat for v0.4.14: its --chain-worker is a presence-only bool. simargs sets it to true and ignores any value token, so --chain-worker on works as "enable", but --chain-worker off would ALSO enable, with off treated as a stray positional. On v0.4.14 the kill-switch is therefore the empty ZEAM_CHAIN_WORKER= form, which suppresses the flag entirely. A future zeam release with a value-taking enum would let --chain-worker off actually disable the worker.
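
For deploy wrappers that need to pick a safe setting per image, the version behaviour described here can be classified with a small helper. This is a sketch under assumptions: the function name `chain_worker_support` is hypothetical, tags are assumed to look like `vMAJOR.MINOR.PATCH` with plain decimal parts, and the `enum` boundary at v0.4.15 is taken from the commit message on this PR rather than verified against a release.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify how a zeam image tag handles --chain-worker.
# The subshell body ( ... ) keeps the scratch variables out of the caller's scope.
chain_worker_support() (
  ver="${1#v}"                               # drop an optional leading "v"
  major="${ver%%.*}"; rest="${ver#*.}"
  minor="${rest%%.*}"; patch="${rest#*.}"
  # Collapse to one integer for comparison (assumes each part is below 1000).
  n=$(( major * 1000000 + minor * 1000 + patch ))
  if [ "$n" -lt 4014 ]; then
    echo none   # flag not recognised; the node fails to start if it is passed
  elif [ "$n" -eq 4014 ]; then
    echo bool   # presence-only; any value token is ignored
  else
    echo enum   # value-taking on/off (assumed from the commit message)
  fi
)
```

A wrapper could then export `ZEAM_CHAIN_WORKER=` whenever the result is not `enum`, which matches the suppression advice given for older images.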

Burn-in target (per zeam #803 c-2c plan)

  • ≥24h with --chain-worker on
  • Watch zeam_lock_hold_seconds{lock="states", site="onBlock.commit"} p99 — should drop dramatically vs slice (b) baseline (rwlock no longer load-bearing under chain-worker single-writer regime).
  • Watch lean_chain_state_refcount_distribution — typical=1, occasional 2-4, never >16 (entries stuck >16 indicate a leaked reader acquire).
  • Watch lean_chain_queue_dropped_total — should be 0 under nominal load; non-zero means producer/consumer mismatch worth investigating.
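
The dropped-queue check in the last bullet can be spot-checked from a shell. The sketch below assumes the node exposes its metrics in the Prometheus text exposition format without trailing timestamps; the helper name `sum_counter` and the `METRICS_URL` variable in the usage comment are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sum every sample of one counter from Prometheus
# text-format metrics read on stdin. Assumes no trailing timestamps.
sum_counter() {
  awk -v name="$1" '
    /^#/ { next }                              # skip HELP and TYPE comments
    { metric = $1; sub(/\{.*/, "", metric) }   # strip any {label="..."} set
    metric == name { total += $NF }            # the value is the last field
    END { printf "%d\n", total }               # prints 0 when nothing matched
  '
}

# Usage (METRICS_URL is an assumption, point it at the node's metrics endpoint):
#   curl -s "$METRICS_URL" | sum_counter lean_chain_queue_dropped_total
```

A non-zero result during the burn-in window is the signal to investigate a producer/consumer mismatch.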

Validation

$ bash -n client-cmds/zeam-cmd.sh && echo OK
OK

# unset env  → emits `--chain-worker on` ✅
# =off       → emits `--chain-worker off` ✅
# =''        → emits no `--chain-worker` flag ✅
# =hello     → WARN logged, no flag emitted ✅

…-in)

Per @ch4r10t33r's call: shipping the chain-worker as default-off would
mean the burn-in tests exactly the path nobody actually runs in prod.
Flip the default to on in both deployment paths:

  * client-cmds/zeam-cmd.sh: ZEAM_CHAIN_WORKER default '' → 'on'.
    Uses ${VAR-on} (no colon) so an explicit ZEAM_CHAIN_WORKER=
    can still suppress the flag for older zeam builds; the colon
    form would also overwrite the empty value.

  * ansible/roles/zeam/defaults/main.yml: zeam_chain_worker default
    '' → 'on'. Override via -e zeam_chain_worker=off (or in
    inventory group_vars) for the kill-switch path; set to '' to
    suppress entirely.

Matches the zeam compiled-in default flip in blockblaz/zeam #830
(--chain-worker enum {on, off}, default .on). REQUIRES a zeam
build with chain-worker support, i.e. blockblaz/zeam:devnet4 >=
v0.4.15. v0.4.14 has a broken bool CLI shape (#830 fixes that)
and v0.4.13 doesn't recognise --chain-worker at all; against
either, set ZEAM_CHAIN_WORKER= or zeam_chain_worker: '' to
suppress the flag.

Verified all four script branches:
  * unset env → emits --chain-worker on (the new default) ✅
  * ZEAM_CHAIN_WORKER=off → emits --chain-worker off (kill-switch) ✅
  * ZEAM_CHAIN_WORKER='' → emits no flag (older-zeam compat) ✅
  * ZEAM_CHAIN_WORKER=hello → WARN logged, no flag emitted ✅

@ch4r10t33r ch4r10t33r left a comment


Looks good.

@ch4r10t33r ch4r10t33r merged commit 202a977 into main May 6, 2026
4 checks passed
@ch4r10t33r ch4r10t33r deleted the feat/zclawz/chain-worker-default-on branch May 6, 2026 17:03