Wire per-job shared volume and blob-mode flags into engine chart #383
Open
asafyehezkel wants to merge 7 commits into
added 7 commits
June 4, 2026 12:23
Expose JOB_SHARED_MAX_CONCURRENT_JOBS / JOB_SHARED_DISK_FRACTION / JOB_SHARED_DISK_MIN_FREE_GB via engine-cm so the disk-bound queue-cap budget and the disk-free guard floor can be tuned per node (bigger nodes run more concurrent jobs). Defaults (2 / 0.6 / 2) match the in-code defaults; only used when payload_store=blob.
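As a rough illustration of how a consumer might read these three settings, here is a minimal sketch. The environment variable names and the defaults (2 / 0.6 / 2) come from the commit message above; the `JobSharedSettings` class and `load_job_shared_settings` function are hypothetical names introduced for this example and are not taken from the engine code.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class JobSharedSettings:
    """Hypothetical container for the three job-shared tuning knobs."""
    max_concurrent_jobs: int
    disk_fraction: float
    disk_min_free_gb: float


def load_job_shared_settings(env=None):
    """Read the JOB_SHARED_* variables, falling back to the stated defaults.

    `env` is any mapping of strings; it defaults to the process environment.
    """
    env = os.environ if env is None else env
    return JobSharedSettings(
        max_concurrent_jobs=int(env.get("JOB_SHARED_MAX_CONCURRENT_JOBS", "2")),
        disk_fraction=float(env.get("JOB_SHARED_DISK_FRACTION", "0.6")),
        disk_min_free_gb=float(env.get("JOB_SHARED_DISK_MIN_FREE_GB", "2")),
    )
```

With an empty mapping the sketch returns the defaults; a bigger node could set, for example, `JOB_SHARED_MAX_CONCURRENT_JOBS=8` in its ConfigMap to raise the cap.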
The orchestrator now prunes per-job shared dirs leaked by hard-killed jobs, so it needs the shared-volume base mounted (k3d single-node hostPath at /job-shared).
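The pruning described above could look roughly like the sketch below. It assumes one subdirectory per job under the shared-volume base, named after the job ID, and that the caller knows the set of live job IDs; both are assumptions for illustration, and `prune_leaked_job_dirs` is a hypothetical name, not the orchestrator's actual function.

```python
import shutil
from pathlib import Path


def prune_leaked_job_dirs(base, active_job_ids):
    """Delete per-job directories under `base` that belong to no active job.

    Returns the names of the directories that were selected for removal.
    Assumes (hypothetically) that each directory is named after its job ID.
    """
    base = Path(base)
    removed = []
    if not base.is_dir():
        # The shared volume is not mounted here; nothing to prune.
        return removed
    for entry in sorted(base.iterdir()):
        if entry.is_dir() and entry.name not in active_job_ids:
            # ignore_errors keeps one unreadable directory from aborting the sweep.
            shutil.rmtree(entry, ignore_errors=True)
            removed.append(entry.name)
    return removed
```

In the on-prem setup from the commit message, `base` would be the mounted `/job-shared` path, which is why the orchestrator pod needs that mount.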
REDIS_BLOB_POD_GB / REDIS_BLOB_MAXMEMORY_MB (default 1 GiB / 768 MiB) set the cluster default for per-job Redis in blob mode. A per-job pod_memory_override (via node-server's engine-redis-settings) still wins.
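The precedence rule (per-job override beats the cluster default) can be sketched as follows. The variable name `REDIS_BLOB_POD_GB` and its 1 GiB default come from the commit message; the function name and the assumption that the override is expressed in GiB are hypothetical.

```python
def resolve_redis_pod_memory_gb(env, pod_memory_override_gb=None):
    """Pick the per-job Redis pod memory for blob mode.

    A per-job override, when present, wins over the cluster default read
    from REDIS_BLOB_POD_GB. The GiB unit of the override is an assumption.
    """
    if pod_memory_override_gb is not None:
        return float(pod_memory_override_gb)
    return float(env.get("REDIS_BLOB_POD_GB", "1"))
```

For example, with `REDIS_BLOB_POD_GB=2` in the ConfigMap, a job without an override would get 2 GiB, while a job carrying an override of 4 would get 4 GiB.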
6e7c421 to
f2e9348
Helm side of the hybrid Redis->storage data plane (pairs with engine PR tensorleap/engine#2332).

- Adds `PAYLOAD_STORE` to engine-cm (default `redis`; `blob` offloads heavy queue payloads to the volume and Redis carries slim references).
- Adds `job_shared_max_concurrent_jobs`, `job_shared_disk_fraction`, `job_shared_disk_min_free_gb` (used only when `payload_store=blob`).

Defaults preserve current behavior (`payload_store=redis`). On-prem chart only; cloud (engine/helm-chart, EFS) is a follow-up.
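To make the difference between the two modes concrete, here is a toy sketch of the idea. A plain Python list stands in for the Redis queue, and the message layout (`inline` / `ref` keys), the file naming, and the `enqueue` / `dequeue` helpers are all hypothetical; the real engine format is not described in this PR.

```python
import json
import uuid
from pathlib import Path


def enqueue(queue, payload, job_dir, payload_store="redis"):
    """Push a payload inline (redis mode) or as a slim file reference (blob mode)."""
    if payload_store == "redis":
        # Current behavior: the whole payload travels through the queue.
        queue.append(json.dumps({"inline": payload}))
        return
    if payload_store != "blob":
        raise ValueError(f"unknown payload_store: {payload_store!r}")
    # Blob mode: write the heavy payload to the shared volume and
    # queue only a small reference to it.
    job_dir = Path(job_dir)
    job_dir.mkdir(parents=True, exist_ok=True)
    blob_path = job_dir / f"{uuid.uuid4().hex}.json"
    blob_path.write_text(json.dumps(payload))
    queue.append(json.dumps({"ref": str(blob_path)}))


def dequeue(queue):
    """Pop the oldest message and resolve a reference back into its payload."""
    message = json.loads(queue.pop(0))
    if "ref" in message:
        return json.loads(Path(message["ref"]).read_text())
    return message["inline"]
```

In this sketch the consumer does not need to know which mode the producer used, which is one way the default `redis` setting could keep existing behavior unchanged.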