Propagate proxy env to engine-generic workers via engine-cm #393

Merged
asafyehezkel merged 1 commit into master from engine-job-proxy-env on Jun 28, 2026

@asafyehezkel

Follow-up to #392 (which covered the pippin push-job init container). This extends proxy propagation to the engine-generic worker pods and the rest of the engine workloads.

Why

The engine process spawns engine-generic worker pods itself (as a Deployment, via create_namespaced_deployment in the engine repo's deployment_manager.py), and those pods source their env from the engine-cm ConfigMap via envFrom, not from the engine container's own env. So putting proxy vars on the engine container would not reach the workers; the correct single source is engine-cm.
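To illustrate why engine-cm is the right place, here is a minimal sketch of a worker Deployment body whose container takes its environment from a ConfigMap through envFrom. This is not the code from deployment_manager.py: the function name, labels, and image are hypothetical, and only the envFrom/configMapRef shape is standard Kubernetes.

```python
def build_worker_deployment(name: str, image: str, configmap: str = "engine-cm") -> dict:
    """Return a Deployment manifest (plain dict) for a worker pod.

    Hypothetical helper for illustration only; the real logic lives in the
    engine repo's deployment_manager.py and may differ.
    """
    labels = {"app": name}
    container = {
        "name": name,
        "image": image,
        # Every key in the ConfigMap becomes an env var in the container.
        # Nothing is copied from the env of the process that creates the pod.
        "envFrom": [{"configMapRef": {"name": configmap}}],
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": 1,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
        },
    }


if __name__ == "__main__":
    manifest = build_worker_deployment("engine-generic", "example/engine:latest")
    print(manifest["spec"]["template"]["spec"]["containers"][0]["envFrom"])
```

Because the container spec carries no explicit env entries, anything the worker needs (including proxy settings) has to be present in the referenced ConfigMap.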

What

Add HTTP_PROXY/HTTPS_PROXY/NO_PROXY (upper + lower case) to engine-cm, rendered only when set. One edit reaches every envFrom: engine-cm consumer:

  • engine main job container
  • engine-generic worker Deployment (created by engine code)
  • engine-orchestrator
  • copy-deps post-deploy job

In-pod os.fork() children inherit the pod env automatically. Reuses the existing engine proxy values (GetEngineProxyEnv; .Values.http_proxy/https_proxy/no_proxy) introduced in #392, so there is no installer change and no engine-code change (the worker Deployment references engine-cm by name, so it picks up the keys at runtime). The pippin init container keeps its explicit proxy env (it has no envFrom).
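The inheritance claim for os.fork() children can be checked with a small POSIX-only sketch. The helper name is hypothetical and the proxy URL is a placeholder; the point is only that a forked child starts with a copy of the parent's environment.

```python
import os


def forked_child_sees(key: str) -> bool:
    """Fork and report whether the child sees the parent's value for `key`."""
    expected = os.environ.get(key)
    pid = os.fork()
    if pid == 0:
        # Child process: its environment is a copy of the parent's at fork time.
        ok = expected is not None and os.environ.get(key) == expected
        os._exit(0 if ok else 1)
    # Parent process: wait for the child and decode its exit status.
    _, status = os.waitpid(pid, 0)
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0


if __name__ == "__main__":
    os.environ["HTTPS_PROXY"] = "http://proxy.example:3128"  # placeholder value
    print(forked_child_sees("HTTPS_PROXY"))
```

This is why no extra plumbing is needed for forked children: once the pod env contains the proxy keys, every process forked inside the pod has them too.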

NO_PROXY is already augmented with in-cluster targets (zot registry, minio, .svc, .cluster.local, 10.42/10.43 CIDRs) so workers' in-cluster traffic bypasses the proxy.
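As a rough illustration of that augmentation (not the existing implementation, which is not shown in this PR), the merge could look like the sketch below. The function name is hypothetical, the CIDRs are written out as /16 ranges on the assumption that "10.42/10.43" refers to the usual pod and service ranges, and the registry and minio hostnames are omitted because they are not given here.

```python
# Assumed in-cluster bypass targets; service hostnames are left out on purpose.
IN_CLUSTER_NO_PROXY = (
    ".svc",
    ".cluster.local",
    "10.42.0.0/16",  # assumption: pod CIDR
    "10.43.0.0/16",  # assumption: service CIDR
)


def augment_no_proxy(user_no_proxy: str, extra=IN_CLUSTER_NO_PROXY) -> str:
    """Append in-cluster targets to a user-supplied NO_PROXY list.

    Entries are trimmed, empty entries are dropped, and duplicates are
    skipped while keeping the first occurrence's position.
    """
    merged = []
    for item in list(user_no_proxy.split(",")) + list(extra):
        item = item.strip()
        if item and item not in merged:
            merged.append(item)
    return ",".join(merged)


if __name__ == "__main__":
    print(augment_no_proxy("localhost, .svc"))
```

Keeping the user's entries first and deduplicating means a value that already lists an in-cluster suffix is not repeated.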

Validation

  • helm template: engine-cm has 0 proxy keys by default; all 6 keys render (inside the engine-cm ConfigMap) when proxy values are set.
  • go build, gofmt, and full go test ./... all pass.
  • Chart bumps: tensorleap 1.6.35→1.6.36, tensorleap-engine 1.0.615→1.0.616.

🤖 Generated with Claude Code

The engine spawns engine-generic worker pods (as a Deployment) by code,
and those pods source their env from the engine-cm ConfigMap via envFrom
(deployment_manager.py), not from the engine container's own env. So add
HTTP(S)_PROXY/NO_PROXY to engine-cm: one edit reaches the engine main
container, engine-generic workers, engine-orchestrator and the copy-deps
job (all envFrom: engine-cm), and the in-pod os.fork() children inherit
it. Reuses the existing engine proxy values (GetEngineProxyEnv); no
installer or engine-code change needed. The pippin init container keeps
its explicit proxy env since it has no envFrom.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
GENERIC_CALCULATOR_ENTRY_POINT: {{ .Values.entry_point }}
GENERIC_ACTIVATE_NFS: "{{ .Values.metrics_activate_nfs }}"
GENERIC_HOST_PATH: {{ .Values.localDataDirectories | join ":" }}
{{- if .Values.http_proxy }}

one is http and the second is https

@roytl roytl left a comment

LGTM

@asafyehezkel asafyehezkel merged commit 57be351 into master Jun 28, 2026
@asafyehezkel asafyehezkel deleted the engine-job-proxy-env branch June 28, 2026 14:44