
feat: hot-reload mechanism for geaflow-infer module. #786

Open
aotenjou wants to merge 6 commits into apache:master from aotenjou:master

Conversation

@aotenjou

What changes were proposed in this pull request?

This pull request introduces a comprehensive hot-reload mechanism for model inference in the Geaflow system. The changes add new configuration keys, extend the Java inference context and data exchange logic, and implement a robust hot-reload workflow in the Python inference session. These improvements allow for dynamic model updates with configurable polling intervals, backoff strategies, and optional warmup, increasing the flexibility and reliability of model deployment.

Hot-reload configuration and integration:

  • Added new configuration keys in FrameworkConfigKeys to control model hot-reload behavior, including model path, version file, polling interval, backoff, warmup, and enable flags.
  • Updated InferContext to read and pass these hot-reload settings as parameters to the inference process, integrating them into the command invocation.
  • Extended InferEnvironmentContext to generate the appropriate command-line arguments for the new hot-reload parameters.
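A minimal sketch of how the new settings might be turned into command-line arguments for the inference process, as `InferEnvironmentContext` is described as doing. The configuration key names and flag names below are assumptions for illustration, not the actual keys added in `FrameworkConfigKeys`:

```python
def build_hot_reload_args(conf):
    """Translate hot-reload settings into CLI arguments (hypothetical names)."""
    if conf.get("infer.model.hot.reload.enable", "false").lower() != "true":
        return []  # feature disabled: pass nothing through
    args = ["--hot-reload-enable"]
    # Each configured key maps to one flag/value pair on the command line.
    for key, flag in [
        ("infer.model.path", "--model-path"),
        ("infer.model.version.file", "--version-file"),
        ("infer.model.poll.interval.ms", "--poll-interval-ms"),
        ("infer.model.reload.backoff.ms", "--reload-backoff-ms"),
        ("infer.model.warmup.enable", "--warmup-enable"),
    ]:
        if key in conf:
            args += [flag, str(conf[key])]
    return args
```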

Resource management improvements:

  • Refactored DataExchangeContext and DataExchangeQueue to use instance-level AtomicBoolean flags for safe resource cleanup, and improved shutdown hook handling to ensure native memory is released exactly once.
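The exactly-once cleanup described above can be sketched as follows. This is a hypothetical Python analogue of the Java change, using a lock-guarded flag in place of an instance-level `AtomicBoolean` and `atexit` in place of a JVM shutdown hook; class and method names are illustrative:

```python
import atexit
import threading

class DataExchangeQueueSketch:
    """Sketch of exactly-once native-resource cleanup (hypothetical analogue)."""

    def __init__(self):
        self._closed = False
        self._lock = threading.Lock()  # stands in for Java's AtomicBoolean
        atexit.register(self.close)    # shutdown-hook analogue: safe because
                                       # close() is idempotent

    def close(self):
        """Release native memory at most once; returns True only on the
        first call, mirroring compareAndSet(false, true)."""
        with self._lock:
            if self._closed:
                return False
            self._closed = True
        self._release_native_memory()
        return True

    def _release_native_memory(self):
        pass  # placeholder for freeing off-heap buffers
```

Guarding the flag flip under a lock means a concurrent explicit `close()` and the shutdown hook cannot both run the release path.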

Python inference session hot-reload logic:

  • Implemented a new hot-reload workflow in inferSession.py using a background thread to monitor the model version manifest, reload models as needed, handle failures with backoff, and optionally perform warmup. This design supports atomic model swaps and robust error handling.

These changes collectively enable dynamic, reliable, and configurable hot-reloading of inference models in Geaflow.

How was this PR tested?

  • Tests have been added for the changes.
  • Verified in a production environment.

…time

Implement blue-green hot swap in TorchInferSession with throttled version polling,
async single-flight loading, warmup-before-switch, and rollback with backoff so
workers can safely adopt newly published models without request interruption.
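The "async single-flight loading" mentioned in this commit can be sketched as follows: while one load is in flight, further reload triggers are coalesced rather than queued, so at most one model load runs at a time. This is an illustrative sketch, not the `TorchInferSession` implementation; all names are hypothetical:

```python
import threading

class SingleFlightLoader:
    """Sketch of single-flight loading: concurrent triggers coalesce."""

    def __init__(self, load_fn):
        self._load_fn = load_fn              # callable: version -> model
        self._in_flight = threading.Lock()   # held while a load is running
        self.result = None

    def trigger(self, version):
        """Start a background load; returns the Thread, or None if a
        load is already in flight (the duplicate trigger is dropped)."""
        if not self._in_flight.acquire(blocking=False):
            return None  # coalesce: someone else is already loading

        def _run():
            try:
                self.result = self._load_fn(version)
            finally:
                self._in_flight.release()  # allow the next load

        t = threading.Thread(target=_run, daemon=True)
        t.start()
        return t
```

Dropping duplicate triggers (rather than queueing them) is what keeps a burst of version-poll ticks from stacking up redundant loads of the same model.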
aotenjou marked this pull request as ready for review on April 17, 2026 09:36