fix vl mem leak #1645

Open
hhaAndroid wants to merge 3 commits into InternLM:main from hhaAndroid:fix_vl_mem_leak

Conversation

Collaborator

@hhaAndroid hhaAndroid commented Mar 31, 2026

This PR fixes memory leaks in VL (vision-language) data loading by:

  • Using context managers for Image.open() and io.BytesIO() to ensure file handles are closed promptly
  • Adding a trim_memory() helper that calls malloc_trim(0) to return freed heap pages to the OS after processing multimodal/video items
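The two changes can be sketched roughly as follows. This is a minimal sketch, not the PR's actual diff; the load_image helper name and the libc.so.6 lookup path are assumptions for illustration:

```python
import ctypes
import io


def load_image(raw_bytes: bytes):
    # Lazy import so the sketch stays self-contained; the PR targets
    # environments where Pillow is already installed.
    from PIL import Image

    # Context managers close the BytesIO buffer and the file handle held
    # by Image.open() as soon as decoding finishes, instead of waiting
    # for garbage collection to release them.
    with io.BytesIO(raw_bytes) as buffer:
        with Image.open(buffer) as img:
            # Force a full decode before the buffer closes; Image.open()
            # is lazy and would otherwise read from a closed buffer later.
            return img.convert("RGB")


def trim_memory() -> None:
    # glibc-only: malloc_trim(0) returns freed heap pages to the OS.
    # On platforms without it (macOS, musl libc) this degrades to a no-op.
    try:
        ctypes.CDLL("libc.so.6").malloc_trim(0)
    except (OSError, AttributeError):
        pass
```

Without the eager convert/decode inside the `with` block, Pillow's lazy loading would try to read from an already-closed buffer later, which is why the PR pairs the context managers with an immediate decode.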

Collaborator Author

@claude review

Contributor

claude bot commented Mar 31, 2026

## Summary

This PR fixes memory leaks in VL (vision-language) data loading by:

  1. Using context managers for Image.open() and io.BytesIO() to ensure file handles are closed promptly
  2. Adding a trim_memory() helper that calls malloc_trim(0) to return freed heap pages to the OS after processing multimodal/video items

The resource cleanup changes (context managers) are correct and good practice. A few concerns:

## Issues

Warning

  • Incorrect type hint on trim_memory: The logger parameter is typed as logging.Logger but get_logger() returns a loguru.Logger. This will fail strict mypy checks.
  • Inverted dependency: trim_memory is a general utility but lives in qwen3_vl_utils.py. The base class module (base_mllm_tokenize_fn.py) should not import from a model-specific utils module. Consider moving it to a shared utils location.
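One way to address the type-hint concern, assuming get_logger() returns a loguru logger, is a TYPE_CHECKING guard so strict mypy sees the real type without the module importing loguru at runtime. A sketch, not the PR's code:

```python
from __future__ import annotations

import ctypes
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Imported only for static type checking, so strict mypy resolves
    # the real loguru type without a hard runtime dependency here.
    from loguru import Logger


def trim_memory(logger: Logger | None = None) -> None:
    # Best-effort malloc_trim; glibc-only, so fail quietly elsewhere.
    try:
        ctypes.CDLL("libc.so.6").malloc_trim(0)
    except (OSError, AttributeError) as exc:
        if logger is not None:
            logger.warning("malloc_trim unavailable: {}", exc)
```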

## Nit

  • Per-item malloc_trim overhead: trim_memory is called on every multimodal/video item. Worth benchmarking whether this syscall overhead matters at scale, or whether periodic trimming would be more efficient.
  • PR body is empty — a brief description of the root cause and fix would help reviewers.

## Verdict

COMMENT — the core fix (context managers) is solid. The trim_memory utility needs the type hint fix and should ideally live in a shared module.

_TRIM_MEMORY_WARNED = False


def trim_memory(logger: logging.Logger | None = None):
Collaborator

Having logger as a function argument seems weird; it reads as if we want to trim memory induced by the logger itself. Under what circumstances do we need a logger other than the global loguru logger returned by xtuner.v1.utils.get_logger?

data_name: str | None = None,
llm_pack_weight: float = 1.0,
visual_pack_weight: float = 0.0,
trim_memory_step: int = 1,
Collaborator

Might be better if we stick to the trainer naming convention: use ..._step as a counter, and ..._interval for specifying how often something is done.
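Following that convention, the periodic trimming suggested in the review could look like this. Hypothetical sketch under the naming suggestion above; MultimodalPacker, process_item, and trim_memory_interval are illustrative names, not the PR's API:

```python
import ctypes


def trim_memory() -> None:
    # Best-effort malloc_trim(0); no-op where glibc is unavailable.
    try:
        ctypes.CDLL("libc.so.6").malloc_trim(0)
    except (OSError, AttributeError):
        pass


class MultimodalPacker:
    def __init__(self, trim_memory_interval: int = 100):
        # "..._interval" specifies frequency; the running "..._step"
        # counter is internal state, per the trainer naming convention.
        self.trim_memory_interval = trim_memory_interval
        self._trim_memory_step = 0

    def process_item(self, item):
        # ... decode / pack the multimodal item here ...
        self._trim_memory_step += 1
        # Trim once per interval instead of after every item, amortizing
        # the syscall overhead raised in the review.
        if self._trim_memory_step % self.trim_memory_interval == 0:
            trim_memory()
        return item
```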
