Embedder model router#2232
Conversation
Greptile SummaryAdds local-checkpoint support to the embedder stack: on-disk model directories bypass HF revision pinning via an explicit
|
| Filename | Overview |
|---|---|
| nemo_retriever/src/nemo_retriever/models/inference/processor.py | Import typo: nemo_retriever.model (singular) should be nemo_retriever.models — causes ModuleNotFoundError at runtime on the local-embedder injection path |
| nemo_retriever/src/nemo_retriever/models/init.py | Adds local-checkpoint routing helpers and wires model_arch into create_local_embedder/create_local_query_embedder; _is_local_checkpoint_dir lacks the config.json guard present in hf_model_registry._is_local_model_dir |
| nemo_retriever/src/nemo_retriever/models/hf_model_registry.py | Adds allow_local_path opt-in to get_hf_revision with a config.json guard; clean and well-tested |
| nemo_retriever/tests/test_local_embed_checkpoint.py | Comprehensive new test file covering revision-bypass, text/VL routing, env/arg overrides, and Hub-ID regression guard; fixture isolation is clean |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[create_local_embedder / maybe_inject_local_hf_embedder] --> B[resolve_embed_model_use_vl\nmodel_name, model_arch]
B --> C{_is_local_checkpoint_dir?}
C -- Yes --> D[_resolve_local_embed_arch\nmodel_arch or NRL_LOCAL_EMBED_ARCH]
D -- invalid/missing --> E[ValueError: must declare arch]
D -- vl --> F[use_vl = True]
D -- text --> G[use_vl = False]
C -- No --> H[is_vl_embed_model\nHub allow-list]
H --> F
H --> G
F --> I[VL embedder\nLlamaNemotronEmbedVL1BV2*]
G --> J[Text embedder\nLlamaNemotronEmbed1BV2*]
I --> K[get_hf_revision\nallow_local_path=True]
J --> K
K --> L{_is_local_model_dir\nconfig.json present?}
L -- Yes --> M[return None\nno revision pin]
L -- No + registered Hub ID --> N[return pinned SHA]
L -- No + unregistered --> O[ValueError: no pinned revision]
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
flowchart TD
A[create_local_embedder / maybe_inject_local_hf_embedder] --> B[resolve_embed_model_use_vl\nmodel_name, model_arch]
B --> C{_is_local_checkpoint_dir?}
C -- Yes --> D[_resolve_local_embed_arch\nmodel_arch or NRL_LOCAL_EMBED_ARCH]
D -- invalid/missing --> E[ValueError: must declare arch]
D -- vl --> F[use_vl = True]
D -- text --> G[use_vl = False]
C -- No --> H[is_vl_embed_model\nHub allow-list]
H --> F
H --> G
F --> I[VL embedder\nLlamaNemotronEmbedVL1BV2*]
G --> J[Text embedder\nLlamaNemotronEmbed1BV2*]
I --> K[get_hf_revision\nallow_local_path=True]
J --> K
K --> L{_is_local_model_dir\nconfig.json present?}
L -- Yes --> M[return None\nno revision pin]
L -- No + registered Hub ID --> N[return pinned SHA]
L -- No + unregistered --> O[ValueError: no pinned revision]
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 1
nemo_retriever/src/nemo_retriever/models/inference/processor.py:93-97
The import uses `nemo_retriever.model` (singular), but the package is `nemo_retriever.models` (plural). There is no `nemo_retriever/model.py` module in the repo, so this raises `ModuleNotFoundError` at runtime on every call to `maybe_inject_local_hf_embedder` when no remote endpoint is configured — i.e., the exact case this function is meant to handle.
```suggestion
from nemo_retriever.models import (
create_local_embedder,
resolve_embed_model,
resolve_embed_model_use_vl,
)
```
Reviews (3): Last reviewed commit: "Add local embed checkpoint routing" | Re-trigger Greptile
| @@ -59,6 +60,36 @@ def is_vl_rerank_model(model_name: str | None) -> bool: | |||
| return (model_name or "") in _VL_RERANK_MODEL_IDS | |||
|
|
|||
|
|
|||
| LOCAL_EMBED_ARCH_ENV = "NRL_LOCAL_EMBED_ARCH" | |||
| _VALID_LOCAL_EMBED_ARCHS = frozenset({"vl", "text"}) | |||
There was a problem hiding this comment.
Why is this still here? Shouldn't this go away if we accept universal?
There was a problem hiding this comment.
seems like these might be architectures? Please elaborate, on why this distinction would still be required.
There was a problem hiding this comment.
For this implementation I wasn't planning on accepting universal embedding models for now but rather focusing on either vl or text that nvidia already has on huggingface.
We would need this explicit distinction in the case to avoid routing a VL checkpoint through the text embedder and vice versa.
3e6a91f to
21b9df8
Compare
| from nemo_retriever.model import ( | ||
| create_local_embedder, | ||
| resolve_embed_model, | ||
| resolve_embed_model_use_vl, | ||
| ) |
There was a problem hiding this comment.
The import uses
nemo_retriever.model (singular), but the package is nemo_retriever.models (plural). There is no nemo_retriever/model.py module in the repo, so this raises ModuleNotFoundError at runtime on every call to maybe_inject_local_hf_embedder when no remote endpoint is configured — i.e., the exact case this function is meant to handle.
| from nemo_retriever.model import ( | |
| create_local_embedder, | |
| resolve_embed_model, | |
| resolve_embed_model_use_vl, | |
| ) | |
| from nemo_retriever.models import ( | |
| create_local_embedder, | |
| resolve_embed_model, | |
| resolve_embed_model_use_vl, | |
| ) |
Prompt To Fix With AI
This is a comment left during a code review.
Path: nemo_retriever/src/nemo_retriever/models/inference/processor.py
Line: 93-97
Comment:
The import uses `nemo_retriever.model` (singular), but the package is `nemo_retriever.models` (plural). There is no `nemo_retriever/model.py` module in the repo, so this raises `ModuleNotFoundError` at runtime on every call to `maybe_inject_local_hf_embedder` when no remote endpoint is configured — i.e., the exact case this function is meant to handle.
```suggestion
from nemo_retriever.models import (
create_local_embedder,
resolve_embed_model,
resolve_embed_model_use_vl,
)
```
How can I resolve this? If you propose a fix, please make it concise.
Description
Adds local checkpoint support for embedders by allowing local model directories to bypass HF revision pinning only through an explicit allow_local_path=True opt-in.
Routes local checkpoints through the existing NRL text or VL embedder paths using
NRL_LOCAL_EMBED_ARCH=vl|text, without adding a new operator or changing extraction/VDB/eval architecture.Centralizes text vs VL routing in a shared helper so the factory and text embedding processor stay in sync.
Adds tests covering local checkpoint revision behavior, text/VL routing, env/arg overrides, and unchanged registered Hub ID behavior.
Checklist