feat: add Hugging Face fetch timeout flag #33

Open
rupayon123 wants to merge 5 commits into pipe1os:main from rupayon123:codex/add-hf-timeout-flag

Conversation

@rupayon123 rupayon123 commented Jun 20, 2026

Contributor

Adds a --timeout flag for remote Hugging Face fetches. The default stays at 10 seconds, and the value is passed through the metadata, config, index, shard header, and HEAD requests.

I also added CLI coverage for the default/custom timeout values, invalid values, and the remote analysis path passing the timeout into the Hugging Face parser.

I couldn't run the test suite from this environment, so please treat the tests as unverified locally.

Closes #27

Summary by CodeRabbit

Release Notes

  • New Features

    • Added a --timeout CLI option to control the timeout for remote Hugging Face model downloads.
    • Timeout defaults to 10.0 seconds and must be a strictly positive value.
  • Tests

    • Expanded test coverage for --timeout, including default behavior, valid float handling, and rejection of invalid values.
    • Added an integration-style test to confirm the timeout is passed through to Hugging Face fetching.

@coderabbitai

coderabbitai Bot commented Jun 20, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: ff287b37-d221-48d7-a20d-ba19926f763b

📥 Commits

Reviewing files that changed from the base of the PR and between bd6ce16 and 539fe49.

📒 Files selected for processing (2)
  • src/modelinfo/cli.py
  • tests/test_cli.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/test_cli.py
  • src/modelinfo/cli.py

Walkthrough

Adds a --timeout CLI option (default 10.0, validated as a positive float) to the modelinfo CLI. The value is threaded through analyze_model into fetch_huggingface_repo, which propagates it to _make_request and _fetch_safetensors_header, replacing all hardcoded 10-second timeouts in urllib calls. Tests cover argument parsing and end-to-end forwarding.
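
The threading described in the walkthrough can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's code: the function names are taken from the walkthrough, but the bodies, the positional `model` argument, and the return value are hypothetical, and the fetch function is a stand-in that performs no network I/O. Validation of the value is left out to keep the focus on how it is passed along.

```python
import argparse

DEFAULT_TIMEOUT = 10.0  # default stated in the PR description


def parse_args(argv=None):
    # Hypothetical parser: only the --timeout option mirrors the PR.
    parser = argparse.ArgumentParser(prog="modelinfo")
    parser.add_argument("model")
    parser.add_argument(
        "--timeout",
        type=float,
        default=DEFAULT_TIMEOUT,
        help="timeout in seconds for remote Hugging Face fetches",
    )
    return parser.parse_args(argv)


def fetch_huggingface_repo(repo_id, timeout=DEFAULT_TIMEOUT):
    # Stand-in: a real implementation would hand `timeout` to every
    # urlopen call. Here it just reports what it received.
    return {"repo_id": repo_id, "timeout": timeout}


def analyze_model(model, timeout=DEFAULT_TIMEOUT):
    # The value is forwarded unchanged instead of being hardcoded.
    return fetch_huggingface_repo(model, timeout=timeout)


def main(argv=None):
    args = parse_args(argv)
    return analyze_model(args.model, timeout=args.timeout)
```

Keeping the same default on every layer means callers that never pass `timeout` see the previous 10-second behavior.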

Changes

--timeout flag for remote Hugging Face fetches

| Layer | File(s) | Summary |
| --- | --- | --- |
| CLI argument, validation, and analyze_model wiring | src/modelinfo/cli.py | Adds _positive_float validator, registers --timeout (default 10.0) in parse_args, extends analyze_model signature with timeout: float = 10.0, passes it to fetch_huggingface_repo, and wires args.timeout into both single-model and multi-model call sites in main(). |
| Timeout propagation through huggingface.py network calls | src/modelinfo/parsers/huggingface.py | Adds timeout parameter to _make_request (forwarded to urlopen), _fetch_safetensors_header (including the 416-retry path), and all request sites in fetch_huggingface_repo: API metadata, config.json, sharded index, per-shard header closures, and single-file HEAD + header fetch. |
| CLI parsing and forwarding tests | tests/test_cli.py | Adds four --timeout parse tests (default, float, zero-rejection, negative-rejection) and one integration test that monkeypatches fetch_huggingface_repo to assert the timeout value is correctly forwarded by analyze_model. |
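
The forwarding check described above can also be done one level lower, at the `urlopen` call itself. The sketch below is an assumption about how such a test might look, not the test added in this PR: `make_request` is a hypothetical helper, the URL is a placeholder, and `unittest.mock` replaces `urllib.request.urlopen` so nothing touches the network.

```python
import urllib.request
from unittest import mock


def make_request(url, timeout=10.0):
    """Hypothetical helper: fetch a URL, forwarding `timeout` to urlopen."""
    req = urllib.request.Request(url)
    with urllib.request.urlopen(req, timeout=timeout) as response:
        return response.read()


def check_timeout_forwarded():
    # Fake response object usable as a context manager.
    fake_response = mock.MagicMock()
    fake_response.__enter__.return_value.read.return_value = b"{}"

    with mock.patch(
        "urllib.request.urlopen", return_value=fake_response
    ) as fake_urlopen:
        body = make_request("https://example.invalid/config.json", timeout=3.5)

    # Inspect the keyword arguments of the single recorded call.
    _, kwargs = fake_urlopen.call_args
    return body, kwargs["timeout"]
```

Calling `check_timeout_forwarded()` returns the fake body together with the timeout that reached `urlopen`, which a test can compare against the value it passed in.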

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 4.55% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'feat: add Hugging Face fetch timeout flag' clearly summarizes the main change: adding a timeout flag for Hugging Face operations. |
| Description check | ✅ Passed | The description provides a clear summary, motivation (linking to issue #27), change type, and test coverage details. It addresses most required template sections except explicit checklist items. |
| Linked Issues check | ✅ Passed | The PR fully implements issue #27 requirements: adds --timeout argument with 10.0 default, threads it through analyze_model to fetch_huggingface_repo, replaces hardcoded timeout with parameter. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to implementing the --timeout feature. No unrelated modifications to other functionality detected across cli.py, huggingface.py parser, and test files. |



      req = urllib.request.Request(url, headers=headers)
      try:
-         with urllib.request.urlopen(req, timeout=10) as response:
+         with urllib.request.urlopen(req, timeout=timeout) as response:

Potential user input in HTTP request may allow SSRF attack - medium severity
If an attacker can control the URL input leading into this HTTP request, the attacker might be able to perform an SSRF attack. This kind of attack is even more dangerous if the application returns the response of the request to the user. It could allow them to retrieve information from higher privileged services within the network (such as the metadata service, which is commonly available in cloud services, and could allow them to retrieve credentials).

Remediation: If possible, only allow requests to allowlisted domains. If not, consult the linked article to learn about other mitigating techniques such as disabling redirects, blocking private IPs and making sure private services have internal authentication. If you return data coming from the request to the user, validate the data before returning it to make sure you don't return arbitrary data.
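
As a rough illustration of the allowlisting remediation, a host check could run before the request is built. This is a sketch only: `ALLOWED_HOSTS` and `is_allowed_url` are hypothetical names, and treating `huggingface.co` as the single permitted host is an assumption, not something this PR or the scanner specifies.

```python
from urllib.parse import urlsplit

# Assumption: the tool only ever needs to talk to this host.
ALLOWED_HOSTS = frozenset({"huggingface.co"})


def is_allowed_url(url, allowed_hosts=ALLOWED_HOSTS):
    """Return True only for https URLs whose host is on the allowlist."""
    parts = urlsplit(url)
    # `hostname` is the lowercased host with userinfo and port removed,
    # so "https://huggingface.co@10.0.0.1/" resolves to "10.0.0.1".
    return parts.scheme == "https" and parts.hostname in allowed_hosts
```

A check like this covers only the initial URL. `urllib.request.urlopen` follows redirects by default, so a redirect to another host would need separate handling.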

Comment thread src/modelinfo/cli.py Outdated

def _positive_float(value: str) -> float:
    fvalue = float(value)
    if fvalue <= 0:

_positive_float allows NaN because fvalue <= 0 is false for NaN, so invalid --timeout nan passes parsing and can break request timeout handling.

Details

✨ AI Reasoning
The new timeout validator is meant to enforce a strictly positive value, but it only checks whether the parsed float is less than or equal to zero. A NaN value bypasses that condition because NaN is neither <= 0 nor > 0. That means an invalid timeout can pass argument parsing and propagate into request logic, where timeout handling may raise runtime errors. This is a control-flow validation bug in the new logic.

🔧 How do I fix it?
Trace execution paths carefully. Ensure precondition checks happen before using values, validate ranges before checking impossible conditions, and don't check for states that the code has already ruled out.
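
The comparison behavior behind this finding is easy to reproduce in isolation. The snippet below is a standalone demonstration with hypothetical helper names. It shows why a `<= 0` rejection test lets NaN through, and that inverting the test to `not (x > 0)` catches NaN but still accepts positive infinity.

```python
nan = float("nan")

# Every ordered comparison involving NaN evaluates to False.
nan_le_zero = nan <= 0
nan_gt_zero = nan > 0


def rejected_by_le_check(value):
    # Mirrors the shape of the check quoted in the review: reject when <= 0.
    return float(value) <= 0


def rejected_by_inverted_check(value):
    # Reject unless the value is provably greater than zero.
    return not (float(value) > 0)
```

With these definitions, `rejected_by_le_check("nan")` is `False`, so NaN is accepted, while `rejected_by_inverted_check("nan")` is `True`. Neither variant rejects `"inf"`.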

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/modelinfo/cli.py`:
- Around line 44-48: The _positive_float function in src/modelinfo/cli.py
currently accepts non-finite values like nan and inf, which can cause issues
downstream. Add a check after converting the string to float using
math.isfinite() to validate that the value is finite, and raise
argparse.ArgumentTypeError with an appropriate message if it is not (for
example, "timeout must be a finite number"). This validation should occur
alongside the existing check for positive values to ensure all invalid timeout
values are rejected during argument parsing.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: f2f3d51d-d292-418f-9b0a-bd62151f8f38

📥 Commits

Reviewing files that changed from the base of the PR and between 99a7a5f and bd6ce16.

📒 Files selected for processing (3)
  • src/modelinfo/cli.py
  • src/modelinfo/parsers/huggingface.py
  • tests/test_cli.py

Comment thread src/modelinfo/cli.py

Development

Successfully merging this pull request may close these issues.

[Feature]: Expose a --timeout flag for remote Hub fetching

1 participant