
Add a dedicated OpenAI-compatible LLM adapter#1895

Open
jimmyzhuu wants to merge 5 commits into Zipstack:main from jimmyzhuu:codex/openai-compatible-llm-adapter

Conversation

@jimmyzhuu

Summary

This PR adds a dedicated OpenAI Compatible LLM adapter for OpenAI-style chat completion endpoints that are not the official OpenAI service.

The implementation is intentionally small in scope:

  • adds a new OpenAI Compatible LLM adapter backed by LiteLLM's custom_openai path
  • keeps the existing OpenAI adapter unchanged
  • adds adapter registration and JSON schema
  • adds focused tests for registration, model normalization, schema loading, and usage recording
  • documents the new adapter in the README

Why

Users may already have access to OpenAI-compatible endpoints behind a private gateway or third-party provider, but the current OpenAI adapter is specifically shaped around official OpenAI semantics.

Using a separate adapter keeps those semantics explicit and avoids broadening the meaning of the existing OpenAI adapter.

Refs #1894
Refs #856
Refs #1443

Scope

This PR is limited to:

  • LLM only
  • no embedding changes
  • no x2text / OCR changes
  • no changes to the existing OpenAI adapter behavior

Notes

For custom models that are not mapped in LiteLLM's token metadata, usage recording now falls back gracefully instead of failing when estimating prompt tokens.

This keeps the adapter usable for custom OpenAI-compatible endpoints without changing pricing semantics in this PR.
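The fallback described above can be sketched as follows; the helper name and signature are illustrative, not the SDK's actual internals:

```python
import logging
from typing import Callable, Mapping, Optional

logger = logging.getLogger(__name__)


def resolve_prompt_tokens(
    usage: Optional[Mapping[str, int]],
    count_tokens: Callable[[], int],
) -> int:
    """Prefer the provider-reported prompt_tokens; otherwise call the
    estimator, and fall back to 0 if estimation fails (e.g. a model
    missing from LiteLLM's token metadata)."""
    usage_data = usage or {}
    prompt_tokens = usage_data.get("prompt_tokens")
    if prompt_tokens is None:
        try:
            prompt_tokens = count_tokens()
        except Exception as e:
            prompt_tokens = 0
            logger.warning("Failed to estimate prompt tokens: %s", e)
    return prompt_tokens
```

When the provider already reports usage, the estimator is never invoked, so unmapped custom models no longer surface estimation errors in the usage path.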

Validation

  • UV_SKIP_WHEEL_FILENAME_CHECK=1 uv run pytest tests/test_openai_compatible_adapter.py
  • UV_SKIP_WHEEL_FILENAME_CHECK=1 uv run ruff check src/unstract/sdk1/adapters/base1.py src/unstract/sdk1/adapters/llm1/__init__.py src/unstract/sdk1/adapters/llm1/openai_compatible.py src/unstract/sdk1/llm.py tests/test_openai_compatible_adapter.py

@CLAassistant

CLAassistant commented Apr 4, 2026

CLA assistant check
All committers have signed the CLA.

@coderabbitai
Contributor

coderabbitai bot commented Apr 4, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c35a8b53-2cee-4f8f-8bc2-54148529da44

📥 Commits

Reviewing files that changed from the base of the PR and between 5090773 and d3e1cad.

📒 Files selected for processing (3)
  • unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/custom_openai.json
  • unstract/sdk1/src/unstract/sdk1/llm.py
  • unstract/sdk1/tests/test_openai_compatible_adapter.py
✅ Files skipped from review due to trivial changes (1)
  • unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/custom_openai.json
🚧 Files skipped from review as they are similar to previous changes (2)
  • unstract/sdk1/src/unstract/sdk1/llm.py
  • unstract/sdk1/tests/test_openai_compatible_adapter.py

Summary by CodeRabbit

  • New Features

    • Added comprehensive support for OpenAI-compatible LLM providers, enabling flexible integration with OpenAI API-compatible services through customizable API endpoints and authentication credentials.
  • Bug Fixes

    • Enhanced error resilience for token counting operations to gracefully handle edge cases with unmapped models, ensuring usage tracking continues without interruption.
  • Documentation

    • Restructured LLM providers reference table in README.

Walkthrough

Adds support for OpenAI-compatible LLM providers: new OpenAICompatibleLLMAdapter and OpenAICompatibleLLMParameters, a JSON configuration schema, README provider-table updates, safer token-counting in usage recording, and tests for registration, validation, schema, and usage behavior.

Changes

Cohort / File(s): Summary

  • Documentation (README.md): Updated the "LLM Providers" table: replaced left-column entries with "OpenAI Compatible", reordered right-column providers, and removed one right-column provider cell, leaving a blank status cell.
  • Parameter Validation (unstract/sdk1/src/unstract/sdk1/adapters/base1.py): Added OpenAICompatibleLLMParameters with api_key, api_base, validate(), and validate_model() to normalize model names to custom_openai/{model}.
  • Adapter Implementation (unstract/sdk1/src/unstract/sdk1/adapters/llm1/openai_compatible.py, unstract/sdk1/src/unstract/sdk1/adapters/llm1/__init__.py): Added OpenAICompatibleLLMAdapter with static ID/metadata, provider custom_openai, and an icon, and exported it via the package __all__.
  • Configuration Schema (unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/custom_openai.json): Added a JSON schema defining required adapter_name and api_base, optional api_key, model (default gpt-4o-mini), max_tokens, max_retries, and timeout.
  • Usage / Error Handling (unstract/sdk1/src/unstract/sdk1/llm.py): Changed _record_usage to read prompt_tokens from provider usage when present; token_counter() is called only if it is missing, and exceptions are caught (prompt_tokens set to 0 with a warning logged) so the usage push proceeds.
  • Tests (unstract/sdk1/tests/test_openai_compatible_adapter.py): Added tests covering adapter registration, parameter validation/model normalization, schema contents, and usage-recording behavior, including token-counter failure handling.
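The model normalization described in the Parameter Validation entry above can be sketched as a standalone function (hypothetical; in the SDK this logic lives inside OpenAICompatibleLLMParameters.validate_model):

```python
CUSTOM_OPENAI_PREFIX = "custom_openai/"


def normalize_model(model: str) -> str:
    """Route a model name through LiteLLM's custom_openai provider,
    guarding against double-prefixing when the caller already supplied
    the prefixed form."""
    if model.startswith(CUSTOM_OPENAI_PREFIX):
        return model
    return CUSTOM_OPENAI_PREFIX + model
```

Both "gpt-4o-mini" and "custom_openai/gpt-4o-mini" normalize to the same route, which is the idempotence the tests exercise.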

Sequence Diagram

sequenceDiagram
    participant App as Application
    participant Adapter as OpenAI Compatible Adapter
    participant Validator as Parameter Validator
    participant LLM as LLM Instance
    participant TokenCounter as Token Counter
    participant Audit as Audit System

    App->>Adapter: create/instantiate with config
    Adapter->>Validator: validate(adapter_metadata)
    Validator->>Validator: normalize model -> custom_openai/{model}
    Validator-->>Adapter: return validated params
    App->>LLM: trigger _record_usage(usage_data)
    alt usage_data contains prompt_tokens
        LLM-->>TokenCounter: (skipped)
    else usage_data missing prompt_tokens
        LLM->>TokenCounter: token_counter(model, messages)
        alt TokenCounter succeeds
            TokenCounter-->>LLM: prompt token count
        else TokenCounter raises
            TokenCounter-->>LLM: exception
            LLM->>LLM: catch, set prompt_tokens = 0, log warning
        end
    end
    LLM->>Audit: push_usage_data(with prompt_tokens)
    Audit-->>LLM: acknowledge

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

  • Description check: ⚠️ Warning. The PR description covers most required sections: Summary, Why, and Scope are well-documented. However, critical sections like 'Can this PR break any existing features' and 'Notes on Testing' from the template are missing. Resolution: add an explicit 'Can this PR break existing features' section (confirming no impact to the existing OpenAI adapter) and expand 'Notes on Testing' with test command details and expected outcomes.
  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 0.00%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (1 passed)
  • Title check: ✅ Passed. The title 'Add a dedicated OpenAI-compatible LLM adapter' accurately and concisely summarizes the main change: introducing a new OpenAI-compatible LLM adapter while keeping the existing OpenAI adapter unchanged.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@greptile-apps
Contributor

greptile-apps bot commented Apr 4, 2026

Greptile Summary

This PR introduces a dedicated OpenAICompatibleLLMAdapter backed by LiteLLM's custom_openai path, allowing users to connect to OpenAI-style chat-completion endpoints that are not the official OpenAI service. It also adds a graceful fallback in _record_usage so that token_counter failures for unmapped custom models no longer surface as exceptions.

  • New adapter (openai_compatible.py, base1.py): Clean implementation following the existing class hierarchy (BaseChatCompletionParameters → OpenAICompatibleLLMParameters → OpenAICompatibleLLMAdapter). The validate_model method correctly guards against double-prefixing with custom_openai/.
  • JSON schema (custom_openai.json): api_key is rightly optional (some private gateways do not require one); model ships with a sensible gpt-4o-mini default and the schema is well-formed.
  • _record_usage change (llm.py): The fallback path when token_counter raises is correct; usage data is still pushed to Audit with prompt_tokens=0 when estimation is impossible. The final usage_data.get("prompt_tokens", prompt_tokens or 0) expression is redundant (see inline comment) but functionally safe.
  • Tests: Cover registration, model-prefix normalisation, schema loading, and both branches of the new token_counter fallback logic. Well-structured with the previously-flagged concerns addressed (JSON parsed instead of raw-string matched, magic stub explained).
  • Minor: _load_llm_class() is defined but never called in the test file (dead code), and the usage_data.get in _record_usage can be simplified to prompt_tokens or 0 — both are P2 style items only.

Confidence Score: 5/5

Safe to merge — no P0 or P1 issues found; all remaining findings are style-level P2.

The new adapter follows existing codebase patterns exactly, the JSON schema and Pydantic model are consistent, the _record_usage fallback is logically correct, and the test suite covers the key behaviours. Only minor dead code and a redundant dict lookup were found.

No files require special attention. test_openai_compatible_adapter.py has a small dead-code item (_load_llm_class), and llm.py has a trivially redundant expression that could be cleaned up.

Important Files Changed

Filename: Overview

  • unstract/sdk1/src/unstract/sdk1/adapters/llm1/openai_compatible.py: New adapter class; clean implementation following existing patterns, with all required static methods present.
  • unstract/sdk1/src/unstract/sdk1/adapters/base1.py: Adds OpenAICompatibleLLMParameters with the correct custom_openai/ prefix guard and optional api_key.
  • unstract/sdk1/src/unstract/sdk1/llm.py: Adds a graceful token_counter fallback in _record_usage; the logic is correct but the final usage_data.get call is redundant.
  • unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/custom_openai.json: Well-formed schema; api_key correctly typed as nullable, and model has a sensible gpt-4o-mini default.
  • unstract/sdk1/tests/test_openai_compatible_adapter.py: Good coverage of registration, prefix normalisation, schema loading, and the usage-recording fallback; _load_llm_class() is dead code.
  • unstract/sdk1/src/unstract/sdk1/adapters/llm1/__init__.py: Import and __all__ entry added correctly for the new adapter.
  • README.md: Table updated to include OpenAI Compatible; the last row now has an empty right cell due to the odd provider count.

Class Diagram

%%{init: {'theme': 'neutral'}}%%
classDiagram
    class BaseAdapter {
        <<abstract>>
        +get_id()* str
        +get_name()* str
        +get_description()* str
        +get_provider()* str
        +get_icon()* str
        +get_json_schema() str
        +get_adapter_type()* AdapterTypes
    }
    class BaseChatCompletionParameters {
        +model: str
        +temperature: float
        +n: int
        +timeout: float
        +max_tokens: int
        +max_retries: int
        +validate()* dict
        +validate_model()* str
    }
    class OpenAICompatibleLLMParameters {
        +api_key: str | None
        +api_base: str
        +validate(adapter_metadata) dict
        +validate_model(adapter_metadata) str
    }
    class OpenAICompatibleLLMAdapter {
        +get_id() str
        +get_name() str
        +get_provider() str
        +get_adapter_type() AdapterTypes
    }
    class OpenAILLMParameters {
        +api_key: str
        +api_base: str
        +api_version: str
    }
    BaseChatCompletionParameters --|> BaseAdapter : inherits
    OpenAICompatibleLLMParameters --|> BaseChatCompletionParameters : inherits
    OpenAICompatibleLLMAdapter --|> OpenAICompatibleLLMParameters : inherits
    OpenAICompatibleLLMAdapter --|> BaseAdapter : inherits
    OpenAILLMParameters --|> BaseChatCompletionParameters : inherits
Prompt To Fix All With AI
This is a comment left during a code review.
Path: unstract/sdk1/tests/test_openai_compatible_adapter.py
Line: 28-29

Comment:
**`_load_llm_class()` is unused dead code**

This helper is defined but never called — every test function reaches into `_load_llm_module().LLM` directly. If it was meant to replace those inline accesses, the tests should use it; otherwise it can be removed to avoid confusion for future readers.

```suggestion

```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: unstract/sdk1/src/unstract/sdk1/llm.py
Line: 556-557

Comment:
**Redundant dict lookup obscures intent**

After the `if prompt_tokens is None:` block, `prompt_tokens` is already guaranteed to be non-`None` (either from `usage_data`, the `token_counter` result, or the `= 0` fallback). Re-fetching from `usage_data` with a default is therefore redundant and slightly obscures the logic.

Simplifying to the local variable makes the intent clearer:

```suggestion
        all_tokens = TokenCounterCompat(
            prompt_tokens=prompt_tokens or 0,
```

How can I resolve this? If you propose a fix, please make it concise.


Reviews (2): Last reviewed commit: "[pre-commit.ci] auto fixes from pre-comm..."

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
unstract/sdk1/src/unstract/sdk1/llm.py (1)

542-557: Avoid unconditional token estimation when usage already includes prompt tokens.

This currently computes token_counter() even when provider usage already has prompt tokens, which can create repeated warnings/noise for unmapped models without improving recorded usage.

♻️ Proposed refinement
-        try:
-            prompt_tokens = token_counter(model=model, messages=messages)
-        except Exception as e:
-            prompt_tokens = 0
-            logger.warning(
-                "[sdk1][LLM][%s][%s] Failed to estimate prompt tokens: %s",
-                model,
-                llm_api,
-                e,
-            )
         usage_data: Mapping[str, int] = usage or {}
+        prompt_tokens = usage_data.get("prompt_tokens")
+        if prompt_tokens is None:
+            try:
+                prompt_tokens = token_counter(model=model, messages=messages)
+            except Exception as e:
+                prompt_tokens = 0
+                logger.warning(
+                    "[sdk1][LLM][%s][%s] Failed to estimate prompt tokens: %s",
+                    model,
+                    llm_api,
+                    e,
+                )
         all_tokens = TokenCounterCompat(
-            prompt_tokens=usage_data.get("prompt_tokens", 0),
+            prompt_tokens=usage_data.get("prompt_tokens", prompt_tokens or 0),
             completion_tokens=usage_data.get("completion_tokens", 0),
             total_tokens=usage_data.get("total_tokens", 0),
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@unstract/sdk1/src/unstract/sdk1/llm.py` around lines 542 - 557, The code
unconditionally calls token_counter(model, messages) even when usage already
contains prompt token counts; change the logic in the block around token_counter
and TokenCounterCompat so you first check usage (usage_data = usage or {}) and
if usage_data.get("prompt_tokens") is present use that value for prompt_tokens
instead of calling token_counter; only call token_counter(model, messages)
inside the try/except when usage_data lacks prompt_tokens, preserving the
existing exception handling and the logger.warning path, and then construct
TokenCounterCompat using the values from usage_data (falling back to the
estimated prompt_tokens when used).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/custom_openai.json`:
- Around line 15-20: The schema for the "api_key" property currently only allows
a string which fails when runtime metadata contains null; update the "api_key"
entry in the JSON schema (the "api_key" property in custom_openai.json) to
permit null values by changing its type to accept both string and null (or add a
nullable:true equivalent) so stored configs with null pass validation and
editing flows.
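The suggested fix amounts to widening the property's type so that null is accepted alongside string. A minimal sketch, with assumed before/after schema fragments rather than the actual file contents:

```python
# Hypothetical "api_key" property fragments from custom_openai.json.
before = {"type": "string"}            # rejects stored configs with api_key=null
after = {"type": ["string", "null"]}   # null now passes validation

def accepts(schema: dict, value) -> bool:
    """Minimal type check for this one property (a stand-in for a full
    JSON-schema validator)."""
    types = schema["type"]
    if isinstance(types, str):
        types = [types]
    if value is None:
        return "null" in types
    return isinstance(value, str) and "string" in types
```

Private gateways that require no key can then store api_key as null without tripping schema validation during editing flows.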

---

Nitpick comments:
In `@unstract/sdk1/src/unstract/sdk1/llm.py`:
- Around line 542-557: The code unconditionally calls token_counter(model,
messages) even when usage already contains prompt token counts; change the logic
in the block around token_counter and TokenCounterCompat so you first check
usage (usage_data = usage or {}) and if usage_data.get("prompt_tokens") is
present use that value for prompt_tokens instead of calling token_counter; only
call token_counter(model, messages) inside the try/except when usage_data lacks
prompt_tokens, preserving the existing exception handling and the logger.warning
path, and then construct TokenCounterCompat using the values from usage_data
(falling back to the estimated prompt_tokens when used).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bf841637-54b7-4802-9156-7f56e899ca54

📥 Commits

Reviewing files that changed from the base of the PR and between 7983c98 and 5090773.

📒 Files selected for processing (7)
  • README.md
  • unstract/sdk1/src/unstract/sdk1/adapters/base1.py
  • unstract/sdk1/src/unstract/sdk1/adapters/llm1/__init__.py
  • unstract/sdk1/src/unstract/sdk1/adapters/llm1/openai_compatible.py
  • unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/custom_openai.json
  • unstract/sdk1/src/unstract/sdk1/llm.py
  • unstract/sdk1/tests/test_openai_compatible_adapter.py

@jimmyzhuu
Author

Addressed the review follow-ups.

  • allowed api_key = null in the OpenAI-compatible schema
  • avoided redundant token_counter() calls when provider usage already includes prompt_tokens
  • tightened the tests around both cases
  • added ERNIE / Baidu Qianfan as schema examples while keeping the adapter generic

Validation re-run:

  • UV_SKIP_WHEEL_FILENAME_CHECK=1 uv run pytest tests/test_openai_compatible_adapter.py
  • UV_SKIP_WHEEL_FILENAME_CHECK=1 uv run ruff check src/unstract/sdk1/llm.py tests/test_openai_compatible_adapter.py

@sonarqubecloud

sonarqubecloud bot commented Apr 4, 2026
