
Add a dedicated OpenAI-compatible LLM adapter#1895

Open
jimmyzhuu wants to merge 5 commits into Zipstack:main from jimmyzhuu:codex/openai-compatible-llm-adapter

Conversation

@jimmyzhuu

Summary

This PR adds a dedicated OpenAI Compatible LLM adapter for OpenAI-style chat completion endpoints that are not the official OpenAI service.

The implementation is intentionally small in scope:

  • adds a new OpenAI Compatible LLM adapter backed by LiteLLM's custom_openai path
  • keeps the existing OpenAI adapter unchanged
  • adds adapter registration and JSON schema
  • adds focused tests for registration, model normalization, schema loading, and usage recording
  • documents the new adapter in the README

Why

Users may already have access to OpenAI-compatible endpoints behind a private gateway or third-party provider, but the current OpenAI adapter is specifically shaped around official OpenAI semantics.

Using a separate adapter keeps those semantics explicit and avoids broadening the meaning of the existing OpenAI adapter.

Refs #1894
Refs #856
Refs #1443

Scope

This PR is limited to:

  • LLM only
  • no embedding changes
  • no x2text / OCR changes
  • no changes to the existing OpenAI adapter behavior

Notes

For custom models that are not mapped in LiteLLM's token metadata, usage recording now falls back gracefully instead of failing when estimating prompt tokens.

This keeps the adapter usable for custom OpenAI-compatible endpoints without changing pricing semantics in this PR.
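The fallback described above can be sketched as follows; the helper name and signature are illustrative, not the SDK's actual internals:

```python
import logging
from typing import Callable, Mapping, Optional

logger = logging.getLogger(__name__)


def resolve_prompt_tokens(
    usage: Optional[Mapping[str, int]],
    count_tokens: Callable[[], int],
) -> int:
    """Prefer the provider-reported prompt_tokens; otherwise call the
    estimator, and fall back to 0 if estimation fails (e.g. a model
    missing from LiteLLM's token metadata)."""
    usage_data = usage or {}
    prompt_tokens = usage_data.get("prompt_tokens")
    if prompt_tokens is None:
        try:
            prompt_tokens = count_tokens()
        except Exception as e:
            prompt_tokens = 0
            logger.warning("Failed to estimate prompt tokens: %s", e)
    return prompt_tokens
```

When the provider already reports usage, the estimator is never invoked, so unmapped custom models no longer surface estimation errors in the usage path.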

Validation

  • UV_SKIP_WHEEL_FILENAME_CHECK=1 uv run pytest tests/test_openai_compatible_adapter.py
  • UV_SKIP_WHEEL_FILENAME_CHECK=1 uv run ruff check src/unstract/sdk1/adapters/base1.py src/unstract/sdk1/adapters/llm1/__init__.py src/unstract/sdk1/adapters/llm1/openai_compatible.py src/unstract/sdk1/llm.py tests/test_openai_compatible_adapter.py

@CLAassistant

CLAassistant commented Apr 4, 2026

CLA assistant check
All committers have signed the CLA.

@coderabbitai
Contributor

coderabbitai bot commented Apr 4, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c35a8b53-2cee-4f8f-8bc2-54148529da44

📥 Commits

Reviewing files that changed from the base of the PR and between 5090773 and d3e1cad.

📒 Files selected for processing (3)
  • unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/custom_openai.json
  • unstract/sdk1/src/unstract/sdk1/llm.py
  • unstract/sdk1/tests/test_openai_compatible_adapter.py
✅ Files skipped from review due to trivial changes (1)
  • unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/custom_openai.json
🚧 Files skipped from review as they are similar to previous changes (2)
  • unstract/sdk1/src/unstract/sdk1/llm.py
  • unstract/sdk1/tests/test_openai_compatible_adapter.py

Summary by CodeRabbit

  • New Features

    • Added comprehensive support for OpenAI-compatible LLM providers, enabling flexible integration with OpenAI API-compatible services through customizable API endpoints and authentication credentials.
  • Bug Fixes

    • Enhanced error resilience for token counting operations to gracefully handle edge cases with unmapped models, ensuring usage tracking continues without interruption.
  • Documentation

    • Restructured LLM providers reference table in README.

Walkthrough

Adds support for OpenAI-compatible LLM providers: new OpenAICompatibleLLMAdapter and OpenAICompatibleLLMParameters, a JSON configuration schema, README provider-table updates, safer token-counting in usage recording, and tests for registration, validation, schema, and usage behavior.

Changes

Cohort / File(s): Summary

  • Documentation (README.md): Updated the "LLM Providers" table: replaced left-column entries with "OpenAI Compatible", reordered right-column providers, and removed one right-column provider cell, leaving a blank status cell.
  • Parameter Validation (unstract/sdk1/src/unstract/sdk1/adapters/base1.py): Added OpenAICompatibleLLMParameters with api_key, api_base, validate(), and validate_model() to normalize model names to custom_openai/{model}.
  • Adapter Implementation (unstract/sdk1/src/unstract/sdk1/adapters/llm1/openai_compatible.py, unstract/sdk1/src/unstract/sdk1/adapters/llm1/__init__.py): Added OpenAICompatibleLLMAdapter with static ID/metadata, provider custom_openai, and an icon, and exported it via the package __all__.
  • Configuration Schema (unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/custom_openai.json): Added a JSON schema defining required adapter_name and api_base, optional api_key, model (default gpt-4o-mini), max_tokens, max_retries, and timeout.
  • Usage / Error Handling (unstract/sdk1/src/unstract/sdk1/llm.py): Changed _record_usage to read prompt_tokens from provider usage when present; token_counter() is called only if it is missing, and exceptions are caught (prompt_tokens set to 0 with a warning logged) so the usage push proceeds.
  • Tests (unstract/sdk1/tests/test_openai_compatible_adapter.py): Added tests covering adapter registration, parameter validation/model normalization, schema contents, and usage-recording behavior, including token-counter failure handling.
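The model normalization described in the Parameter Validation entry above can be sketched as a standalone function (hypothetical; in the SDK this logic lives inside OpenAICompatibleLLMParameters.validate_model):

```python
CUSTOM_OPENAI_PREFIX = "custom_openai/"


def normalize_model(model: str) -> str:
    """Route a model name through LiteLLM's custom_openai provider,
    guarding against double-prefixing when the caller already supplied
    the prefixed form."""
    if model.startswith(CUSTOM_OPENAI_PREFIX):
        return model
    return CUSTOM_OPENAI_PREFIX + model
```

Both "gpt-4o-mini" and "custom_openai/gpt-4o-mini" normalize to the same route, which is the idempotence the tests exercise.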

Sequence Diagram

sequenceDiagram
    participant App as Application
    participant Adapter as OpenAI Compatible Adapter
    participant Validator as Parameter Validator
    participant LLM as LLM Instance
    participant TokenCounter as Token Counter
    participant Audit as Audit System

    App->>Adapter: create/instantiate with config
    Adapter->>Validator: validate(adapter_metadata)
    Validator->>Validator: normalize model -> custom_openai/{model}
    Validator-->>Adapter: return validated params
    App->>LLM: trigger _record_usage(usage_data)
    alt usage_data contains prompt_tokens
        LLM-->>TokenCounter: (skipped)
    else usage_data missing prompt_tokens
        LLM->>TokenCounter: token_counter(model, messages)
        alt TokenCounter succeeds
            TokenCounter-->>LLM: prompt token count
        else TokenCounter raises
            TokenCounter-->>LLM: exception
            LLM->>LLM: catch, set prompt_tokens = 0, log warning
        end
    end
    LLM->>Audit: push_usage_data(with prompt_tokens)
    Audit-->>LLM: acknowledge

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

  • Description check: ⚠️ Warning. The PR description covers most required sections: Summary, Why, and Scope are well-documented. However, critical sections like 'Can this PR break any existing features' and 'Notes on Testing' from the template are missing. Resolution: add an explicit 'Can this PR break existing features' section (confirming no impact to the existing OpenAI adapter) and expand 'Notes on Testing' with test command details and expected outcomes.
  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 0.00%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (1 passed)
  • Title check: ✅ Passed. The title 'Add a dedicated OpenAI-compatible LLM adapter' accurately and concisely summarizes the main change: introducing a new OpenAI-compatible LLM adapter while keeping the existing OpenAI adapter unchanged.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


@greptile-apps
Contributor

greptile-apps bot commented Apr 4, 2026

Greptile Summary

This PR introduces a dedicated OpenAICompatibleLLMAdapter backed by LiteLLM's custom_openai path, allowing users to connect to OpenAI-style chat-completion endpoints that are not the official OpenAI service. It also adds a graceful fallback in _record_usage so that token_counter failures for unmapped custom models no longer surface as exceptions.

  • New adapter (openai_compatible.py, base1.py): Clean implementation following the existing class hierarchy (BaseChatCompletionParameters → OpenAICompatibleLLMParameters → OpenAICompatibleLLMAdapter). The validate_model method correctly guards against double-prefixing with custom_openai/.
  • JSON schema (custom_openai.json): api_key is rightly optional (some private gateways do not require one); model ships with a sensible gpt-4o-mini default and the schema is well-formed.
  • _record_usage change (llm.py): The fallback path when token_counter raises is correct; usage data is still pushed to Audit with prompt_tokens=0 when estimation is impossible. The final usage_data.get("prompt_tokens", prompt_tokens or 0) expression is redundant (see inline comment) but functionally safe.
  • Tests: Cover registration, model-prefix normalisation, schema loading, and both branches of the new token_counter fallback logic. Well-structured with the previously-flagged concerns addressed (JSON parsed instead of raw-string matched, magic stub explained).
  • Minor: _load_llm_class() is defined but never called in the test file (dead code), and the usage_data.get in _record_usage can be simplified to prompt_tokens or 0 — both are P2 style items only.

Confidence Score: 5/5

Safe to merge — no P0 or P1 issues found; all remaining findings are style-level P2.

The new adapter follows existing codebase patterns exactly, the JSON schema and Pydantic model are consistent, the _record_usage fallback is logically correct, and the test suite covers the key behaviours. Only minor dead code and a redundant dict lookup were found.

No files require special attention. test_openai_compatible_adapter.py has a small dead-code item (_load_llm_class), and llm.py has a trivially redundant expression that could be cleaned up.

Important Files Changed

Filename: Overview

  • unstract/sdk1/src/unstract/sdk1/adapters/llm1/openai_compatible.py: New adapter class; clean implementation following existing patterns, with all required static methods present.
  • unstract/sdk1/src/unstract/sdk1/adapters/base1.py: Adds OpenAICompatibleLLMParameters with the correct custom_openai/ prefix guard and optional api_key.
  • unstract/sdk1/src/unstract/sdk1/llm.py: Adds a graceful token_counter fallback in _record_usage; the logic is correct but the final usage_data.get call is redundant.
  • unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/custom_openai.json: Well-formed schema; api_key correctly typed as nullable, and model has a sensible gpt-4o-mini default.
  • unstract/sdk1/tests/test_openai_compatible_adapter.py: Good coverage of registration, prefix normalisation, schema loading, and the usage-recording fallback; _load_llm_class() is dead code.
  • unstract/sdk1/src/unstract/sdk1/adapters/llm1/__init__.py: Import and __all__ entry added correctly for the new adapter.
  • README.md: Table updated to include OpenAI Compatible; the last row now has an empty right cell due to the odd provider count.

Class Diagram

%%{init: {'theme': 'neutral'}}%%
classDiagram
    class BaseAdapter {
        <<abstract>>
        +get_id()* str
        +get_name()* str
        +get_description()* str
        +get_provider()* str
        +get_icon()* str
        +get_json_schema() str
        +get_adapter_type()* AdapterTypes
    }
    class BaseChatCompletionParameters {
        +model: str
        +temperature: float
        +n: int
        +timeout: float
        +max_tokens: int
        +max_retries: int
        +validate()* dict
        +validate_model()* str
    }
    class OpenAICompatibleLLMParameters {
        +api_key: str | None
        +api_base: str
        +validate(adapter_metadata) dict
        +validate_model(adapter_metadata) str
    }
    class OpenAICompatibleLLMAdapter {
        +get_id() str
        +get_name() str
        +get_provider() str
        +get_adapter_type() AdapterTypes
    }
    class OpenAILLMParameters {
        +api_key: str
        +api_base: str
        +api_version: str
    }
    BaseChatCompletionParameters --|> BaseAdapter : inherits
    OpenAICompatibleLLMParameters --|> BaseChatCompletionParameters : inherits
    OpenAICompatibleLLMAdapter --|> OpenAICompatibleLLMParameters : inherits
    OpenAICompatibleLLMAdapter --|> BaseAdapter : inherits
    OpenAILLMParameters --|> BaseChatCompletionParameters : inherits
Prompt To Fix All With AI
This is a comment left during a code review.
Path: unstract/sdk1/tests/test_openai_compatible_adapter.py
Line: 28-29

Comment:
**`_load_llm_class()` is unused dead code**

This helper is defined but never called — every test function reaches into `_load_llm_module().LLM` directly. If it was meant to replace those inline accesses, the tests should use it; otherwise it can be removed to avoid confusion for future readers.

```suggestion

```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: unstract/sdk1/src/unstract/sdk1/llm.py
Line: 556-557

Comment:
**Redundant dict lookup obscures intent**

After the `if prompt_tokens is None:` block, `prompt_tokens` is already guaranteed to be non-`None` (either from `usage_data`, the `token_counter` result, or the `= 0` fallback). Re-fetching from `usage_data` with a default is therefore redundant and slightly obscures the logic.

Simplifying to the local variable makes the intent clearer:

```suggestion
        all_tokens = TokenCounterCompat(
            prompt_tokens=prompt_tokens or 0,
```

How can I resolve this? If you propose a fix, please make it concise.


Reviews (2): Last reviewed commit: "[pre-commit.ci] auto fixes from pre-comm..."

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
unstract/sdk1/src/unstract/sdk1/llm.py (1)

542-557: Avoid unconditional token estimation when usage already includes prompt tokens.

This currently computes token_counter() even when provider usage already has prompt tokens, which can create repeated warnings/noise for unmapped models without improving recorded usage.

♻️ Proposed refinement
-        try:
-            prompt_tokens = token_counter(model=model, messages=messages)
-        except Exception as e:
-            prompt_tokens = 0
-            logger.warning(
-                "[sdk1][LLM][%s][%s] Failed to estimate prompt tokens: %s",
-                model,
-                llm_api,
-                e,
-            )
         usage_data: Mapping[str, int] = usage or {}
+        prompt_tokens = usage_data.get("prompt_tokens")
+        if prompt_tokens is None:
+            try:
+                prompt_tokens = token_counter(model=model, messages=messages)
+            except Exception as e:
+                prompt_tokens = 0
+                logger.warning(
+                    "[sdk1][LLM][%s][%s] Failed to estimate prompt tokens: %s",
+                    model,
+                    llm_api,
+                    e,
+                )
         all_tokens = TokenCounterCompat(
-            prompt_tokens=usage_data.get("prompt_tokens", 0),
+            prompt_tokens=usage_data.get("prompt_tokens", prompt_tokens or 0),
             completion_tokens=usage_data.get("completion_tokens", 0),
             total_tokens=usage_data.get("total_tokens", 0),
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@unstract/sdk1/src/unstract/sdk1/llm.py` around lines 542 - 557, The code
unconditionally calls token_counter(model, messages) even when usage already
contains prompt token counts; change the logic in the block around token_counter
and TokenCounterCompat so you first check usage (usage_data = usage or {}) and
if usage_data.get("prompt_tokens") is present use that value for prompt_tokens
instead of calling token_counter; only call token_counter(model, messages)
inside the try/except when usage_data lacks prompt_tokens, preserving the
existing exception handling and the logger.warning path, and then construct
TokenCounterCompat using the values from usage_data (falling back to the
estimated prompt_tokens when used).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/custom_openai.json`:
- Around line 15-20: The schema for the "api_key" property currently only allows
a string which fails when runtime metadata contains null; update the "api_key"
entry in the JSON schema (the "api_key" property in custom_openai.json) to
permit null values by changing its type to accept both string and null (or add a
nullable:true equivalent) so stored configs with null pass validation and
editing flows.
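The suggested fix amounts to widening the property's type so that null is accepted alongside string. A minimal sketch, with assumed before/after schema fragments rather than the actual file contents:

```python
# Hypothetical "api_key" property fragments from custom_openai.json.
before = {"type": "string"}            # rejects stored configs with api_key=null
after = {"type": ["string", "null"]}   # null now passes validation

def accepts(schema: dict, value) -> bool:
    """Minimal type check for this one property (a stand-in for a full
    JSON-schema validator)."""
    types = schema["type"]
    if isinstance(types, str):
        types = [types]
    if value is None:
        return "null" in types
    return isinstance(value, str) and "string" in types
```

Private gateways that require no key can then store api_key as null without tripping schema validation during editing flows.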

---

Nitpick comments:
In `@unstract/sdk1/src/unstract/sdk1/llm.py`:
- Around line 542-557: The code unconditionally calls token_counter(model,
messages) even when usage already contains prompt token counts; change the logic
in the block around token_counter and TokenCounterCompat so you first check
usage (usage_data = usage or {}) and if usage_data.get("prompt_tokens") is
present use that value for prompt_tokens instead of calling token_counter; only
call token_counter(model, messages) inside the try/except when usage_data lacks
prompt_tokens, preserving the existing exception handling and the logger.warning
path, and then construct TokenCounterCompat using the values from usage_data
(falling back to the estimated prompt_tokens when used).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bf841637-54b7-4802-9156-7f56e899ca54

📥 Commits

Reviewing files that changed from the base of the PR and between 7983c98 and 5090773.

📒 Files selected for processing (7)
  • README.md
  • unstract/sdk1/src/unstract/sdk1/adapters/base1.py
  • unstract/sdk1/src/unstract/sdk1/adapters/llm1/__init__.py
  • unstract/sdk1/src/unstract/sdk1/adapters/llm1/openai_compatible.py
  • unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/custom_openai.json
  • unstract/sdk1/src/unstract/sdk1/llm.py
  • unstract/sdk1/tests/test_openai_compatible_adapter.py

@jimmyzhuu
Author

Addressed the review follow-ups.

  • allowed api_key = null in the OpenAI-compatible schema
  • avoided redundant token_counter() calls when provider usage already includes prompt_tokens
  • tightened the tests around both cases
  • added ERNIE / Baidu Qianfan as schema examples while keeping the adapter generic

Validation re-run:

  • UV_SKIP_WHEEL_FILENAME_CHECK=1 uv run pytest tests/test_openai_compatible_adapter.py
  • UV_SKIP_WHEEL_FILENAME_CHECK=1 uv run ruff check src/unstract/sdk1/llm.py tests/test_openai_compatible_adapter.py

@sonarqubecloud

sonarqubecloud bot commented Apr 4, 2026
