Merged
152 commits
2da4907
Execution backend - revamp
harini-venkataraman Feb 19, 2026
41eeef8
async flow
harini-venkataraman Feb 19, 2026
f66dfb2
Streaming progress to FE
harini-venkataraman Feb 24, 2026
95c6592
Removing multi hop in Prompt studio ide and structure tool
harini-venkataraman Feb 25, 2026
d8cc6cc
Merge origin/main into feat/execution-backend
Deepak-Kesavan Feb 28, 2026
44a2b3f
Merge remote-tracking branch 'origin/main' into feat/execution-backend
Deepak-Kesavan Mar 2, 2026
2f4f2dc
UN-3234 [FIX] Add beta tag to agentic prompt studio navigation item
Deepak-Kesavan Mar 2, 2026
d041201
Added executors for agentic prompt studio
harini-venkataraman Mar 2, 2026
0a0cfb1
Merge branch 'main' of github.com:Zipstack/unstract into feat/executi…
harini-venkataraman Mar 2, 2026
a4e1fd7
Merge branch 'main' of github.com:Zipstack/unstract into feat/executi…
harini-venkataraman Mar 2, 2026
ae77d6a
Added executors for agentic prompt studio
harini-venkataraman Mar 2, 2026
5c22956
Added executors for agentic prompt studio
harini-venkataraman Mar 2, 2026
3cc3213
Removed redundant envs
harini-venkataraman Mar 2, 2026
d0532f8
Removed redundant envs
harini-venkataraman Mar 2, 2026
6173df5
Removed redundant envs
harini-venkataraman Mar 3, 2026
bbe6f58
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 3, 2026
a3dc912
Removed redundant envs
harini-venkataraman Mar 3, 2026
98c8071
Merge branch 'main' of github.com:Zipstack/unstract into feat/executi…
harini-venkataraman Mar 3, 2026
21157ac
Merge branch 'feat/execution-backend' of github.com:Zipstack/unstract…
harini-venkataraman Mar 3, 2026
0216b59
Removed redundant envs
harini-venkataraman Mar 3, 2026
db81b9d
Removed redundant envs
harini-venkataraman Mar 3, 2026
e1da202
Removed redundant envs
harini-venkataraman Mar 3, 2026
d119797
Removed redundant envs
harini-venkataraman Mar 3, 2026
fbadbf8
Removed redundant envs
harini-venkataraman Mar 3, 2026
882296e
Removed redundant envs
harini-venkataraman Mar 4, 2026
6d3bbbf
Removed redundant envs
harini-venkataraman Mar 4, 2026
292460b
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 4, 2026
f35c0e6
Removed redundant envs
harini-venkataraman Mar 4, 2026
9bcb458
Merge branch 'feat/execution-backend' of github.com:Zipstack/unstract…
harini-venkataraman Mar 4, 2026
0cbd10a
adding worker for callbacks
harini-venkataraman Mar 4, 2026
2b1ab1e
adding worker for callbacks
harini-venkataraman Mar 5, 2026
4122f08
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 5, 2026
1ceb352
adding worker for callbacks
harini-venkataraman Mar 5, 2026
d69304d
Merge branch 'feat/execution-backend' of github.com:Zipstack/unstract…
harini-venkataraman Mar 5, 2026
7c1266b
adding worker for callbacks
harini-venkataraman Mar 5, 2026
0b84d9e
adding worker for callbacks
harini-venkataraman Mar 5, 2026
5b0629d
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 5, 2026
98ee4b9
Pluggable apps and plugins to fit the new async prompt execution arch…
harini-venkataraman Mar 6, 2026
2dffcef
Merge branch 'feat/execution-backend' of github.com:Zipstack/unstract…
harini-venkataraman Mar 6, 2026
3b35fb2
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 6, 2026
1ab6031
Pluggable apps and plugins to fit the new async prompt execution arch…
harini-venkataraman Mar 6, 2026
15c3daf
Merge branch 'feat/execution-backend' of github.com:Zipstack/unstract…
harini-venkataraman Mar 6, 2026
7ae1a74
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 6, 2026
fbf9c29
Pluggable apps and plugins to fit the new async prompt execution arch…
harini-venkataraman Mar 9, 2026
ec2f762
Merge branch 'feat/execution-backend' of github.com:Zipstack/unstract…
harini-venkataraman Mar 9, 2026
d6a3c5e
adding worker for callbacks
harini-venkataraman Mar 9, 2026
5c23ab0
adding worker for callbacks
harini-venkataraman Mar 9, 2026
525024f
adding worker for callbacks
harini-venkataraman Mar 9, 2026
a8cbce1
adding worker for callbacks
harini-venkataraman Mar 9, 2026
549f17a
adding worker for callbacks
harini-venkataraman Mar 9, 2026
f9b86a9
adding worker for callbacks
harini-venkataraman Mar 10, 2026
5369e5a
adding worker for callbacks
harini-venkataraman Mar 10, 2026
b5205ff
adding worker for callbacks
harini-venkataraman Mar 10, 2026
9659661
fix: write output files in agentic extraction pipeline
harini-venkataraman Mar 11, 2026
67eef62
UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in te…
harini-venkataraman Mar 11, 2026
3f4cc7d
Merge branch 'main' into feat/async-prompt-service-v2
harini-venkataraman Mar 11, 2026
a563a35
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 11, 2026
9b422da
Update docs
harini-venkataraman Mar 11, 2026
6a6e8e9
Merge branch 'feat/async-prompt-service-v2' of github.com:Zipstack/un…
harini-venkataraman Mar 11, 2026
817fc1c
UN-3266 fix: remove dead code with undefined names in fetch_response
harini-venkataraman Mar 11, 2026
d9bc50f
Un 3266 fix security hotspot tmp paths (#1851)
harini-venkataraman Mar 11, 2026
b715f64
UN-3266 fix: resolve SonarCloud bugs S2259 and S1244 in PR #1849
harini-venkataraman Mar 11, 2026
e9c23b2
UN-3266 fix: resolve SonarCloud code smells in PR #1849
harini-venkataraman Mar 11, 2026
f59755a
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 11, 2026
4bf9736
UN-3266 fix: wrap long log message in dispatcher.py to fix E501
harini-venkataraman Mar 11, 2026
0531870
UN-3266 fix: resolve remaining SonarCloud S117 naming violations
harini-venkataraman Mar 11, 2026
a2edb23
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 11, 2026
3f86131
UN-3266 fix: resolve remaining SonarCloud code smells in PR #1849
harini-venkataraman Mar 11, 2026
45e61c4
UN-3266 fix: resolve SonarCloud cognitive complexity and code smell v…
harini-venkataraman Mar 11, 2026
6391c6c
UN-3266 fix: remove unused RetrievalStrategy import from _handle_answ…
harini-venkataraman Mar 11, 2026
0af0484
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 11, 2026
807e405
UN-3266 fix: rename UsageHelper params to lowercase (N803)
harini-venkataraman Mar 11, 2026
9bdb3f5
UN-3266 fix: resolve remaining SonarCloud issues from check run 66691…
harini-venkataraman Mar 11, 2026
18eafe9
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 11, 2026
7a01a35
UN-3266 fix: remove unused locals in _handle_answer_prompt (F841)
harini-venkataraman Mar 11, 2026
3e5ce31
Merge branch 'main' into feat/async-prompt-service-v2
harini-venkataraman Mar 12, 2026
e3ca0c6
fix: resolve Biome linting errors in frontend source files
harini-venkataraman Mar 12, 2026
db3d8c2
fix: replace dynamic import of SharePermission with static import in …
harini-venkataraman Mar 12, 2026
a62a9fd
Merge branch 'main' into feat/async-prompt-service-v2
harini-venkataraman Mar 12, 2026
b3a90af
fix: resolve SonarCloud warnings in frontend components
harini-venkataraman Mar 12, 2026
4200ac1
Merge branch 'main' into feat/async-prompt-service-v2
ritwik-g Mar 12, 2026
1c58eb9
Merge branch 'main' into feat/async-prompt-service-v2
harini-venkataraman Mar 18, 2026
8fdb680
Merge branch 'main' into feat/async-prompt-service-v2
harini-venkataraman Mar 19, 2026
79adb41
Merge branch 'main' into feat/async-prompt-service-v2
harini-venkataraman Mar 19, 2026
9749083
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 19, 2026
e8515d5
Address PR #1849 review comments: fix null guards, dead code, and tes…
harini-venkataraman Mar 19, 2026
2be161b
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 19, 2026
7a740a2
Merge branch 'main' into feat/async-prompt-service-v2
harini-venkataraman Mar 19, 2026
3d9f540
Fix missing llm_usage_reason for summarize LLM usage tracking
harini-venkataraman Mar 23, 2026
26d8c4a
UN-3266 [FIX] Fix single-pass extraction routing in LegacyExecutor
harini-venkataraman Mar 23, 2026
4879b10
Fixing API deployment response mismatches
harini-venkataraman Mar 23, 2026
8057527
Fix single-pass extraction showing only one prompt result in real-time
harini-venkataraman Mar 25, 2026
d96a521
Move summarize from sync Django plugin to executor worker for IDE index
harini-venkataraman Mar 25, 2026
a40b681
Address PR #1849 review comments: null guards, thread safety
harini-venkataraman Mar 25, 2026
4966919
Add documentation to ExecutionResponse DTO describing result structure
harini-venkataraman Mar 25, 2026
8e29665
Fix PR review issues: IDOR, null guards, rollback, spinner, summarize…
harini-venkataraman Mar 26, 2026
58825ef
Merge branch 'main' into feat/async-prompt-service-v2
harini-venkataraman Mar 26, 2026
e1cec00
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 26, 2026
53fe9fc
Fix CI, tests, and add async prompt studio improvements
harini-venkataraman Mar 26, 2026
1468a97
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 26, 2026
9964c43
Fix pre-existing biome CI errors: import ordering and formatting
harini-venkataraman Mar 26, 2026
44f72f8
Fix ruff F821: add missing transaction import in prompt_studio_helper
harini-venkataraman Mar 26, 2026
bdf2916
Add input validation guards to bulk_fetch_response endpoint
harini-venkataraman Mar 26, 2026
3989ad4
Merge branch 'main' into feat/async-prompt-service-v2
kirtimanmishrazipstack Mar 31, 2026
0424443
Merge branch 'main' into feat/async-prompt-service-v2
harini-venkataraman Mar 31, 2026
28f2224
IDE Call backs
harini-venkataraman Mar 31, 2026
834df68
Sonar issues fix
harini-venkataraman Mar 31, 2026
8ed6b47
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 31, 2026
2b333a3
Fix ruff errors: restore summary_profile variable, suppress TC001 in …
harini-venkataraman Mar 31, 2026
0628ed1
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 31, 2026
666563a
Update bun.lock to match package.json dependency ranges
harini-venkataraman Mar 31, 2026
e489e45
Fix all biome lint warnings: empty blocks, missing braces, forEach re…
harini-venkataraman Mar 31, 2026
0aae584
Move ExecutionContext import into TYPE_CHECKING block
harini-venkataraman Mar 31, 2026
2ba2c21
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Mar 31, 2026
3e7f808
Fix SonarQube issues: duplication, naming, nesting, unused var
harini-venkataraman Mar 31, 2026
42eaed8
Replace worker-ide-callback Dockerfile with worker-unified
harini-venkataraman Mar 31, 2026
ac58c0e
Add celery_executor_agentic queue to executor worker
harini-venkataraman Apr 2, 2026
8114849
Fixing email enforce type
harini-venkataraman Apr 3, 2026
2b35695
Removing line-item from select choices
harini-venkataraman Apr 3, 2026
0deb08d
Merge main
harini-venkataraman Apr 3, 2026
b5afee1
Update workers/shared/enums/worker_enums_base.py
harini-venkataraman Apr 3, 2026
19ea4fc
Update backend/workflow_manager/workflow_v2/workflow_helper.py
harini-venkataraman Apr 3, 2026
c6cdffb
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 3, 2026
0879e82
Fix false success logs and silent failures in ETL destination pipelines
harini-venkataraman Apr 3, 2026
822e040
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 3, 2026
1eae4e2
Merge branch 'main' into fix/agentic-executor-queue
kirtimanmishrazipstack Apr 3, 2026
802eddb
Revert ETL destination pipeline changes — deferring to next cut
harini-venkataraman Apr 3, 2026
0b16930
Fix false success logs and missing data in ETL destination pipelines
harini-venkataraman Apr 6, 2026
0bea4cc
Fix missing context_retrieval metric for single pass extraction
harini-venkataraman Apr 6, 2026
b53b37b
Fix Unstructured IO adapter PermissionError on remote storage
harini-venkataraman Apr 6, 2026
13f25d4
Defer subscription usage tracking to IDE callback workers
harini-venkataraman Apr 6, 2026
7344e61
Fix missing embedding metadata in API deployment with chunking
harini-venkataraman Apr 6, 2026
c02ef1b
Fix email enforce type returning "NA" string and surface null in FE
harini-venkataraman Apr 6, 2026
c7aacc8
Feat/line item executor plugin (#1899)
harini-venkataraman Apr 6, 2026
a0884e7
Guard against undefined connector type in PostHog event lookup
harini-venkataraman Apr 6, 2026
3dd8b56
Add worker-executor-v2 service to docker-compose under workers-v2 pro…
harini-venkataraman Apr 6, 2026
7757586
Feat/line item executor plugin (#1900)
harini-venkataraman Apr 6, 2026
98ec941
Task pipeline fixes
harini-venkataraman Apr 6, 2026
8334515
Merge branch 'fix/agentic-executor-queue' of github.com:Zipstack/unst…
harini-venkataraman Apr 6, 2026
0a91221
Fixing null fonts
harini-venkataraman Apr 6, 2026
e1bbc80
Merge branch 'main' into fix/agentic-executor-queue
harini-venkataraman Apr 6, 2026
6f2ce13
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 6, 2026
0533ced
Fix biome formatting in DisplayPromptResult
harini-venkataraman Apr 6, 2026
9a6f31a
Drop unlabeled LLM rows from per-model usage breakdown
harini-venkataraman Apr 6, 2026
095c7d1
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 6, 2026
e35af2f
Fix Sonar issues: cognitive complexity, params, dup, test smells
harini-venkataraman Apr 6, 2026
7421f3b
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 6, 2026
1a79030
Addressing greptile comments
harini-venkataraman Apr 6, 2026
5c3b67c
Addressing greptile comments
harini-venkataraman Apr 6, 2026
adda29e
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Apr 6, 2026
1b0a1e1
Address PR review on legacy_executor single-pass and extraction strea…
harini-venkataraman Apr 7, 2026
c5b5da6
Merge branch 'main' into fix/agentic-executor-queue
harini-venkataraman Apr 7, 2026
@@ -18,7 +18,6 @@
 from utils.file_storage.constants import FileStorageKeys
 from utils.file_storage.helpers.prompt_studio_file_helper import PromptStudioFileHelper
 from utils.local_context import StateStore
-from utils.subscription_usage_decorator import track_subscription_usage_if_available

 from backend.celery_service import app as celery_app
 from prompt_studio.prompt_profile_manager_v2.models import ProfileManager
@@ -1235,7 +1234,6 @@ def fetch_prompt_from_tool(tool_id: str) -> list[ToolStudioPrompt]:
         return prompt_instances

     @staticmethod
-    @track_subscription_usage_if_available(file_execution_id_param="run_id")
     def index_document(
         tool_id: str,
         file_name: str,
@@ -1424,7 +1422,6 @@ def summarize(file_name, org_id, run_id, tool) -> str:
         return summarize_file_path

     @staticmethod
-    @track_subscription_usage_if_available(file_execution_id_param="run_id")
     def prompt_responder(
         tool_id: str,
         org_id: str,
@@ -14,7 +14,8 @@
     "date":"date",
     "boolean":"boolean",
     "json":"json",
-    "table":"table"
+    "table":"table",
+    "line-item":"line-item"
   },
   "output_processing":{
     "DEFAULT":"Default"
19 changes: 19 additions & 0 deletions backend/usage_v2/helper.py
@@ -133,6 +133,25 @@ def get_usage_by_model(run_id: str) -> dict[str, list[dict[str, Any]]]:
         for row in rows:
             usage_type = row["usage_type"]
             llm_reason = row["llm_usage_reason"]
+
+            # Drop unlabeled LLM rows entirely. Per the Usage model
+            # contract (see usage_v2/models.py: llm_usage_reason
+            # db_comment), an empty reason is only valid when
+            # usage_type == "embedding". An empty reason combined with
+            # usage_type == "llm" is a producer-side bug (an LLM call
+            # site forgot to pass llm_usage_reason in usage_kwargs).
+            # Without this guard the row would surface in API
+            # deployment responses as a malformed bare "llm" bucket
+            # with no token breakdown.
+            if usage_type == "llm" and not llm_reason:
+                logger.warning(
+                    "Dropping unlabeled LLM usage row from per-model "
+                    "breakdown: model_name=%s run_id=%s",
+                    row["model_name"],
+                    run_id,
+                )
+                continue
+
             cost_str = UsageHelper._format_float_positional(row["sum_cost"] or 0.0)

             key = usage_type
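The guard added in this hunk can be looked at in isolation. The following is a minimal standalone sketch, not the helper itself: `group_usage_rows` is a hypothetical function, and the `<reason>_<usage_type>` key scheme is an assumption inferred from the bucket names asserted in the regression tests (`extraction_llm`, `embedding`).

```python
import logging
from typing import Any

logger = logging.getLogger(__name__)


def group_usage_rows(
    rows: list[dict[str, Any]], run_id: str
) -> dict[str, list[dict[str, Any]]]:
    """Group aggregated usage rows into buckets, skipping unlabeled LLM rows.

    Hypothetical stand-in for the per-model breakdown; only the guard
    mirrors the diff.
    """
    grouped: dict[str, list[dict[str, Any]]] = {}
    for row in rows:
        usage_type = row["usage_type"]
        llm_reason = row["llm_usage_reason"]

        # An empty reason is only expected for embedding rows, so an
        # unlabeled "llm" row is logged and skipped.
        if usage_type == "llm" and not llm_reason:
            logger.warning(
                "Dropping unlabeled LLM usage row: model_name=%s run_id=%s",
                row["model_name"],
                run_id,
            )
            continue

        # Assumed key scheme: "<reason>_<usage_type>" for labelled rows,
        # otherwise the bare usage type (e.g. "embedding").
        key = f"{llm_reason}_{usage_type}" if llm_reason else usage_type
        grouped.setdefault(key, []).append({"model_name": row["model_name"]})
    return grouped


sample_rows = [
    {"usage_type": "llm", "llm_usage_reason": "extraction", "model_name": "gpt-4o"},
    {"usage_type": "llm", "llm_usage_reason": "", "model_name": "gpt-4o"},
    {
        "usage_type": "embedding",
        "llm_usage_reason": "",
        "model_name": "text-embedding-3-small",
    },
]
buckets = group_usage_rows(sample_rows, run_id="run-1")
```

Scoping the check to `usage_type == "llm"` is what keeps embedding rows, which legitimately carry an empty reason, in the result.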
Empty file.
251 changes: 251 additions & 0 deletions backend/usage_v2/tests/test_helper.py
@@ -0,0 +1,251 @@
+"""Regression tests for ``UsageHelper.get_usage_by_model``.
+
+These tests cover the defensive filter that drops unlabeled LLM rows
+from the per-model usage breakdown. The filter prevents a malformed
+bare ``"llm"`` bucket from leaking into API deployment responses when
+a producer-side LLM call site forgets to set ``llm_usage_reason``.
+
+The tests deliberately do not require a live Django database — the
+backend test environment has no ``pytest-django``, no SQLite fallback,
+and uses ``django-tenants`` against Postgres in production. Instead
+the tests stub ``account_usage.models`` and ``usage_v2.models`` in
+``sys.modules`` *before* importing the helper, so the helper module
+loads cleanly without triggering Django's app registry checks. The
+fake ``Usage.objects.filter`` chain returns a deterministic list of
+row dicts shaped exactly like the real ``.values(...).annotate(...)``
+queryset rows the helper iterates over.
+"""
+
+from __future__ import annotations
+
+import sys
+import types
+from typing import Any
+from unittest.mock import MagicMock
+
+
+# ---------------------------------------------------------------------------
+# Module-level stubs. Must run BEFORE ``usage_v2.helper`` is imported, so we
+# do it at import time and capture the helper reference for the tests below.
+# ---------------------------------------------------------------------------
+
+
+def _install_stubs() -> tuple[Any, Any]:
+    """Install fake ``account_usage.models`` and ``usage_v2.models`` modules
+    so that ``usage_v2.helper`` can be imported without Django being set up.
+
+    Returns ``(UsageHelper, FakeUsage)`` — the helper class to test and the
+    fake Usage class whose ``objects.filter`` we will swap per-test.
+    """
+    # Fake account_usage package + models module
+    if "account_usage" not in sys.modules:
+        account_usage_pkg = types.ModuleType("account_usage")
+        account_usage_pkg.__path__ = []  # mark as package
+        sys.modules["account_usage"] = account_usage_pkg
+    if "account_usage.models" not in sys.modules:
+        account_usage_models = types.ModuleType("account_usage.models")
+        account_usage_models.PageUsage = MagicMock(name="PageUsage")
+        sys.modules["account_usage.models"] = account_usage_models
+
+    # Fake usage_v2.models with a Usage class whose ``objects`` is a
+    # MagicMock (so each test can rebind ``filter.return_value``).
+    if "usage_v2.models" not in sys.modules or not hasattr(
+        sys.modules["usage_v2.models"], "_is_test_stub"
+    ):
+        usage_v2_models = types.ModuleType("usage_v2.models")
+        usage_v2_models._is_test_stub = True
+
+        class _FakeUsage:
+            objects = MagicMock(name="Usage.objects")
+
+        usage_v2_models.Usage = _FakeUsage
+        sys.modules["usage_v2.models"] = usage_v2_models
+
+    # Now import the helper — this picks up our stubs.
+    from usage_v2.helper import UsageHelper
+
+    return UsageHelper, sys.modules["usage_v2.models"].Usage
+
+
+UsageHelper, FakeUsage = _install_stubs()
+
+
+# ---------------------------------------------------------------------------
+# Helpers
+# ---------------------------------------------------------------------------
+
+
+class _StubQueryset:
+    """Mimic the chain ``.filter(...).values(...).annotate(...)``."""
+
+    def __init__(self, rows: list[dict[str, Any]]) -> None:
+        self._rows = rows
+
+    def values(self, *args: Any, **kwargs: Any) -> _StubQueryset:
+        return self
+
+    def annotate(self, *args: Any, **kwargs: Any) -> list[dict[str, Any]]:
+        return self._rows
+
+
+def _row(
+    *,
+    usage_type: str,
+    llm_reason: str,
+    model_name: str = "gpt-4o",
+    sum_input: int = 0,
+    sum_output: int = 0,
+    sum_total: int = 0,
+    sum_embedding: int = 0,
+    sum_cost: float = 0.0,
+) -> dict[str, Any]:
+    """Build a row matching the shape returned by the helper's
+    ``.values(...).annotate(...)`` queryset.
+    """
+    return {
+        "usage_type": usage_type,
+        "llm_usage_reason": llm_reason,
+        "model_name": model_name,
+        "sum_input_tokens": sum_input,
+        "sum_output_tokens": sum_output,
+        "sum_total_tokens": sum_total,
+        "sum_embedding_tokens": sum_embedding,
+        "sum_cost": sum_cost,
+    }
+
+
+def _stub_rows(rows: list[dict[str, Any]]) -> None:
+    """Make ``Usage.objects.filter(...).values(...).annotate(...)`` yield
+    the given rows when the helper is invoked next.
+    """
+    FakeUsage.objects.filter.return_value = _StubQueryset(rows)
+
+
+# ---------------------------------------------------------------------------
+# Tests
+# ---------------------------------------------------------------------------
+
+
+def test_unlabeled_llm_row_is_dropped() -> None:
+    """An ``llm`` row with empty ``llm_usage_reason`` must not produce a
+    bare ``"llm"`` bucket in the response — it should be silently
+    dropped, while the legitimate extraction row is preserved.
+    """
+    _stub_rows(
+        [
+            _row(
+                usage_type="llm",
+                llm_reason="extraction",
+                sum_input=100,
+                sum_output=50,
+                sum_total=150,
+                sum_cost=0.05,
+            ),
+            _row(
+                usage_type="llm",
+                llm_reason="",  # the bug — no reason set
+                sum_cost=0.01,
+            ),
+        ]
+    )
+
+    result = UsageHelper.get_usage_by_model("00000000-0000-0000-0000-000000000001")
+
+    assert "llm" not in result, (
+        "Unlabeled llm row should be dropped — bare 'llm' bucket leaked"
+    )
+    assert "extraction_llm" in result
+    assert len(result["extraction_llm"]) == 1
+    entry = result["extraction_llm"][0]
+    assert entry["model_name"] == "gpt-4o"
+    assert entry["input_tokens"] == 100
+    assert entry["output_tokens"] == 50
+    assert entry["total_tokens"] == 150
+    assert entry["cost_in_dollars"] == "0.05"
+
+
+def test_embedding_row_is_preserved() -> None:
+    """An ``embedding`` row legitimately has empty ``llm_usage_reason``;
+    the defensive filter must NOT drop it. Proves the guard is scoped
+    to ``usage_type == 'llm'``.
+    """
+    _stub_rows(
+        [
+            _row(
+                usage_type="embedding",
+                llm_reason="",
+                model_name="text-embedding-3-small",
+                sum_embedding=200,
+                sum_cost=0.001,
+            ),
+        ]
+    )
+
+    result = UsageHelper.get_usage_by_model("00000000-0000-0000-0000-000000000002")
+
+    assert "embedding" in result, "Embedding row was incorrectly dropped"
+    assert len(result["embedding"]) == 1
+    entry = result["embedding"][0]
+    assert entry["model_name"] == "text-embedding-3-small"
+    assert entry["embedding_tokens"] == 200
+    assert entry["cost_in_dollars"] == "0.001"
+
+
+def test_all_three_llm_reasons_coexist() -> None:
+    """All three labelled LLM buckets (extraction, challenge, summarize)
+    must appear with correct token counts when present.
+    """
+    _stub_rows(
+        [
+            _row(
+                usage_type="llm",
+                llm_reason="extraction",
+                model_name="gpt-4o",
+                sum_input=100,
+                sum_output=50,
+                sum_total=150,
+                sum_cost=0.05,
+            ),
+            _row(
+                usage_type="llm",
+                llm_reason="challenge",
+                model_name="gpt-4o-mini",
+                sum_input=20,
+                sum_output=10,
+                sum_total=30,
+                sum_cost=0.002,
+            ),
+            _row(
+                usage_type="llm",
+                llm_reason="summarize",
+                model_name="gpt-4o",
+                sum_input=300,
+                sum_output=80,
+                sum_total=380,
+                sum_cost=0.07,
+            ),
+        ]
+    )
+
+    result = UsageHelper.get_usage_by_model("00000000-0000-0000-0000-000000000003")
+
+    assert set(result.keys()) == {"extraction_llm", "challenge_llm", "summarize_llm"}
+    assert "llm" not in result
+
+    extraction = result["extraction_llm"][0]
+    assert extraction["model_name"] == "gpt-4o"
+    assert extraction["input_tokens"] == 100
+    assert extraction["output_tokens"] == 50
+    assert extraction["total_tokens"] == 150
+
+    challenge = result["challenge_llm"][0]
+    assert challenge["model_name"] == "gpt-4o-mini"
+    assert challenge["input_tokens"] == 20
+    assert challenge["output_tokens"] == 10
+    assert challenge["total_tokens"] == 30
+
+    summarize = result["summarize_llm"][0]
+    assert summarize["model_name"] == "gpt-4o"
+    assert summarize["input_tokens"] == 300
+    assert summarize["output_tokens"] == 80
+    assert summarize["total_tokens"] == 380
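The test module's docstring describes stubbing modules in `sys.modules` before importing the code under test. A minimal sketch of that technique on its own, assuming a made-up package name (`fake_orm` and its `Usage` attribute are hypothetical and exist only for this illustration):

```python
import sys
import types
from unittest.mock import MagicMock

# Hypothetical package and module names, used only for this sketch.
fake_pkg = types.ModuleType("fake_orm")
fake_pkg.__path__ = []  # an (empty) __path__ marks the module as a package
fake_models = types.ModuleType("fake_orm.models")
fake_models.Usage = MagicMock(name="Usage")

# Register the stubs before anything tries to import the real modules.
sys.modules["fake_orm"] = fake_pkg
sys.modules["fake_orm.models"] = fake_models

# The import system consults sys.modules first, so this resolves to the
# stub without searching the filesystem.
from fake_orm.models import Usage

# MagicMock attributes are created on access, so a test can rebind the
# return value of a chained call such as ``objects.filter(...)``.
Usage.objects.filter.return_value = [{"usage_type": "embedding"}]
rows = Usage.objects.filter(run_id="run-1")
```

The ordering is the important part: any module that does `from fake_orm.models import Usage` at import time has to be imported after the stubs are registered, which is why the test file installs them at module level.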
@@ -84,7 +84,7 @@ function DisplayPromptResult({
     );
   }

-  if (output === undefined || output === null) {
+  if (output === undefined) {
     return (
       <Typography.Text className="prompt-not-ran">
         <span>
@@ -95,6 +95,12 @@ function DisplayPromptResult({
     );
   }

+  if (output === null) {
+    return (
+      <Typography.Text className="prompt-output-result">null</Typography.Text>
+    );
+  }
+
   // Extract confidence from 5th element of highlight data coordinate arrays
   const extractConfidenceFromHighlightData = (data) => {
     if (!data) {
@@ -40,9 +40,10 @@ const baseProps = {
 };

 describe("DisplayPromptResult null/undefined guard", () => {
-  it("shows 'Yet to run' when output is null", () => {
+  it("shows 'null' literal when output is null (ran but produced no value)", () => {
     render(<DisplayPromptResult {...baseProps} output={null} />);
-    expect(screen.getByText(/Yet to run/)).toBeInTheDocument();
+    expect(screen.getByText("null")).toBeInTheDocument();
+    expect(screen.queryByText(/Yet to run/)).not.toBeInTheDocument();
   });

   it("shows 'Yet to run' when output is undefined", () => {
@@ -279,8 +279,8 @@ function ConfigureDs({

     url = getUrl("connector/");

-    const eventKey = `${type.toUpperCase()}`;
-    if (posthogConnectorAddedEventText[eventKey]) {
+    const eventKey = type?.toUpperCase();
+    if (eventKey && posthogConnectorAddedEventText[eventKey]) {
       setPostHogCustomEvent(posthogConnectorAddedEventText[eventKey], {
         info: `Clicked on 'Submit' button`,
         connector_name: selectedSourceName,
22 changes: 11 additions & 11 deletions unstract/sdk1/src/unstract/sdk1/adapters/x2text/helper.py
@@ -1,4 +1,6 @@
 import logging
+import os
+from io import BytesIO
 from typing import Any

 import requests
@@ -72,17 +74,15 @@ def process_document(
         fs = FileStorage(provider=FileStorageProvider.LOCAL)
         try:
             response: Response
-            local_storage = FileStorage(FileStorageProvider.LOCAL)
-            if not local_storage.exists(input_file_path):
-                fs.download(from_path=input_file_path, to_path=input_file_path)
-            with open(input_file_path, "rb") as input_f:
-                mime_type = local_storage.mime_type(path=input_file_path)
-                files = {"file": (input_file_path, input_f, mime_type)}
-                response = UnstructuredHelper.make_request(
-                    unstructured_adapter_config=unstructured_adapter_config,
-                    request_type=UnstructuredHelper.PROCESS,
-                    files=files,
-                )
+            file_bytes = fs.read(path=input_file_path, mode="rb")
+            mime_type = fs.mime_type(path=input_file_path)
+            file_name = os.path.basename(input_file_path)
+            files = {"file": (file_name, BytesIO(file_bytes), mime_type)}
+            response = UnstructuredHelper.make_request(
+                unstructured_adapter_config=unstructured_adapter_config,
+                request_type=UnstructuredHelper.PROCESS,
+                files=files,
+            )
             output, is_success = X2TextHelper.parse_response(
                 response=response, out_file_path=output_file_path, fs=fs
             )
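The change above stops opening `input_file_path` on the local disk and instead wraps bytes read through the storage layer in a `BytesIO`, sending only the base name as the upload file name. A minimal sketch of that payload shape, assuming the bytes have already been fetched (`build_upload_files` and the sample path are hypothetical, and no request is sent):

```python
import os
from io import BytesIO
from typing import BinaryIO


def build_upload_files(
    path: str, file_bytes: bytes, mime_type: str
) -> dict[str, tuple[str, BinaryIO, str]]:
    """Build a requests-style ``files`` mapping from in-memory bytes.

    Hypothetical helper for illustration; the tuple layout
    (file name, file object, content type) follows the diff.
    """
    # Only the base name is sent, so storage-specific path prefixes do
    # not end up in the multipart file name.
    file_name = os.path.basename(path)
    return {"file": (file_name, BytesIO(file_bytes), mime_type)}


files = build_upload_files(
    "remote/bucket/invoice.pdf", b"%PDF-1.7", "application/pdf"
)
name, stream, mime = files["file"]
payload = stream.read()
```

Because the file object is an in-memory buffer, nothing here depends on the path being readable by the local process, which appears to be the failure mode the commit title ("PermissionError on remote storage") refers to.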