UN-3266 [FIX] Async execution backend stabilization #1903
Merged
152 commits
2da4907
Execution backend - revamp
harini-venkataraman 41eeef8
async flow
harini-venkataraman f66dfb2
Streaming progress to FE
harini-venkataraman 95c6592
Removing multi hop in Prompt studio ide and structure tool
harini-venkataraman d8cc6cc
Merge origin/main into feat/execution-backend
Deepak-Kesavan 44a2b3f
Merge remote-tracking branch 'origin/main' into feat/execution-backend
Deepak-Kesavan 2f4f2dc
UN-3234 [FIX] Add beta tag to agentic prompt studio navigation item
Deepak-Kesavan d041201
Added executors for agentic prompt studio
harini-venkataraman 0a0cfb1
Merge branch 'main' of github.com:Zipstack/unstract into feat/executi…
harini-venkataraman a4e1fd7
Merge branch 'main' of github.com:Zipstack/unstract into feat/executi…
harini-venkataraman ae77d6a
Added executors for agentic prompt studio
harini-venkataraman 5c22956
Added executors for agentic prompt studio
harini-venkataraman 3cc3213
Removed redundant envs
harini-venkataraman d0532f8
Removed redundant envs
harini-venkataraman 6173df5
Removed redundant envs
harini-venkataraman bbe6f58
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] a3dc912
Removed redundant envs
harini-venkataraman 98c8071
Merge branch 'main' of github.com:Zipstack/unstract into feat/executi…
harini-venkataraman 21157ac
Merge branch 'feat/execution-backend' of github.com:Zipstack/unstract…
harini-venkataraman 0216b59
Removed redundant envs
harini-venkataraman db81b9d
Removed redundant envs
harini-venkataraman e1da202
Removed redundant envs
harini-venkataraman d119797
Removed redundant envs
harini-venkataraman fbadbf8
Removed redundant envs
harini-venkataraman 882296e
Removed redundant envs
harini-venkataraman 6d3bbbf
Removed redundant envs
harini-venkataraman 292460b
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] f35c0e6
Removed redundant envs
harini-venkataraman 9bcb458
Merge branch 'feat/execution-backend' of github.com:Zipstack/unstract…
harini-venkataraman 0cbd10a
adding worker for callbacks
harini-venkataraman 2b1ab1e
adding worker for callbacks
harini-venkataraman 4122f08
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 1ceb352
adding worker for callbacks
harini-venkataraman d69304d
Merge branch 'feat/execution-backend' of github.com:Zipstack/unstract…
harini-venkataraman 7c1266b
adding worker for callbacks
harini-venkataraman 0b84d9e
adding worker for callbacks
harini-venkataraman 5b0629d
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 98ee4b9
Pluggable apps and plugins to fit the new async prompt execution arch…
harini-venkataraman 2dffcef
Merge branch 'feat/execution-backend' of github.com:Zipstack/unstract…
harini-venkataraman 3b35fb2
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 1ab6031
Pluggable apps and plugins to fit the new async prompt execution arch…
harini-venkataraman 15c3daf
Merge branch 'feat/execution-backend' of github.com:Zipstack/unstract…
harini-venkataraman 7ae1a74
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] fbf9c29
Pluggable apps and plugins to fit the new async prompt execution arch…
harini-venkataraman ec2f762
Merge branch 'feat/execution-backend' of github.com:Zipstack/unstract…
harini-venkataraman d6a3c5e
adding worker for callbacks
harini-venkataraman 5c23ab0
adding worker for callbacks
harini-venkataraman 525024f
adding worker for callbacks
harini-venkataraman a8cbce1
adding worker for callbacks
harini-venkataraman 549f17a
adding worker for callbacks
harini-venkataraman f9b86a9
adding worker for callbacks
harini-venkataraman 5369e5a
adding worker for callbacks
harini-venkataraman b5205ff
adding worker for callbacks
harini-venkataraman 9659661
fix: write output files in agentic extraction pipeline
harini-venkataraman 67eef62
UN-3266 fix: replace hardcoded /tmp paths with secure temp dirs in te…
harini-venkataraman 3f4cc7d
Merge branch 'main' into feat/async-prompt-service-v2
harini-venkataraman a563a35
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 9b422da
Update docs
harini-venkataraman 6a6e8e9
Merge branch 'feat/async-prompt-service-v2' of github.com:Zipstack/un…
harini-venkataraman 817fc1c
UN-3266 fix: remove dead code with undefined names in fetch_response
harini-venkataraman d9bc50f
Un 3266 fix security hotspot tmp paths (#1851)
harini-venkataraman b715f64
UN-3266 fix: resolve SonarCloud bugs S2259 and S1244 in PR #1849
harini-venkataraman e9c23b2
UN-3266 fix: resolve SonarCloud code smells in PR #1849
harini-venkataraman f59755a
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 4bf9736
UN-3266 fix: wrap long log message in dispatcher.py to fix E501
harini-venkataraman 0531870
UN-3266 fix: resolve remaining SonarCloud S117 naming violations
harini-venkataraman a2edb23
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 3f86131
UN-3266 fix: resolve remaining SonarCloud code smells in PR #1849
harini-venkataraman 45e61c4
UN-3266 fix: resolve SonarCloud cognitive complexity and code smell v…
harini-venkataraman 6391c6c
UN-3266 fix: remove unused RetrievalStrategy import from _handle_answ…
harini-venkataraman 0af0484
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 807e405
UN-3266 fix: rename UsageHelper params to lowercase (N803)
harini-venkataraman 9bdb3f5
UN-3266 fix: resolve remaining SonarCloud issues from check run 66691…
harini-venkataraman 18eafe9
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 7a01a35
UN-3266 fix: remove unused locals in _handle_answer_prompt (F841)
harini-venkataraman 3e5ce31
Merge branch 'main' into feat/async-prompt-service-v2
harini-venkataraman e3ca0c6
fix: resolve Biome linting errors in frontend source files
harini-venkataraman db3d8c2
fix: replace dynamic import of SharePermission with static import in …
harini-venkataraman a62a9fd
Merge branch 'main' into feat/async-prompt-service-v2
harini-venkataraman b3a90af
fix: resolve SonarCloud warnings in frontend components
harini-venkataraman 4200ac1
Merge branch 'main' into feat/async-prompt-service-v2
ritwik-g 1c58eb9
Merge branch 'main' into feat/async-prompt-service-v2
harini-venkataraman 8fdb680
Merge branch 'main' into feat/async-prompt-service-v2
harini-venkataraman 79adb41
Merge branch 'main' into feat/async-prompt-service-v2
harini-venkataraman 9749083
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] e8515d5
Address PR #1849 review comments: fix null guards, dead code, and tes…
harini-venkataraman 2be161b
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 7a740a2
Merge branch 'main' into feat/async-prompt-service-v2
harini-venkataraman 3d9f540
Fix missing llm_usage_reason for summarize LLM usage tracking
harini-venkataraman 26d8c4a
UN-3266 [FIX] Fix single-pass extraction routing in LegacyExecutor
harini-venkataraman 4879b10
Fixing API deployment response mismatches
harini-venkataraman 8057527
Fix single-pass extraction showing only one prompt result in real-time
harini-venkataraman d96a521
Move summarize from sync Django plugin to executor worker for IDE index
harini-venkataraman a40b681
Address PR #1849 review comments: null guards, thread safety
harini-venkataraman 4966919
Add documentation to ExecutionResponse DTO describing result structure
harini-venkataraman 8e29665
Fix PR review issues: IDOR, null guards, rollback, spinner, summarize…
harini-venkataraman 58825ef
Merge branch 'main' into feat/async-prompt-service-v2
harini-venkataraman e1cec00
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 53fe9fc
Fix CI, tests, and add async prompt studio improvements
harini-venkataraman 1468a97
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 9964c43
Fix pre-existing biome CI errors: import ordering and formatting
harini-venkataraman 44f72f8
Fix ruff F821: add missing transaction import in prompt_studio_helper
harini-venkataraman bdf2916
Add input validation guards to bulk_fetch_response endpoint
harini-venkataraman 3989ad4
Merge branch 'main' into feat/async-prompt-service-v2
kirtimanmishrazipstack 0424443
Merge branch 'main' into feat/async-prompt-service-v2
harini-venkataraman 28f2224
IDE Call backs
harini-venkataraman 834df68
Sonar issues fix
harini-venkataraman 8ed6b47
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 2b333a3
Fix ruff errors: restore summary_profile variable, suppress TC001 in …
harini-venkataraman 0628ed1
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 666563a
Update bun.lock to match package.json dependency ranges
harini-venkataraman e489e45
Fix all biome lint warnings: empty blocks, missing braces, forEach re…
harini-venkataraman 0aae584
Move ExecutionContext import into TYPE_CHECKING block
harini-venkataraman 2ba2c21
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 3e7f808
Fix SonarQube issues: duplication, naming, nesting, unused var
harini-venkataraman 42eaed8
Replace worker-ide-callback Dockerfile with worker-unified
harini-venkataraman ac58c0e
Add celery_executor_agentic queue to executor worker
harini-venkataraman 8114849
Fixing email enforce type
harini-venkataraman 2b35695
Removing line-item from select choices
harini-venkataraman 0deb08d
Merge main
harini-venkataraman b5afee1
Update workers/shared/enums/worker_enums_base.py
harini-venkataraman 19ea4fc
Update backend/workflow_manager/workflow_v2/workflow_helper.py
harini-venkataraman c6cdffb
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 0879e82
Fix false success logs and silent failures in ETL destination pipelines
harini-venkataraman 822e040
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 1eae4e2
Merge branch 'main' into fix/agentic-executor-queue
kirtimanmishrazipstack 802eddb
Revert ETL destination pipeline changes — deferring to next cut
harini-venkataraman 0b16930
Fix false success logs and missing data in ETL destination pipelines
harini-venkataraman 0bea4cc
Fix missing context_retrieval metric for single pass extraction
harini-venkataraman b53b37b
Fix Unstructured IO adapter PermissionError on remote storage
harini-venkataraman 13f25d4
Defer subscription usage tracking to IDE callback workers
harini-venkataraman 7344e61
Fix missing embedding metadata in API deployment with chunking
harini-venkataraman c02ef1b
Fix email enforce type returning "NA" string and surface null in FE
harini-venkataraman c7aacc8
Feat/line item executor plugin (#1899)
harini-venkataraman a0884e7
Guard against undefined connector type in PostHog event lookup
harini-venkataraman 3dd8b56
Add worker-executor-v2 service to docker-compose under workers-v2 pro…
harini-venkataraman 7757586
Feat/line item executor plugin (#1900)
harini-venkataraman 98ec941
Task pipeline fixes
harini-venkataraman 8334515
Merge branch 'fix/agentic-executor-queue' of github.com:Zipstack/unst…
harini-venkataraman 0a91221
Fixing null fonts
harini-venkataraman e1bbc80
Merge branch 'main' into fix/agentic-executor-queue
harini-venkataraman 6f2ce13
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 0533ced
Fix biome formatting in DisplayPromptResult
harini-venkataraman 9a6f31a
Drop unlabeled LLM rows from per-model usage breakdown
harini-venkataraman 095c7d1
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] e35af2f
Fix Sonar issues: cognitive complexity, params, dup, test smells
harini-venkataraman 7421f3b
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 1a79030
Addressing greptile comments
harini-venkataraman 5c3b67c
Addressing greptile comments
harini-venkataraman adda29e
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] 1b0a1e1
Address PR review on legacy_executor single-pass and extraction strea…
harini-venkataraman c5b5da6
Merge branch 'main' into fix/agentic-executor-queue
harini-venkataraman
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,251 @@ | ||
| """Regression tests for ``UsageHelper.get_usage_by_model``. | ||
| | ||
| These tests cover the defensive filter that drops unlabeled LLM rows | ||
| from the per-model usage breakdown. The filter prevents a malformed | ||
| bare ``"llm"`` bucket from leaking into API deployment responses when | ||
| a producer-side LLM call site forgets to set ``llm_usage_reason``. | ||
| | ||
| The tests deliberately do not require a live Django database — the | ||
| backend test environment has no ``pytest-django``, no SQLite fallback, | ||
| and uses ``django-tenants`` against Postgres in production. Instead | ||
| the tests stub ``account_usage.models`` and ``usage_v2.models`` in | ||
| ``sys.modules`` *before* importing the helper, so the helper module | ||
| loads cleanly without triggering Django's app registry checks. The | ||
| fake ``Usage.objects.filter`` chain returns a deterministic list of | ||
| row dicts shaped exactly like the real ``.values(...).annotate(...)`` | ||
| queryset rows the helper iterates over. | ||
| """ | ||
| | ||
| from __future__ import annotations | ||
| | ||
| import sys | ||
| import types | ||
| from typing import Any | ||
| from unittest.mock import MagicMock | ||
| | ||
| | ||
| # --------------------------------------------------------------------------- | ||
| # Module-level stubs. Must run BEFORE ``usage_v2.helper`` is imported, so we | ||
| # do it at import time and capture the helper reference for the tests below. | ||
| # --------------------------------------------------------------------------- | ||
| | ||
| | ||
| def _install_stubs() -> tuple[Any, Any]: | ||
| """Install fake ``account_usage.models`` and ``usage_v2.models`` modules | ||
| so that ``usage_v2.helper`` can be imported without Django being set up. | ||
| | ||
| Returns ``(UsageHelper, FakeUsage)`` — the helper class to test and the | ||
| fake Usage class whose ``objects.filter`` we will swap per-test. | ||
| """ | ||
| # Fake account_usage package + models module | ||
| if "account_usage" not in sys.modules: | ||
| account_usage_pkg = types.ModuleType("account_usage") | ||
| account_usage_pkg.__path__ = [] # mark as package | ||
| sys.modules["account_usage"] = account_usage_pkg | ||
| if "account_usage.models" not in sys.modules: | ||
| account_usage_models = types.ModuleType("account_usage.models") | ||
| account_usage_models.PageUsage = MagicMock(name="PageUsage") | ||
| sys.modules["account_usage.models"] = account_usage_models | ||
| | ||
| # Fake usage_v2.models with a Usage class whose ``objects`` is a | ||
| # MagicMock (so each test can rebind ``filter.return_value``). | ||
| if "usage_v2.models" not in sys.modules or not hasattr( | ||
| sys.modules["usage_v2.models"], "_is_test_stub" | ||
| ): | ||
| usage_v2_models = types.ModuleType("usage_v2.models") | ||
| usage_v2_models._is_test_stub = True | ||
| | ||
| class _FakeUsage: | ||
| objects = MagicMock(name="Usage.objects") | ||
| | ||
| usage_v2_models.Usage = _FakeUsage | ||
| sys.modules["usage_v2.models"] = usage_v2_models | ||
| | ||
| # Now import the helper — this picks up our stubs. | ||
| from usage_v2.helper import UsageHelper | ||
| | ||
| return UsageHelper, sys.modules["usage_v2.models"].Usage | ||
| | ||
| | ||
| UsageHelper, FakeUsage = _install_stubs() | ||
| | ||
| | ||
| # --------------------------------------------------------------------------- | ||
| # Helpers | ||
| # --------------------------------------------------------------------------- | ||
| | ||
| | ||
| class _StubQueryset: | ||
| """Mimic the chain ``.filter(...).values(...).annotate(...)``.""" | ||
| | ||
| def __init__(self, rows: list[dict[str, Any]]) -> None: | ||
| self._rows = rows | ||
| | ||
| def values(self, *args: Any, **kwargs: Any) -> _StubQueryset: | ||
| return self | ||
| | ||
| def annotate(self, *args: Any, **kwargs: Any) -> list[dict[str, Any]]: | ||
| return self._rows | ||
| | ||
| | ||
| def _row( | ||
| *, | ||
| usage_type: str, | ||
| llm_reason: str, | ||
| model_name: str = "gpt-4o", | ||
| sum_input: int = 0, | ||
| sum_output: int = 0, | ||
| sum_total: int = 0, | ||
| sum_embedding: int = 0, | ||
| sum_cost: float = 0.0, | ||
| ) -> dict[str, Any]: | ||
| """Build a row matching the shape returned by the helper's | ||
| ``.values(...).annotate(...)`` queryset. | ||
| """ | ||
| return { | ||
| "usage_type": usage_type, | ||
| "llm_usage_reason": llm_reason, | ||
| "model_name": model_name, | ||
| "sum_input_tokens": sum_input, | ||
| "sum_output_tokens": sum_output, | ||
| "sum_total_tokens": sum_total, | ||
| "sum_embedding_tokens": sum_embedding, | ||
| "sum_cost": sum_cost, | ||
| } | ||
| | ||
| | ||
| def _stub_rows(rows: list[dict[str, Any]]) -> None: | ||
| """Make ``Usage.objects.filter(...).values(...).annotate(...)`` yield | ||
| the given rows when the helper is invoked next. | ||
| """ | ||
| FakeUsage.objects.filter.return_value = _StubQueryset(rows) | ||
| | ||
| | ||
| # --------------------------------------------------------------------------- | ||
| # Tests | ||
| # --------------------------------------------------------------------------- | ||
| | ||
| | ||
| def test_unlabeled_llm_row_is_dropped() -> None: | ||
| """An ``llm`` row with empty ``llm_usage_reason`` must not produce a | ||
| bare ``"llm"`` bucket in the response — it should be silently | ||
| dropped, while the legitimate extraction row is preserved. | ||
| """ | ||
| _stub_rows( | ||
| [ | ||
| _row( | ||
| usage_type="llm", | ||
| llm_reason="extraction", | ||
| sum_input=100, | ||
| sum_output=50, | ||
| sum_total=150, | ||
| sum_cost=0.05, | ||
| ), | ||
| _row( | ||
| usage_type="llm", | ||
| llm_reason="", # the bug — no reason set | ||
| sum_cost=0.01, | ||
| ), | ||
| ] | ||
| ) | ||
| | ||
| result = UsageHelper.get_usage_by_model("00000000-0000-0000-0000-000000000001") | ||
| | ||
| assert "llm" not in result, ( | ||
| "Unlabeled llm row should be dropped — bare 'llm' bucket leaked" | ||
| ) | ||
| assert "extraction_llm" in result | ||
| assert len(result["extraction_llm"]) == 1 | ||
| entry = result["extraction_llm"][0] | ||
| assert entry["model_name"] == "gpt-4o" | ||
| assert entry["input_tokens"] == 100 | ||
| assert entry["output_tokens"] == 50 | ||
| assert entry["total_tokens"] == 150 | ||
| assert entry["cost_in_dollars"] == "0.05" | ||
| | ||
| | ||
| def test_embedding_row_is_preserved() -> None: | ||
| """An ``embedding`` row legitimately has empty ``llm_usage_reason``; | ||
| the defensive filter must NOT drop it. Proves the guard is scoped | ||
| to ``usage_type == 'llm'``. | ||
| """ | ||
| _stub_rows( | ||
| [ | ||
| _row( | ||
| usage_type="embedding", | ||
| llm_reason="", | ||
| model_name="text-embedding-3-small", | ||
| sum_embedding=200, | ||
| sum_cost=0.001, | ||
| ), | ||
| ] | ||
| ) | ||
| | ||
| result = UsageHelper.get_usage_by_model("00000000-0000-0000-0000-000000000002") | ||
| | ||
| assert "embedding" in result, "Embedding row was incorrectly dropped" | ||
| assert len(result["embedding"]) == 1 | ||
| entry = result["embedding"][0] | ||
| assert entry["model_name"] == "text-embedding-3-small" | ||
| assert entry["embedding_tokens"] == 200 | ||
| assert entry["cost_in_dollars"] == "0.001" | ||
| | ||
| | ||
| def test_all_three_llm_reasons_coexist() -> None: | ||
| """All three labelled LLM buckets (extraction, challenge, summarize) | ||
| must appear with correct token counts when present. | ||
| """ | ||
| _stub_rows( | ||
| [ | ||
| _row( | ||
| usage_type="llm", | ||
| llm_reason="extraction", | ||
| model_name="gpt-4o", | ||
| sum_input=100, | ||
| sum_output=50, | ||
| sum_total=150, | ||
| sum_cost=0.05, | ||
| ), | ||
| _row( | ||
| usage_type="llm", | ||
| llm_reason="challenge", | ||
| model_name="gpt-4o-mini", | ||
| sum_input=20, | ||
| sum_output=10, | ||
| sum_total=30, | ||
| sum_cost=0.002, | ||
| ), | ||
| _row( | ||
| usage_type="llm", | ||
| llm_reason="summarize", | ||
| model_name="gpt-4o", | ||
| sum_input=300, | ||
| sum_output=80, | ||
| sum_total=380, | ||
| sum_cost=0.07, | ||
| ), | ||
| ] | ||
| ) | ||
| | ||
| result = UsageHelper.get_usage_by_model("00000000-0000-0000-0000-000000000003") | ||
| | ||
| assert set(result.keys()) == {"extraction_llm", "challenge_llm", "summarize_llm"} | ||
| assert "llm" not in result | ||
| | ||
| extraction = result["extraction_llm"][0] | ||
| assert extraction["model_name"] == "gpt-4o" | ||
| assert extraction["input_tokens"] == 100 | ||
| assert extraction["output_tokens"] == 50 | ||
| assert extraction["total_tokens"] == 150 | ||
| | ||
| challenge = result["challenge_llm"][0] | ||
| assert challenge["model_name"] == "gpt-4o-mini" | ||
| assert challenge["input_tokens"] == 20 | ||
| assert challenge["output_tokens"] == 10 | ||
| assert challenge["total_tokens"] == 30 | ||
| | ||
| summarize = result["summarize_llm"][0] | ||
| assert summarize["model_name"] == "gpt-4o" | ||
| assert summarize["input_tokens"] == 300 | ||
| assert summarize["output_tokens"] == 80 | ||
| assert summarize["total_tokens"] == 380 | ||
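
The tests above pin down the behaviour of the defensive filter without showing it. A minimal sketch of what such a grouping step might look like follows. The function name `bucket_usage_rows` is hypothetical and this is not the actual body of `UsageHelper.get_usage_by_model`; the row keys and output keys are taken from the test fixtures and assertions, and treating every non-`llm` row as an embedding-style row is a simplifying assumption.

```python
from __future__ import annotations

from typing import Any


def bucket_usage_rows(rows: list[dict[str, Any]]) -> dict[str, list[dict[str, Any]]]:
    """Group aggregated usage rows into per-model buckets (hypothetical helper)."""
    result: dict[str, list[dict[str, Any]]] = {}
    for row in rows:
        usage_type = row["usage_type"]
        reason = row.get("llm_usage_reason") or ""

        if usage_type == "llm":
            if not reason:
                # Unlabeled LLM row: drop it rather than emit a bare "llm" bucket.
                continue
            key = f"{reason}_llm"
            tokens = {
                "input_tokens": row.get("sum_input_tokens", 0),
                "output_tokens": row.get("sum_output_tokens", 0),
                "total_tokens": row.get("sum_total_tokens", 0),
            }
        else:
            # Assumption: non-LLM rows (e.g. embedding) legitimately carry no
            # reason, so the guard above must not apply to them.
            key = usage_type
            tokens = {"embedding_tokens": row.get("sum_embedding_tokens", 0)}

        entry = {
            "model_name": row["model_name"],
            **tokens,
            # Assumption: cost is serialized as a plain string, as the tests expect.
            "cost_in_dollars": str(row.get("sum_cost", 0.0)),
        }
        result.setdefault(key, []).append(entry)
    return result
```

Scoping the guard to `usage_type == "llm"` is the point the second test checks: an embedding row with an empty reason passes through, while an LLM row with an empty reason is skipped.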