feat(core): add prompt context telemetry breakdown #1169
Conversation
🦋 Changeset detected

Latest commit: d246ac7

The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.

Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR.
📝 Walkthrough

This PR adds observability enhancements to the VoltAgent core package by introducing prompt-context token estimation capabilities and expanded token usage metrics on LLM spans. It includes a new module for estimating token consumption across system messages, non-system messages, and tool definitions, integrates these estimates into span attributes, and tracks cached and reasoning tokens.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed

❌ Failed checks (1 warning)

✅ Passed checks (2 passed)
Deploying voltagent with Cloudflare Pages

| | |
| --- | --- |
| Latest commit: | d246ac7 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://c98cee5d.voltagent.pages.dev |
| Branch Preview URL: | https://feat-prompt-context-telemetr.voltagent.pages.dev |
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/core/src/agent/agent.ts (1)
4256-4273: ⚠️ Potential issue | 🟠 Major

Avoid emitting synthetic zero values for cached/reasoning token metrics.

Line 4268 and Line 4271 currently treat `0` as "present". Because `convertUsage()` defaults absent cached/reasoning values to `0`, spans can incorrectly report those metrics even when providers did not return them.

💡 Proposed fix (presence-aware emission)
```diff
   const { promptTokens, completionTokens, totalTokens, cachedInputTokens, reasoningTokens } =
     normalizedUsage;
+  const usageRecord = usage as Record<string, unknown>;
+  const hasOwn = (obj: unknown, key: string): boolean =>
+    !!obj && typeof obj === "object" && Object.prototype.hasOwnProperty.call(obj, key);
+  const isNormalizedUsageInfo = "promptTokens" in usage;
+
+  const hasCachedInputTokens = isNormalizedUsageInfo
+    ? (usage as UsageInfo).cachedInputTokens !== undefined
+    : hasOwn(usageRecord, "cachedInputTokens") ||
+      hasOwn((usageRecord.inputTokenDetails as Record<string, unknown> | undefined) ?? {}, "cacheReadTokens") ||
+      hasOwn((usageRecord.inputTokens as Record<string, unknown> | undefined) ?? {}, "cacheRead") ||
+      hasOwn((usageRecord.inputTokens as Record<string, unknown> | undefined) ?? {}, "cacheReadTokens");
+
+  const hasReasoningTokens = isNormalizedUsageInfo
+    ? (usage as UsageInfo).reasoningTokens !== undefined
+    : hasOwn(usageRecord, "reasoningTokens") ||
+      hasOwn((usageRecord.outputTokenDetails as Record<string, unknown> | undefined) ?? {}, "reasoningTokens") ||
+      hasOwn((usageRecord.outputTokens as Record<string, unknown> | undefined) ?? {}, "reasoning") ||
+      hasOwn((usageRecord.outputTokens as Record<string, unknown> | undefined) ?? {}, "reasoningTokens");
   if (promptTokens !== undefined) {
     span.setAttribute("llm.usage.prompt_tokens", promptTokens);
   }
@@
-  if (cachedInputTokens !== undefined) {
+  if (hasCachedInputTokens && cachedInputTokens !== undefined) {
     span.setAttribute("llm.usage.cached_tokens", cachedInputTokens);
   }
-  if (reasoningTokens !== undefined) {
+  if (hasReasoningTokens && reasoningTokens !== undefined) {
     span.setAttribute("llm.usage.reasoning_tokens", reasoningTokens);
   }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/core/src/agent/agent.ts` around lines 4256 - 4273, The span is currently being set for cachedInputTokens and reasoningTokens even when convertUsage defaulted absent values to 0; change the emission to be presence-aware by only calling span.setAttribute for cachedInputTokens and reasoningTokens when the value is both defined and greater than 0 (i.e., replace the current checks for cachedInputTokens !== undefined and reasoningTokens !== undefined with checks like cachedInputTokens !== undefined && cachedInputTokens > 0 and reasoningTokens !== undefined && reasoningTokens > 0), leaving promptTokens/completionTokens/totalTokens behavior unchanged; locate these checks around the normalizedUsage destructuring and the span.setAttribute calls in the agent.ts block that handles LLM usage.
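The presence-aware emission the review asks for can be sketched as a small helper. The `UsageInfo` shape and attribute names below mirror the discussion above, but the `usageAttributes` function itself is hypothetical, not code from the PR:

```typescript
// Hypothetical normalized usage shape; field names follow the review discussion.
interface UsageInfo {
  promptTokens?: number;
  completionTokens?: number;
  totalTokens?: number;
  cachedInputTokens?: number;
  reasoningTokens?: number;
}

// Collect only attributes that were actually reported, so a defaulted 0
// from convertUsage() never reaches the span as a synthetic metric.
function usageAttributes(usage: UsageInfo): Record<string, number> {
  const attrs: Record<string, number> = {};
  if (usage.promptTokens !== undefined) attrs["llm.usage.prompt_tokens"] = usage.promptTokens;
  if (usage.completionTokens !== undefined) attrs["llm.usage.completion_tokens"] = usage.completionTokens;
  if (usage.totalTokens !== undefined) attrs["llm.usage.total_tokens"] = usage.totalTokens;
  // Presence-aware: emit cached/reasoning metrics only when defined and non-zero.
  if (usage.cachedInputTokens !== undefined && usage.cachedInputTokens > 0)
    attrs["llm.usage.cached_tokens"] = usage.cachedInputTokens;
  if (usage.reasoningTokens !== undefined && usage.reasoningTokens > 0)
    attrs["llm.usage.reasoning_tokens"] = usage.reasoningTokens;
  return attrs;
}
```

This matches the AI-agent prompt's simpler `> 0` variant rather than the fuller provider-shape inspection in the diff; either way, a zero defaulted in by normalization is no longer mistaken for a reported value.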
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@packages/core/src/agent/prompt-context-usage.ts`:
- Around line 166-171: The sanitizeRecord function currently only redacts
top-level keys and misses nested binary fields; update sanitizeRecord to
recursively traverse objects and arrays so any property whose key matches
LARGE_BINARY_KEYS (at any depth, e.g., content.metadata.data) is replaced with
"[omitted]" while primitives are kept as-is; detect and skip non-plain values
(null, Date, RegExp) and protect against circular references (e.g., via a seen
Set) to avoid infinite recursion; keep the function name sanitizeRecord and the
LARGE_BINARY_KEYS lookup so callers remain unchanged.
- Around line 203-205: The serialization currently includes runtime metadata
fields providerOptions, args, and needsApproval which inflate token estimates;
change the object built for safeStringify (used for token estimation) to include
only model-visible schema fields (description, inputSchema, outputSchema) and
remove providerOptions and needsApproval, and for args add a runtime type guard
that only includes args when it's a plain non-array object (use the existing
sanitizeRecord helper only after confirming !Array.isArray(candidate.args) &&
typeof candidate.args === 'object' && candidate.args !== null). Update the code
that constructs the object passed to safeStringify (where
candidate.providerOptions, candidate.args, and candidate.needsApproval are
currently spread) so it omits those runtime fields unless args passes the
plain-object guard.
---
Outside diff comments:
In `@packages/core/src/agent/agent.ts`:
- Around line 4256-4273: The span is currently being set for cachedInputTokens
and reasoningTokens even when convertUsage defaulted absent values to 0; change
the emission to be presence-aware by only calling span.setAttribute for
cachedInputTokens and reasoningTokens when the value is both defined and greater
than 0 (i.e., replace the current checks for cachedInputTokens !== undefined and
reasoningTokens !== undefined with checks like cachedInputTokens !== undefined
&& cachedInputTokens > 0 and reasoningTokens !== undefined && reasoningTokens >
0), leaving promptTokens/completionTokens/totalTokens behavior unchanged; locate
these checks around the normalizedUsage destructuring and the span.setAttribute
calls in the agent.ts block that handles LLM usage.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f8b7ab96-0de2-460e-a209-2ec5d748ce23
📒 Files selected for processing (4)
- .changeset/green-planes-jump.md
- packages/core/src/agent/agent.ts
- packages/core/src/agent/prompt-context-usage.spec.ts
- packages/core/src/agent/prompt-context-usage.ts
```ts
function sanitizeRecord(record: Record<string, unknown>): Record<string, unknown> {
  const sanitized: Record<string, unknown> = {};

  for (const [key, value] of Object.entries(record)) {
    sanitized[key] = LARGE_BINARY_KEYS.has(key) ? "[omitted]" : value;
  }

  return sanitized;
}
```
Deep binary fields are not sanitized, which can still blow up serialization cost.
On Line 170, only top-level keys are redacted. Nested payloads (e.g., `content.metadata.data`) still pass through and can create large `safeStringify` inputs in hot paths, skewing estimates and adding avoidable overhead.
Proposed fix (recursive sanitization)

```diff
-function sanitizeRecord(record: Record<string, unknown>): Record<string, unknown> {
-  const sanitized: Record<string, unknown> = {};
-
-  for (const [key, value] of Object.entries(record)) {
-    sanitized[key] = LARGE_BINARY_KEYS.has(key) ? "[omitted]" : value;
-  }
-
-  return sanitized;
-}
+function sanitizeValue(value: unknown, depth = 0): unknown {
+  if (depth > 6) return "[omitted]";
+  if (Array.isArray(value)) return value.map((v) => sanitizeValue(v, depth + 1));
+  if (!value || typeof value !== "object") return value;
+
+  const sanitized: Record<string, unknown> = {};
+  for (const [key, entry] of Object.entries(value)) {
+    sanitized[key] = LARGE_BINARY_KEYS.has(key)
+      ? "[omitted]"
+      : sanitizeValue(entry, depth + 1);
+  }
+  return sanitized;
+}
+
+function sanitizeRecord(record: Record<string, unknown>): Record<string, unknown> {
+  return sanitizeValue(record) as Record<string, unknown>;
+}
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```ts
function sanitizeValue(value: unknown, depth = 0): unknown {
  if (depth > 6) return "[omitted]";
  if (Array.isArray(value)) return value.map((v) => sanitizeValue(v, depth + 1));
  if (!value || typeof value !== "object") return value;

  const sanitized: Record<string, unknown> = {};
  for (const [key, entry] of Object.entries(value)) {
    sanitized[key] = LARGE_BINARY_KEYS.has(key)
      ? "[omitted]"
      : sanitizeValue(entry, depth + 1);
  }
  return sanitized;
}

function sanitizeRecord(record: Record<string, unknown>): Record<string, unknown> {
  return sanitizeValue(record) as Record<string, unknown>;
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/core/src/agent/prompt-context-usage.ts` around lines 166 - 171, The
sanitizeRecord function currently only redacts top-level keys and misses nested
binary fields; update sanitizeRecord to recursively traverse objects and arrays
so any property whose key matches LARGE_BINARY_KEYS (at any depth, e.g.,
content.metadata.data) is replaced with "[omitted]" while primitives are kept
as-is; detect and skip non-plain values (null, Date, RegExp) and protect against
circular references (e.g., via a seen Set) to avoid infinite recursion; keep the
function name sanitizeRecord and the LARGE_BINARY_KEYS lookup so callers remain
unchanged.
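The committable suggestion above caps recursion by depth but does not guard against circular references, which this AI-agent prompt also asks for. A sketch combining both concerns follows; the contents of `LARGE_BINARY_KEYS` are assumed, not taken from the actual module:

```typescript
// Assumed key set; the real LARGE_BINARY_KEYS in prompt-context-usage.ts may differ.
const LARGE_BINARY_KEYS = new Set(["data", "base64", "buffer", "image"]);

function sanitizeValue(value: unknown, seen = new WeakSet<object>()): unknown {
  if (value === null || typeof value !== "object") return value;
  // Leave non-plain objects (Date, RegExp) untouched rather than recursing into them.
  if (value instanceof Date || value instanceof RegExp) return value;
  // Circular-reference guard: replace already-visited objects instead of recursing forever.
  if (seen.has(value)) return "[circular]";
  seen.add(value);
  if (Array.isArray(value)) return value.map((v) => sanitizeValue(v, seen));

  const sanitized: Record<string, unknown> = {};
  for (const [key, entry] of Object.entries(value)) {
    sanitized[key] = LARGE_BINARY_KEYS.has(key) ? "[omitted]" : sanitizeValue(entry, seen);
  }
  return sanitized;
}

function sanitizeRecord(record: Record<string, unknown>): Record<string, unknown> {
  return sanitizeValue(record) as Record<string, unknown>;
}
```

A `WeakSet` is used instead of a depth counter so cycles are cut exactly where they occur; note that an object legitimately referenced twice (a DAG, not a cycle) is also replaced on its second visit, which is acceptable for token estimation.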
```ts
...(candidate.providerOptions ? { providerOptions: candidate.providerOptions } : {}),
...(candidate.args ? { args: sanitizeRecord(candidate.args as Record<string, unknown>) } : {}),
...(candidate.needsApproval !== undefined ? { needsApproval: candidate.needsApproval } : {}),
```
🧩 Analysis chain

🏁 Scripts executed against VoltAgent/voltagent to verify the finding (outputs omitted):

```shell
#!/bin/bash
set -euo pipefail
echo "Check where tool payloads are transformed before provider/model calls:"
rg -nP -C3 '\b(tool|tools)\b.*\b(name|description|parameters|schema|inputSchema|outputSchema)\b' --type=ts
echo
echo "Check whether providerOptions / needsApproval / args are consumed in provider-facing payloads:"
rg -nP -C3 '\b(providerOptions|needsApproval|args)\b' --type=ts
echo
echo "Find function-valued needsApproval declarations (would skew stringify-based estimates):"
rg -nP -C2 'needsApproval\s*:\s*(async\s*)?(\([^)]*\)\s*=>|function\b)' --type=ts

# Look at the exact content around lines 203-205 in prompt-context-usage.ts
cat -n packages/core/src/agent/prompt-context-usage.ts | sed -n '190,220p'

# Find the serializeToolDefinition function to understand what it does
rg -n "serializeToolDefinition" packages/core/src/agent/prompt-context-usage.ts -A 30

# Check the Tool type definition to understand field types
rg -n "interface Tool|type Tool" packages/core/src/tool/ --type=ts -A 20 | head -100

# Check if safeStringify is used in this file
rg -n "safeStringify|JSON.stringify" packages/core/src/agent/prompt-context-usage.ts

# Find Tool interface definition to understand field types
rg -n "interface Tool|type Tool.*=" packages/core/src/tool/ --type=ts -B 2 -A 50 | grep -A 50 "^packages/core/src/tool/index.ts"

# Search for needsApproval type definition
rg -n "needsApproval" packages/core/src/tool/index.ts -B 3 -A 3

# Check how providerOptions, needsApproval are used in provider adapters
rg -n "providerOptions|needsApproval" packages/core/src/tool/manager/ToolManager.ts -B 3 -A 3

# Check what 'args' field is in ProviderTool definition
rg -n "type ProviderTool|interface ProviderTool" packages/core/src/tool/index.ts -A 10

# Check the AI SDK's Tool type to understand which fields are model-visible
rg -n "from \"ai\"" packages/core/src/tool/index.ts

# Look for how serializeToolDefinition output is actually used in token estimation
rg -n "serializeToolDefinition\|toolTokensEstimated" packages/core/src/agent/prompt-context-usage.ts -B 3 -A 3

# Check if ToolNeedsApprovalFunction is defined and what it looks like
rg -n "ToolNeedsApprovalFunction" packages/core/src/tool/index.ts -B 2 -A 3

# Check what safeStringify does and how it handles functions
rg -n "function safeStringify" packages/core/src -A 15 --type=ts

# Look at how sanitizeRecord works
rg -n "function sanitizeRecord|export.*sanitizeRecord" packages/core/src/agent/prompt-context-usage.ts -B 2 -A 10

# See where args comes from in the tool object and if it should be in schema
rg -n "\.args\s*=" packages/core/src/tool/ --type=ts | head -20

# Find safeStringify implementation in @voltagent/internal
fd -t f "*.ts" | xargs rg -l "safeStringify" | head -5

# Check how AI SDK's Tool type is used and what fields it expects
rg -n "VercelTool\[" packages/core/src/tool/manager/ToolManager.ts -B 3 -A 5

# Verify what serializeToolDefinition output is for: check comments around it
rg -n "serializeToolDefinition\|toolTokensEstimated" packages/core/src/agent/prompt-context-usage.ts -B 5 -A 2

# Look at the AI SDK Tool type definition being used
rg -n "Tool\s*\{|Tool\s*<" packages/core/src/tool/manager/ToolManager.ts -B 2 -A 20 | head -80

# Check comments in prompt-context-usage.ts about what this serialization is for
cat -n packages/core/src/agent/prompt-context-usage.ts | sed -n '40,80p'

# See what fields the AI SDK Tool actually expects/sends to model
rg -n "description.*inputSchema\|parameters.*description" packages/core/src/tool/manager/ToolManager.ts -B 3 -A 3

# Check the actual AI SDK tool field documentation - look at how tools are used in providers
rg -n "tools\[.*\]\s*=" packages/core/src/agent/providers/ --type=ts -B 2 -A 5 | head -60

# See how ToolManager constructs tools for the AI SDK
cat -n packages/core/src/tool/manager/ToolManager.ts | sed -n '63,92p'

# Verify what fields the AI SDK actually sends to the model by checking provider integrations
rg -n "generateText|doGenerate" packages/core/src/agent/providers/anthropic/ --type=ts -A 10 | head -40

# Check if there are any tests or documentation about what serializeToolDefinition should include
rg -n "serializeToolDefinition" packages/core/src/agent/prompt-context-usage.spec.ts -B 5 -A 10

# Look at test files for prompt-context-usage
fd -type f "prompt-context-usage" packages/core/src/

# Check what the AI SDK actually sends to the model for tool definitions
rg -n "name.*description.*inputSchema\|description.*inputSchema" packages/core/src/tool/manager/ToolManager.ts -B 3 -A 3
```
Remove runtime metadata fields from tool schema serialization to avoid inflating token estimates.
Lines 203–205 currently include `providerOptions`, `args`, and `needsApproval`, which are runtime/control metadata rather than model-visible schema fields. Since these are passed to `safeStringify` for token estimation (line 62), they artificially inflate `tool_tokens_estimated`. The AI SDK sends only `description`, `inputSchema`, and `outputSchema` to the model; restrict serialization to schema fields only.
For args, add a runtime type guard to ensure it is a plain object (not an array):
Suggested fix

```diff
-      ...(candidate.providerOptions ? { providerOptions: candidate.providerOptions } : {}),
-      ...(candidate.args ? { args: sanitizeRecord(candidate.args as Record<string, unknown>) } : {}),
-      ...(candidate.needsApproval !== undefined ? { needsApproval: candidate.needsApproval } : {}),
+      ...(candidate.args && typeof candidate.args === "object" && !Array.isArray(candidate.args)
+        ? { args: sanitizeRecord(candidate.args as Record<string, unknown>) }
+        : {}),
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```ts
...(candidate.args && typeof candidate.args === "object" && !Array.isArray(candidate.args)
  ? { args: sanitizeRecord(candidate.args as Record<string, unknown>) }
  : {}),
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/core/src/agent/prompt-context-usage.ts` around lines 203 - 205, The
serialization currently includes runtime metadata fields providerOptions, args,
and needsApproval which inflate token estimates; change the object built for
safeStringify (used for token estimation) to include only model-visible schema
fields (description, inputSchema, outputSchema) and remove providerOptions and
needsApproval, and for args add a runtime type guard that only includes args
when it's a plain non-array object (use the existing sanitizeRecord helper only
after confirming !Array.isArray(candidate.args) && typeof candidate.args ===
'object' && candidate.args !== null). Update the code that constructs the object
passed to safeStringify (where candidate.providerOptions, candidate.args, and
candidate.needsApproval are currently spread) so it omits those runtime fields
unless args passes the plain-object guard.
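The serializer shape this prompt describes can be sketched as follows. The `ToolCandidate` fields are inferred from the review, and `serializeForEstimate` is a hypothetical name, not the function in the PR:

```typescript
// Fields inferred from the review; providerOptions and needsApproval are
// runtime metadata that should never reach the token estimate.
interface ToolCandidate {
  name: string;
  description?: string;
  inputSchema?: unknown;
  outputSchema?: unknown;
  args?: unknown;
  providerOptions?: unknown; // excluded from serialization
  needsApproval?: unknown;   // excluded from serialization
}

// Build the object passed to the stringifier using only model-visible fields,
// plus args when (and only when) it is a plain non-array object.
function serializeForEstimate(candidate: ToolCandidate): Record<string, unknown> {
  const isPlainObject = (v: unknown): v is Record<string, unknown> =>
    v !== null && typeof v === "object" && !Array.isArray(v);
  return {
    name: candidate.name,
    ...(candidate.description ? { description: candidate.description } : {}),
    ...(candidate.inputSchema ? { inputSchema: candidate.inputSchema } : {}),
    ...(candidate.outputSchema ? { outputSchema: candidate.outputSchema } : {}),
    ...(isPlainObject(candidate.args) ? { args: candidate.args } : {}),
  };
}
```

The type guard matters because `typeof [] === "object"` in JavaScript, so an array-valued `args` would otherwise be passed to a record-oriented sanitizer.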
PR Checklist
Please check if your PR fulfills the following requirements:
Bugs / Features
What is the current behavior?
LLM spans only expose aggregate usage values. Observability consumers cannot tell how much of the prompt context came from system instructions, conversation messages, or tool schemas. Cached and reasoning token usage are also not exposed on LLM spans.
What is the new behavior?
LLM spans now include an estimated prompt-context breakdown for system instructions, conversation messages, and tool schemas. Cached and reasoning token usage are also recorded on LLM spans, and the estimation helper is covered by tests.
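The breakdown described above can be illustrated with a rough characters-divided-by-four heuristic, a common stand-in when no tokenizer is available. The heuristic and the exact attribute keys below are illustrative assumptions; only the `usage.prompt_context.*` prefix comes from this PR, and the real helpers may compute estimates differently:

```typescript
// Rough token estimate: ~4 characters per token is a common heuristic.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

interface PromptContextEstimate {
  systemTokens: number;  // from system instructions
  messageTokens: number; // from non-system conversation messages
  toolTokens: number;    // from serialized tool schemas
}

// Flatten the estimate into span attributes under the usage.prompt_context.* prefix.
// Attribute suffixes here are hypothetical.
function toAttributes(e: PromptContextEstimate): Record<string, number> {
  return {
    "usage.prompt_context.system_tokens_estimated": e.systemTokens,
    "usage.prompt_context.message_tokens_estimated": e.messageTokens,
    "usage.prompt_context.tool_tokens_estimated": e.toolTokens,
  };
}
```

An observability consumer can then compare these estimated components against the provider-reported `llm.usage.prompt_tokens` to see which part of the context dominates.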
fixes (issue)
N/A
Notes for reviewers
`packages/core/src/voltops/client.ts` has unrelated local changes in my working tree and is intentionally excluded from this PR.

Summary by cubic
Adds an estimated prompt context breakdown to LLM spans in `@voltagent/core`, separating system instructions, non-system messages, and tool schemas. Also records cached and reasoning token usage for better observability.

- New `usage.prompt_context.*` span attributes.
- `llm.usage.cached_tokens` and `llm.usage.reasoning_tokens` on spans.
- `estimatePromptContextUsage` and `promptContextUsageEstimateToAttributes` helpers with unit tests.

Written for commit d246ac7. Summary will update on new commits.
Summary by CodeRabbit
New Features
Tests