
fix(vertexai): preserve individual message roles in Content object lists#3761

Closed
OiPunk wants to merge 1 commit into traceloop:main from OiPunk:codex/openllmetry-2513-vertexai-messages

Conversation


@OiPunk OiPunk commented Mar 4, 2026

Summary

Fixes #2513

When generate_content() receives a list of Content objects with different roles (e.g. user/model conversation history), the _set_input_attributes function was treating the entire list as flat content items and merging everything into a single prompt with role="user". This caused the entire conversation history to be stringified into one attribute, losing message separation and role information.

Root Cause

_process_vertexai_argument() (and its sync variant) did not check whether list items are Content objects (which have role and parts attributes). It treated all list items as raw content (strings or Part objects), deep-copied them, and processed them as a single flat list.

Fix

  • Added _is_content_object() helper to detect VertexAI Content objects by checking for role and parts attributes
  • Added _process_content_object() / _process_content_object_sync() to extract the role and process each Content's parts individually
  • Updated _process_vertexai_argument() / _process_vertexai_argument_sync() to detect Content object lists and return (role, parts) tuples instead of flat content
  • Updated set_input_attributes() / set_input_attributes_sync() to handle the tuple format, assigning each Content object its own prompt index with the correct role
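The detection-and-dispatch logic described above can be sketched roughly as follows. `Content` and `Part` here are simplified stand-in dataclasses for the `vertexai.generative_models` types, and the helper names mirror the PR description rather than the exact implementation:

```python
from dataclasses import dataclass, field

@dataclass
class Part:
    text: str

@dataclass
class Content:
    role: str
    parts: list = field(default_factory=list)

def _is_content_object(item) -> bool:
    # Duck-type check: VertexAI Content objects expose both attributes.
    return hasattr(item, "role") and hasattr(item, "parts")

def process_argument(argument):
    """Return a list of (role, parts) tuples for a list of Content
    objects, or None to signal the legacy flat-content path."""
    if isinstance(argument, list) and argument and _is_content_object(argument[0]):
        return [(c.role, list(c.parts)) for c in argument]
    return None
```

With a user/model history, each turn keeps its own role tuple instead of being flattened into one string.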

Before (broken)

{
  "gen_ai.prompt.0.user": "role: \"user\"\nparts {\n  text: \"What's 2+2?\"\n}\n\nrole: \"model\"\nparts {\n  text: \"5\"\n}\n..."
}

After (fixed)

{
  "gen_ai.prompt.0.role": "user",
  "gen_ai.prompt.0.content": "[{\"type\": \"text\", \"text\": \"What's 2+2?\"}]",
  "gen_ai.prompt.1.role": "model",
  "gen_ai.prompt.1.content": "[{\"type\": \"text\", \"text\": \"5\"}]",
  "gen_ai.prompt.2.role": "user",
  "gen_ai.prompt.2.content": "[{\"type\": \"text\", \"text\": \"really?\"}]"
}
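The attribute layout above can be produced by a loop like the following (an illustrative sketch of the per-index assignment, not the PR's exact code; the `gen_ai.prompt.N` keys match the example output):

```python
import json

def set_prompt_attributes(results):
    """Map (role, list-of-text-parts) tuples to indexed span
    attributes, mirroring the gen_ai.prompt.N layout shown above."""
    attrs = {}
    for i, (role, parts) in enumerate(results):
        attrs[f"gen_ai.prompt.{i}.role"] = role
        attrs[f"gen_ai.prompt.{i}.content"] = json.dumps(
            [{"type": "text", "text": p} for p in parts]
        )
    return attrs
```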

Test Plan

  • Added 4 new regression tests in TestContentObjectMessageHistory:
    • test_sync_content_objects_preserve_roles - verifies a multi-turn conversation with user/model roles (an exact reproduction of issue #2513, "🐛 Bug Report: Whole message history is stringified into a single 'user prompt'")
    • test_async_content_objects_preserve_roles - same test for the async path
    • test_sync_content_object_with_multiple_parts - verifies Content with multiple Part objects
    • test_sync_single_content_object_arg - verifies a single Content object passed directly
  • All 12 existing tests continue to pass (string args, mixed content, image content, disabled prompts, etc.)
  • Ruff lint passes

Summary by CodeRabbit

  • New Features
    • Enhanced support for VertexAI Content objects with improved role handling and span attribute generation.
    • Content objects with distinct roles are now properly tracked and preserved in instrumentation traces.
    • Backward compatibility maintained for existing argument processing paths.

When generate_content() receives a list of Content objects with
different roles (user, model, assistant), the _set_input_attributes
function was treating the entire list as flat content and assigning
everything under a single prompt index with role="user". This caused
the entire conversation history to be merged into one string.

Now Content objects are detected via their `role` and `parts`
attributes and each is stored as a separate prompt with its own
role and index, matching the expected telemetry output format.

Fixes traceloop#2513

CLAassistant commented Mar 4, 2026

CLA assistant check
All committers have signed the CLA.


coderabbitai bot commented Mar 4, 2026

📝 Walkthrough


This PR fixes a bug where lists of VertexAI Content objects with distinct roles were being concatenated into a single stringified prompt. The changes add helpers to detect and process Content objects, and update span attribute setting to preserve each Content object as a separately indexed prompt with its role.

Changes

Content Object Processing
packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/span_utils.py
Introduces _is_content_object() to detect Content objects by checking for role and parts attributes. Adds _process_content_object() and _process_content_object_sync() helpers to extract role and process content parts. Extends _process_vertexai_argument() and its sync variant to handle both single Content objects and lists of Content objects, returning (role, parts) tuples. Updates set_input_attributes() and set_input_attributes_sync() to iterate through Content objects with prompt_index sequencing, assigning per-prompt role and content attributes instead of stringifying the entire input.

Role Attribute Test Coverage
packages/opentelemetry-instrumentation-vertexai/tests/test_role_attributes.py
Adds a comprehensive test suite validating that Content objects with distinct roles are preserved as separate prompts. Tests cover synchronous and asynchronous execution paths, multiple content parts, single Content arguments (unwrapped), and image-like content handling, and verify correct role and content attribute assignment with proper indexing.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A tale of roles that once were lost,
Now stitched with care, no stringified cost!
Each message kept with index bright,
The prompts aligned, the attributes right.
Content flows through, both sync and fast,
No more collapsed into one big cast! ✨

🚥 Pre-merge checks (5 passed)

  • Description Check — ✅ Passed: Check skipped - CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed: The PR title accurately describes the main fix: preserving individual message roles in Content object lists, which directly addresses the core problem identified in issue #2513.
  • Linked Issues Check — ✅ Passed: The PR implementation fully addresses the requirements from issue #2513: it detects Content objects, preserves individual roles, separates messages into sequential prompts, and maintains backward compatibility.
  • Out of Scope Changes Check — ✅ Passed: All changes are directly aligned with fixing issue #2513; only internal helpers and test coverage were added, with no extraneous modifications.
  • Docstring Coverage — ✅ Passed: Docstring coverage is 93.75%, which is sufficient; the required threshold is 80.00%.



@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (2)
packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/span_utils.py (1)

157-165: Consider validating all list elements, not just the first.

The check at line 159 only inspects the first element. If a mixed list like [Content(...), "text"] is passed, the loop at lines 162-164 would raise an AttributeError when accessing .role or .parts on the non-Content item.

While VertexAI's API expects homogeneous lists, adding a guard could improve robustness:

🛡️ Optional defensive validation
         # Check if the list contains Content objects (with role and parts)
         if argument and _is_content_object(argument[0]):
             # List of Content objects - each has its own role
             results = []
             for content_obj in argument:
+                if not _is_content_object(content_obj):
+                    logger.warning("Mixed list detected; falling back to raw processing")
+                    break
                 role, parts = await _process_content_object(content_obj, span)
                 results.append((role, parts))
-            return results
+            else:
+                return results
+            # Fall through to mixed content handling if break occurred
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/span_utils.py`
around lines 157 - 165, The branch handling lists currently only checks the
first element via _is_content_object(argument[0]) which can crash if later items
are non-Content; update the list handling in span_utils.py to validate every
element before processing: iterate the list and use _is_content_object on each
item (or use all(...) to assert homogeneity) and raise or fallback if any
element is not a Content-like object, then call _process_content_object only for
validated items (refer to the existing _is_content_object and
_process_content_object symbols to locate and modify the logic).
packages/opentelemetry-instrumentation-vertexai/tests/test_role_attributes.py (1)

305-350: Good edge case coverage; consider adding async variants.

These tests cover important edge cases (multiple parts, single Content argument) but only for the sync path. For completeness, consider adding async variants of test_sync_content_object_with_multiple_parts and test_sync_single_content_object_arg.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/opentelemetry-instrumentation-vertexai/tests/test_role_attributes.py`
around lines 305 - 350, Add async counterparts for the sync tests by creating
two new tests (e.g., test_async_content_object_with_multiple_parts and
test_async_single_content_object_arg) that mirror the existing sync versions:
patch should_send_prompts, build the same Content/part mocks, call and await
set_input_attributes_async(self.mock_span, args), then assert the same span
attributes and JSON-parsed content; ensure you decorate the tests for async
execution (pytest.mark.asyncio or the project’s async test pattern) and use the
same attribute keys (SpanAttributes.LLM_PROMPTS) and assertions as in the sync
tests.
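One possible shape for such an async variant, sketched with a stand-in coroutine since the real `set_input_attributes` signature and attribute keys are assumptions here:

```python
import asyncio
from unittest.mock import MagicMock

class FakeContent:
    """Stand-in for vertexai.generative_models.Content."""
    def __init__(self, role, parts):
        self.role, self.parts = role, parts

async def set_input_attributes(span, args):
    # Stand-in for the instrumented async helper under test: assigns
    # each Content object its own indexed role attribute.
    for i, content in enumerate(args[0]):
        span.set_attribute(f"gen_ai.prompt.{i}.role", content.role)

def test_async_content_objects_preserve_roles():
    span = MagicMock()
    args = ([FakeContent("user", ["What's 2+2?"]), FakeContent("model", ["5"])],)
    asyncio.run(set_input_attributes(span, args))
    span.set_attribute.assert_any_call("gen_ai.prompt.0.role", "user")
    span.set_attribute.assert_any_call("gen_ai.prompt.1.role", "model")
```

In the real suite the coroutine would be awaited via pytest.mark.asyncio (or the project's existing async test pattern) rather than asyncio.run.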

ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between 0b9c603 and 49a6875.

📒 Files selected for processing (2)
  • packages/opentelemetry-instrumentation-vertexai/opentelemetry/instrumentation/vertexai/span_utils.py
  • packages/opentelemetry-instrumentation-vertexai/tests/test_role_attributes.py

OiPunk (Author) commented Mar 6, 2026

Closing — shifting focus to AI Agent core projects. Thank you for your time!

@OiPunk OiPunk closed this Mar 6, 2026

Development

Successfully merging this pull request may close these issues.

🐛 Bug Report: Whole message history is stringified into a single 'user prompt'
